CN113593565B - Intelligent home device management and control method and system - Google Patents


Info

Publication number
CN113593565B
Authority
CN
China
Prior art keywords
voice
information
user
entity
intelligent equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111147183.6A
Other languages
Chinese (zh)
Other versions
CN113593565A (en)
Inventor
黄晓娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Da Sheng Jia Technology Co ltd
Original Assignee
Shenzhen Da Sheng Jia Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Da Sheng Jia Technology Co ltd
Priority to CN202111147183.6A
Publication of CN113593565A
Application granted
Publication of CN113593565B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Telephonic Communication Services (AREA)
  • User Interface Of Digital Computer (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a method for managing and controlling intelligent home devices, comprising the following steps: receiving user voice information; decomposing the voice information, extracting features through a neural network model, identifying the voice features of the user, and separating the user utterance; inputting the user utterance into a speech recognition model to generate text information; performing semantic analysis on the generated text information to obtain semantic information, which comprises the intelligent device, the operation and the function state; and, according to the semantic information, sending a control instruction to the corresponding intelligent device so that it executes the corresponding operation and is set to the corresponding function state. Because the method first identifies the user's voice features within mixed noise and separates the user utterance before performing speech recognition and semantic analysis, the intelligent device, operation and function state in the user's voice instruction are obtained accurately, so the intelligent device can be controlled accurately and the user experience is improved.

Description

Intelligent home device management and control method and system
Technical Field
The invention belongs to the field of intelligent home equipment, and particularly relates to a method and a system for managing and controlling intelligent home equipment.
Background
With the development of computer network technology, artificial intelligence is applied ever more widely, for example in manufacturing, entertainment and trade. Many industries are devoted to artificial intelligence research, and smart homes are a particular focus.
At present, many intelligent household products can meet people's daily needs. A user can control and use a smart home device through a user interface running on a terminal device (such as a handheld terminal or a smartphone). However, different smart home devices often have different terminal devices or user interfaces, so a user often needs to use several terminal devices or install several clients. For the user, the operation is complex, the degree of intelligence is insufficient, and the experience is poor. Voice control of intelligent devices appeared later and made control more convenient, but voice control can be inaccurate, which also troubles users.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides an intelligent home device control method.
The invention adopts the following technical scheme:
an intelligent home device management and control method comprises the following steps:
receiving user voice information;
decomposing the voice information, extracting the characteristics of the decomposed voice information through a neural network model, identifying the voice characteristics of the user, and separating the user utterance;
inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; the voice recognition model comprises a sampling layer and a voice recognition layer, and the sampling layer performs convolution and down-sampling processing to obtain a local information characteristic vector; inputting the obtained local information characteristic vector into a voice recognition layer of a voice recognition model for processing, and obtaining text information corresponding to the user utterance;
performing semantic analysis according to the generated text information to acquire an entity in the text information to be analyzed; acquiring a structured entity vector corresponding to the entity according to the entity, wherein the structured entity vector is used for indicating the identification of the entity and the attribute of the entity; carrying out feature extraction on the structured entity vector to obtain entity features; fusing the entity features, the lexical features of the text and the syntactic features of the text to obtain semantic features of the text, wherein the semantic features are used for acquiring semantic information of the text, and the semantic information comprises intelligent equipment, operation and function states;
and identifying the intelligent device, the operation and the function state according to the semantic information, and sending a control instruction to the corresponding intelligent device, so that the corresponding intelligent device executes the corresponding operation and is set to the corresponding function state.
Specifically, the method further comprises the following steps:
receiving feedback information sent by the intelligent equipment according to the control instruction;
and when the feedback information shows that the control command operation is successful, replying the information and finishing the operation.
Specifically, all the steps are preceded by:
and inputting the voice information of the user, and identifying the voice characteristics of the user according to the input voice information.
Specifically, decomposing the voice information, performing feature extraction on the decomposed voice information through a neural network model, and separating out the user utterance specifically includes:
firstly, decomposing voice information according to different frequency bands;
extracting the characteristics of the decomposed voice information through a neural network model, comparing the user voice characteristics input in advance to determine the user voice characteristics, and identifying the user voice characteristics;
and separating the user words according to the voice frequency band corresponding to the voice characteristics of the user.
Another embodiment of the present invention provides an intelligent home device management and control system, including:
a voice input unit: receiving user voice information;
a voice separation unit: decomposing the voice information, extracting the characteristics of the decomposed voice information through a neural network model, identifying the voice characteristics of the user, and separating the user utterance;
a voice recognition unit: inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; the voice recognition model comprises a sampling layer and a voice recognition layer, and the sampling layer performs convolution and down-sampling processing to obtain a local information characteristic vector; inputting the obtained local information characteristic vector into a voice recognition layer of a voice recognition model for processing, and obtaining text information corresponding to the user utterance;
a semantic analysis unit: performing semantic analysis according to the generated text information to acquire an entity in the text information to be analyzed; acquiring a structured entity vector corresponding to the entity according to the entity, wherein the structured entity vector is used for indicating the identification of the entity and the attribute of the entity; carrying out feature extraction on the structured entity vector to obtain entity features; fusing the entity features, the lexical features of the text and the syntactic features of the text to obtain semantic features of the text, wherein the semantic features are used for acquiring semantic information of the text, and the semantic information comprises intelligent equipment, operation and function states;
an instruction operation unit: identifying the intelligent device, the operation and the function state according to the semantic information, and sending a control instruction to the corresponding intelligent device, so that the corresponding intelligent device executes the corresponding operation and is set to the corresponding function state.
Specifically, the system further comprises:
a feedback unit: receiving feedback information sent by the intelligent equipment according to the control instruction;
and when the feedback information shows that the control command operation is successful, replying the information and finishing the operation.
Specifically, the system further comprises:
a presetting unit: and inputting the voice information of the user, and identifying the voice characteristics of the user according to the input voice information.
Specifically, in the voice separation unit, the voice information is decomposed, and the voice information after decomposition is subjected to feature extraction through a neural network model, so as to separate the user utterance, specifically including:
firstly, decomposing voice information according to different frequency bands;
extracting the characteristics of the decomposed voice information through a neural network model, comparing the user voice characteristics input in advance to determine the user voice characteristics, and identifying the user voice characteristics;
and separating the user words according to the voice frequency band corresponding to the voice characteristics of the user.
Yet another embodiment of the present invention provides an electronic device, including a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the intelligent home device management and control method as described above when executing the computer program.
Yet another embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements a smart home device management and control method as described above.
As can be seen from the above description of the present invention, compared with the prior art, the present invention has the following advantages:
(1) the invention provides an intelligent home equipment control method, which comprises the steps of firstly receiving user voice information; decomposing the voice information, extracting the characteristics of the decomposed voice information through a neural network model, identifying the voice characteristics of the user, and separating the user utterance; inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; performing semantic analysis according to the generated text information, wherein the semantic information comprises intelligent equipment, operation and function states; identifying intelligent equipment, operation and function states according to the semantic information, sending a control instruction to the corresponding intelligent equipment, and enabling the corresponding intelligent equipment to execute corresponding operation and set to be in a corresponding function state; according to the method provided by the invention, the voice characteristics of the user are firstly identified in the mixed noise, the user words are separated, and then voice identification and semantic analysis are carried out, so that the intelligent equipment, operation and function states in the voice instruction of the user are accurately obtained, and therefore, the intelligent equipment can be accurately controlled, and the user experience is improved.
(2) According to the method provided by the invention, voice information is decomposed, and the decomposed voice information is subjected to feature extraction through a neural network model, so that the voice features of a user are identified, and the user words are separated; interference information is eliminated, intelligent equipment control words of a user are separated, and a foundation is provided for realizing accurate control of the intelligent equipment.
(3) In the method provided by the invention, semantic analysis is performed on the generated text information. By fusing entity features, lexical features and syntactic features, the invention fuses heterogeneous information: semantic information from three different vector spaces is combined to identify the semantics, which improves the accuracy of semantic understanding and, in turn, the control accuracy of the intelligent device.
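The fusion described in advantage (3) can be illustrated with a minimal sketch. The source does not state which fusion operator is used, so the choice below (concatenating the three vectors and applying a single learned projection with a tanh non-linearity) is an assumption, as are the function name `fuse_features` and all dimensions.

```python
import numpy as np

def fuse_features(entity_feat, lexical_feat, syntactic_feat, weights, bias):
    """Fuse three feature vectors into one semantic feature vector.

    Assumed operator: concatenation followed by a linear projection and tanh.
    weights has shape (out_dim, len(entity) + len(lexical) + len(syntactic)).
    """
    stacked = np.concatenate([entity_feat, lexical_feat, syntactic_feat])
    return np.tanh(weights @ stacked + bias)

# Example with made-up dimensions: 4 + 3 + 2 input features, 5 output features.
rng = np.random.default_rng(0)
semantic = fuse_features(
    rng.normal(size=4), rng.normal(size=3), rng.normal(size=2),
    rng.normal(size=(5, 9)), np.zeros(5),
)
```

Concatenation keeps the three vector spaces separate until the projection, which is one simple way to combine features of different origin; other operators (gating, attention) would fit the description equally well.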
Drawings
Fig. 1 is a flowchart of a method for managing and controlling smart home devices according to an embodiment of the present invention;
fig. 2 is a structural diagram of an intelligent home device management and control system according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the present invention;
fig. 4 is a schematic diagram of an embodiment of a computer-readable storage medium according to an embodiment of the present invention.
Detailed Description
The invention is further described below by means of specific embodiments.
The invention provides an intelligent home device control method, which comprises the steps of firstly identifying the voice characteristics of a user in mixed noise, separating the words of the user, then carrying out voice identification and semantic analysis, and accurately obtaining the intelligent device, operation and function states in a voice instruction of the user, thereby realizing accurate control of the intelligent device and improving user experience.
Fig. 1 is a flowchart of a method for managing and controlling smart home devices according to an embodiment of the present invention; the method specifically comprises the following steps:
s101: receiving voice information of a user;
when a user needs to use an intelligent device, the user only needs to issue a voice instruction, such as 'turn on the air conditioner, set the temperature to 26 degrees' or 'turn on the television, set the channel to the central set'; the method provided by the invention first receives the voice information of the user;
before the step, the method also comprises the steps of inputting user voice information and identifying the voice characteristics of the user according to the input voice information;
since the method is to analyze according to the voice characteristics of the user, the voice information of the user needs to be input in advance, the voice characteristics are recognized according to the input voice information, and certainly, the voice information of a plurality of users can be input, so that other users can also realize the control of the intelligent home device.
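The enrollment step can be sketched as a small registry that stores one reference voiceprint per user and later matches an incoming voiceprint against it. The class name `VoiceprintRegistry`, the use of cosine similarity, and the 0.8 threshold are assumptions for illustration; the source only states that voice features are entered in advance and compared later.

```python
import numpy as np

class VoiceprintRegistry:
    """Stores one reference embedding per enrolled user (hypothetical helper)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # assumed cosine-similarity cut-off
        self.enrolled = {}

    def enroll(self, user_id, embedding):
        """Register a user's voiceprint as a unit-length vector."""
        vec = np.asarray(embedding, dtype=float)
        self.enrolled[user_id] = vec / np.linalg.norm(vec)

    def identify(self, embedding):
        """Return the best-matching enrolled user, or None if below threshold."""
        vec = np.asarray(embedding, dtype=float)
        vec = vec / np.linalg.norm(vec)
        best_user, best_score = None, self.threshold
        for user_id, ref in self.enrolled.items():
            score = float(ref @ vec)  # cosine similarity of unit vectors
            if score >= best_score:
                best_user, best_score = user_id, score
        return best_user
```

Enrolling several users simply adds more entries, which matches the remark that other household members can also control the devices.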
S102: decomposing the voice information, extracting the characteristics of the decomposed voice information through a neural network model, identifying the voice characteristics of the user, and separating the user utterance;
decomposing the voice information, and performing feature extraction on the decomposed voice information through a neural network model to separate user utterances, which specifically comprises the following steps:
firstly, decomposing voice information according to different frequency bands;
extracting the characteristics of the decomposed voice information through a neural network model, comparing the user voice characteristics input in advance to determine the user voice characteristics, and identifying the user voice characteristics;
the neural network model is a CNN or a ResNet; the voice features of the user are voiceprint features, which are unique to each user. The embodiment of the invention uses a CNN, and the extraction process is as follows:
the CNN comprises a convolution layer, a pooling layer and a fully connected layer. The voice matrix is input into the convolution layer for local feature extraction to obtain the feature vector $C = [c_1, c_2, \ldots, c_m]$, where $c_j$ is the feature vector produced by the $j$-th filter, $c_j = [c_1, c_2, \ldots, c_{n-h+1}]$, $m$ is the number of convolution kernels, and $h$ is the size of the convolution kernel. Each local feature is $c_i = f(\omega \cdot x_{i:i+h-1} + b)$, where $c_i$ is the local feature extracted by the convolution operation; $f$ is a non-linear function; $\omega$ is a filter of size $h \times d$; $h$ is the size of the convolution kernel, representing $h$ words; $x_{i:i+h-1}$ is the vector formed by the $h$ words from $i$ to $i+h-1$; and $b$ is a bias term. The pooling layer down-samples each feature vector $c_j$ by max pooling to obtain the local optimum $M_j$, giving the down-sampled feature vector $M$, which is input into the fully connected layer to obtain the vector $U$.
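A minimal NumPy sketch of this forward pass follows. It assumes an input matrix of n rows and d columns, m filters of size h by d, ReLU as the non-linear function f, and max pooling taken over the whole of each filter's output; the function names and all shapes are illustrative rather than taken from the source.

```python
import numpy as np

def conv_features(x, filters, bias):
    """Compute c_i = f(w . x_{i:i+h-1} + b) for every filter.

    x:       (n, d) input matrix
    filters: (m, h, d) m convolution kernels of size h x d
    bias:    (m,) one bias term per kernel
    returns: (m, n - h + 1), one feature vector c_j per kernel
    """
    m, h, _ = filters.shape
    n = x.shape[0]
    out = np.empty((m, n - h + 1))
    for j in range(m):
        for i in range(n - h + 1):
            out[j, i] = np.sum(filters[j] * x[i:i + h]) + bias[j]
    return np.maximum(out, 0.0)  # f is assumed to be ReLU

def max_pool(c):
    """Keep the maximum of each c_j, giving M = [M_1, ..., M_m]."""
    return c.max(axis=1)

def fully_connected(pooled, weights, bias):
    """Dense layer mapping the pooled vector M to the output vector U."""
    return weights @ pooled + bias
```

The vector U produced by the last step is what would be compared against the voiceprints entered in advance.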
And separating the user words according to the voice frequency band corresponding to the voice characteristics of the user.
When a user issues voice information to control an intelligent device, the useful voice information is usually mixed with background sounds, such as the sounds of household pets or of objects colliding, which can cause voice control to fail or be inaccurate;
therefore, the voice information is decomposed first. The decomposition method in the embodiment of the invention divides the voice information into several frequency bands according to sound frequency. The more frequency bands there are, the higher the control accuracy, but the more complex the algorithm and the greater the delay. To balance control accuracy against delay, the embodiment of the invention uses 4 frequency bands: below 300 Hz, 300 Hz-1 kHz, 1 kHz-3.4 kHz, and above 3.4 kHz.
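The band decomposition can be sketched with FFT masking. The sketch reads the four bands as below 300 Hz, 300 Hz to 1 kHz, 1 kHz to 3.4 kHz, and above 3.4 kHz; that reading, the function name `split_into_bands`, and the choice of FFT masking instead of a filter bank are all assumptions.

```python
import numpy as np

# Lower edges of the four bands in Hz (assumed reading of the embodiment).
BAND_EDGES_HZ = [0.0, 300.0, 1000.0, 3400.0]

def split_into_bands(signal, sample_rate):
    """Split a mono signal into four band-limited signals by masking its spectrum."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    # The last band runs up to (and includes) the Nyquist frequency.
    edges = BAND_EDGES_HZ + [sample_rate / 2.0 + 1.0]
    bands = []
    for low, high in zip(edges[:-1], edges[1:]):
        mask = (freqs >= low) & (freqs < high)
        bands.append(np.fft.irfft(spectrum * mask, n=len(signal)))
    return bands
```

Because the masks partition the spectrum, the four outputs add back up to the original signal, so the band that matches the user's voiceprint can be kept and the others discarded.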
S103: inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; the voice recognition model comprises a sampling layer and a voice recognition layer, and the sampling layer performs convolution and down-sampling processing to obtain a local information characteristic vector; inputting the obtained local information characteristic vector into a voice recognition layer of a voice recognition model for processing, and obtaining text information corresponding to the user utterance;
the sampling layer comprises a convolution layer and a pooling layer. The user utterance is convolved by the convolution layer in the sampling layer, the convolution result is input into the pooling layer in the sampling layer, and the convolution result is down-sampled by max pooling to obtain the local information feature vector. Max pooling takes the point with the maximum value in a local receptive field. The specific convolution and down-sampling operations are prior art and are not described in detail.
The embodiment of the invention adopts a speech recognition model that combines a long short-term memory (LSTM) network with the connectionist temporal classification (CTC) algorithm. The main process of speech recognition is as follows: (1) extracting acoustic features from the sound waveform; (2) converting the acoustic features into phonemes; (3) converting the phonemes into readable text information by decoding. The decoding process computes acoustic model and language model scores for a given phoneme sequence and several hypothesized word sequences, and takes the sequence with the highest overall score as the recognition result.
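As an illustration of the CTC side of the recognizer, the sketch below performs best-path (greedy) CTC decoding: it takes the highest-scoring symbol per frame, merges consecutive repeats, and drops blanks. This is a simplification, since the decoding described above also weighs language model scores over several hypotheses; the function name and the blank index of 0 are assumptions.

```python
import numpy as np

def ctc_greedy_decode(log_probs, labels, blank=0):
    """Best-path CTC decoding.

    log_probs: (T, K) per-frame scores over K symbols, e.g. LSTM outputs
    labels:    list of K symbols; labels[blank] is the CTC blank and is never emitted
    """
    best = np.argmax(log_probs, axis=1)
    decoded = []
    previous = blank
    for idx in best:
        # Emit a symbol only when it is not blank and differs from the previous frame.
        if idx != blank and idx != previous:
            decoded.append(labels[idx])
        previous = idx
    return decoded
```

A blank between two identical symbols is what allows a repeated phoneme to be emitted twice rather than merged.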
S104: performing semantic analysis according to the generated text information to acquire an entity in the text information to be analyzed; acquiring a structured entity vector corresponding to the entity according to the entity, wherein the structured entity vector is used for indicating the identification of the entity and the attribute of the entity; carrying out feature extraction on the structured entity vector to obtain entity features; fusing the entity features, the lexical features of the text and the syntactic features of the text to obtain semantic features of the text, wherein the semantic features are used for acquiring semantic information of the text, and the semantic information comprises intelligent equipment, operation and function states;
the invention realizes the fusion of heterogeneous information by fusing the entity characteristics, the lexical characteristics and the syntactic characteristics, combines the semantic information of three different vector spaces of the entity characteristics, the lexical characteristics and the syntactic characteristics together to identify the semantics, thereby improving the accuracy of semantic understanding;
the semantic information includes the intelligent device, the operation and the function state. For example, in the voice command "turn on the air conditioner, set the temperature to 26 degrees", "air conditioner" is the intelligent device, "turn on" is the operation, and "temperature to 26 degrees" is the function state; similarly, in "turn on the television, the channel is the central set", "television" is the intelligent device, "turn on" is the operation, and "channel is the central set" is the function state.
S105: identifying the intelligent device, the operation and the function state according to the semantic information, and sending a control instruction to the corresponding intelligent device, so that the corresponding intelligent device executes the corresponding operation and is set to the corresponding function state.
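The three-part semantic information can be represented as a small record. The sketch below fills it with a toy keyword matcher, which only stands in for the learned semantic analysis of S104; the vocabularies, the class name `SemanticInfo`, and the rule that the text after the first comma is the function state are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies; a real system would obtain these from the
# entity recognition and feature fusion of the semantic analysis step.
DEVICES = ("air conditioner", "television")
OPERATIONS = ("turn on", "turn off")

@dataclass
class SemanticInfo:
    device: Optional[str]
    operation: Optional[str]
    function_state: Optional[str]

def parse_command(text):
    """Toy keyword matcher that fills the (device, operation, state) triple."""
    lowered = text.lower()
    device = next((d for d in DEVICES if d in lowered), None)
    operation = next((o for o in OPERATIONS if o in lowered), None)
    # Assumption: everything after the first comma describes the function state.
    _, sep, tail = lowered.partition(",")
    state = tail.strip() if sep else None
    return SemanticInfo(device, operation, state or None)

info = parse_command("Turn on the air conditioner, set the temperature to 26 degrees")
```

Keeping the three fields separate lets the next step address the device, choose the operation, and set the state independently.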
After receiving the control instruction, the intelligent device executes corresponding actions and feeds back information, such as 'completed' or 'air conditioner opened, temperature set to 26 degrees' and the like;
receiving feedback information sent by the intelligent equipment according to the control instruction;
and when the feedback information shows that the control command operation is successful, replying the information and finishing the operation.
The reply message is, for example, "Received, thank you!"
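The instruction and feedback exchange of S105 can be sketched as follows. The device class is a stand-in for a real appliance reached over the home network, and the feedback dictionary, the class and function names, and the failure branch are assumptions; the source describes only the successful case.

```python
from dataclasses import dataclass

@dataclass
class ControlInstruction:
    device: str
    operation: str
    function_state: str

class FakeAirConditioner:
    """Stand-in device used only for illustration."""

    def __init__(self):
        self.powered = False
        self.state = None

    def execute(self, instruction):
        """Apply the instruction and return feedback information."""
        if instruction.operation == "turn on":
            self.powered = True
            self.state = instruction.function_state
            return {"success": True, "message": "completed"}
        return {"success": False, "message": "unsupported operation"}

def dispatch(instruction, devices):
    """Send the instruction, read the feedback, and build the reply."""
    device = devices.get(instruction.device)
    if device is None:
        return "device not found"  # assumed handling, not described in the source
    feedback = device.execute(instruction)
    if feedback["success"]:
        return "Received, thank you!"
    return "Operation failed: " + feedback["message"]
```

A call such as `dispatch(ControlInstruction("air conditioner", "turn on", "temperature 26 degrees"), {"air conditioner": FakeAirConditioner()})` returns the success reply once the device reports that the operation completed.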
As shown in fig. 2, another embodiment of the present invention provides an intelligent home device management and control system, including:
the voice input unit 201: receiving voice information of a user;
when a user needs to use an intelligent device, the user only needs to issue a voice instruction, such as 'turn on the air conditioner, set the temperature to 26 degrees' or 'turn on the television, set the channel to the central set'; the voice input unit first receives the voice information of the user;
in addition, the system further comprises a presetting unit, which is used for inputting the voice information of the user and identifying the voice features of the user according to the input voice information;
since the method is to analyze according to the voice characteristics of the user, the voice information of the user needs to be input in advance, the voice characteristics are recognized according to the input voice information, and certainly, the voice information of a plurality of users can be input, so that other users can also realize the control of the intelligent home device.
The voice separation unit 202: decomposing the voice information, extracting the characteristics of the decomposed voice information through a neural network model, identifying the voice characteristics of the user, and separating the user utterance;
decomposing the voice information, and performing feature extraction on the decomposed voice information through a neural network model to separate user utterances, which specifically comprises the following steps:
firstly, decomposing voice information according to different frequency bands;
extracting the characteristics of the decomposed voice information through a neural network model, comparing the user voice characteristics input in advance to determine the user voice characteristics, and identifying the user voice characteristics;
and separating the user words according to the voice frequency band corresponding to the voice characteristics of the user.
When a user issues voice information to control an intelligent device, the useful voice information is usually mixed with background sounds, such as the sounds of household pets or of objects colliding, which can cause voice control to fail or be inaccurate;
therefore, the voice information is decomposed first. The decomposition method in the embodiment of the invention divides the voice information into several frequency bands according to sound frequency. The more frequency bands there are, the higher the control accuracy, but the more complex the algorithm and the greater the delay. To balance control accuracy against delay, the embodiment of the invention uses 4 frequency bands: below 300 Hz, 300 Hz-1 kHz, 1 kHz-3.4 kHz, and above 3.4 kHz.
The voice recognition unit 203: inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; the voice recognition model comprises a sampling layer and a voice recognition layer, and the sampling layer performs convolution and down-sampling processing to obtain a local information characteristic vector; inputting the obtained local information characteristic vector into a voice recognition layer of a voice recognition model for processing, and obtaining text information corresponding to the user utterance;
the embodiment of the invention adopts a speech recognition model that combines a long short-term memory (LSTM) network with the connectionist temporal classification (CTC) algorithm. The main process of speech recognition is as follows: (1) extracting acoustic features from the sound waveform; (2) converting the acoustic features into phonemes; (3) converting the phonemes into readable text information by decoding. The decoding process computes acoustic model and language model scores for a given phoneme sequence and several hypothesized word sequences, and takes the sequence with the highest overall score as the recognition result.
The semantic analysis unit 204: performing semantic analysis according to the generated text information to acquire an entity in the text information to be analyzed; acquiring a structured entity vector corresponding to the entity according to the entity, wherein the structured entity vector is used for indicating the identification of the entity and the attribute of the entity; carrying out feature extraction on the structured entity vector to obtain entity features; fusing the entity features, the lexical features of the text and the syntactic features of the text to obtain semantic features of the text, wherein the semantic features are used for acquiring semantic information of the text, and the semantic information comprises intelligent equipment, operation and function states;
the invention fuses heterogeneous information by fusing the entity features, the lexical features and the syntactic features: the semantic information from these three different vector spaces is combined to identify the semantics, thereby improving the accuracy of semantic understanding;
the semantic information includes the intelligent device, the operation and the function state. For example, in the voice command "turn on the air conditioner, set the temperature to 26 degrees", "air conditioner" is the intelligent device, "turn on" is the operation, and "temperature to 26 degrees" is the function state. Likewise, in "turn on the television, the channel is the central set", "television" is the intelligent device, "turn on" is the operation, and "the channel is the central set" is the function state.
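The fusion of the three feature types described above can be sketched as follows. The text does not specify the fusion operator, so this sketch assumes the simplest option: each vector is L2-normalized and the three are concatenated. The function names and vector sizes are hypothetical.

```python
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = sum(x * x for x in vec) ** 0.5
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def fuse_features(
    entity: Sequence[float],
    lexical: Sequence[float],
    syntactic: Sequence[float],
) -> List[float]:
    """Fuse features from three vector spaces by normalized concatenation.

    Concatenation is an assumption of this sketch, not the patented method.
    """
    fused: List[float] = []
    for part in (entity, lexical, syntactic):
        fused.extend(l2_normalize(part))
    return fused


if __name__ == "__main__":
    semantic_feature = fuse_features([3.0, 4.0], [1.0], [0.0, 0.0])
    print(len(semantic_feature))
```

Normalizing before concatenation keeps one feature space from dominating the fused vector merely because its values are on a larger scale.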
The instruction operation unit 205: identifying the intelligent device, the operation and the function state according to the semantic information, and sending a control instruction to the corresponding intelligent device, so that the corresponding intelligent device executes the corresponding operation and is set to the corresponding function state.
After receiving the control instruction, the intelligent device executes the corresponding action and feeds back information, such as "completed" or "air conditioner opened, temperature set to 26 degrees";
a feedback unit: receiving the feedback information sent by the intelligent device in response to the control instruction;
and when the feedback information shows that the control instruction was executed successfully, replying with a message and ending the operation.
The reply message is, for example, "Received, thank you!"
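The interaction between the instruction operation unit, the intelligent device and the feedback unit can be sketched as follows. All names here (`SemanticInfo`, `Feedback`, `execute`, the toy `air_conditioner` handler) are hypothetical, and the handling of unknown devices and failed operations is an assumption, since the text only describes the success case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SemanticInfo:
    device: str                           # e.g. "air conditioner"
    operation: str                        # e.g. "turn on"
    function_state: Optional[str] = None  # e.g. "temperature to 26 degrees"


@dataclass
class Feedback:
    success: bool
    message: str


# A device handler receives (operation, function_state) and returns Feedback.
DeviceHandler = Callable[[str, Optional[str]], Feedback]

# Reply wording adapted from the example in the text.
SUCCESS_REPLY = "Received, thank you!"


def air_conditioner(operation: str, function_state: Optional[str]) -> Feedback:
    """Toy device that only understands "turn on"."""
    if operation != "turn on":
        return Feedback(False, f"unsupported operation: {operation}")
    detail = f", {function_state}" if function_state else ""
    return Feedback(True, f"air conditioner opened{detail}")


def execute(info: SemanticInfo, devices: Dict[str, DeviceHandler]) -> str:
    """Send the control instruction and turn the device feedback into a reply."""
    handler = devices.get(info.device)
    if handler is None:
        return f"unknown device: {info.device}"  # assumed behaviour
    feedback = handler(info.operation, info.function_state)
    if feedback.success:
        return SUCCESS_REPLY
    return f"Operation failed: {feedback.message}"  # assumed behaviour


if __name__ == "__main__":
    registry = {"air conditioner": air_conditioner}
    command = SemanticInfo("air conditioner", "turn on", "temperature to 26 degrees")
    print(execute(command, registry))
```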
As shown in fig. 3, an embodiment of the present invention provides an electronic device 300, which includes a memory 310, a processor 320, and a computer program 311 stored in the memory 310 and executable on the processor 320, where the processor 320, when executing the computer program 311, implements the intelligent home device management and control method provided by the embodiment of the present invention.
In a specific implementation, when the processor 320 executes the computer program 311, any of the embodiments corresponding to fig. 1 may be implemented.
The electronic device described in this embodiment is a device used to implement the embodiment of the present invention. Based on the method described in this embodiment, a person skilled in the art can understand the specific implementation of the electronic device and its variations, so how the electronic device implements the method is not described in detail here. Any device that a person skilled in the art uses to implement the method in this embodiment of the present invention falls within the protection scope of the present invention.
Referring to fig. 4, fig. 4 is a schematic diagram illustrating an embodiment of a computer-readable storage medium according to the present invention.
As shown in fig. 4, the present embodiment provides a computer-readable storage medium 400, on which a computer program 411 is stored, the computer program 411 implementing an intelligent home device management and control method provided by the present embodiment when executed by a processor;
in a specific implementation, the computer program 411 may implement any of the embodiments corresponding to fig. 1 when executed by a processor.
It should be noted that, in the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to relevant descriptions of other embodiments for parts that are not described in detail in a certain embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The invention provides an intelligent home device management and control method, which comprises: first receiving user voice information; decomposing the voice information, extracting features of the decomposed voice information through a neural network model, identifying the voice features of the user, and separating the user utterance; inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; performing semantic analysis on the generated text information to obtain semantic information, wherein the semantic information comprises the intelligent device, the operation and the function state; and identifying the intelligent device, the operation and the function state according to the semantic information, and sending a control instruction to the corresponding intelligent device, so that the corresponding intelligent device executes the corresponding operation and is set to the corresponding function state. In this method, the voice features of the user are first identified within the mixed noise and the user utterance is separated, after which speech recognition and semantic analysis are performed. The intelligent device, operation and function state in the user's voice instruction are thereby obtained accurately, so that the intelligent device can be controlled accurately and the user experience is improved.
According to the method provided by the invention, the voice information is decomposed and features of the decomposed voice information are extracted through a neural network model, so that the voice features of the user are identified and the user utterance is separated. Interference information is thereby eliminated and the user's device control utterance is isolated, which provides a foundation for accurate control of the intelligent device.
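The identification step summarized above can be sketched as follows, taking the decomposition to be by frequency band as recited in claim 1. The sketch assumes that a feature vector has already been extracted for each band by the neural network model, and that comparison with the pre-enrolled user voice features uses cosine similarity with a threshold. The similarity measure, the threshold value and all names are assumptions of this sketch.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_user_band(
    band_features: Dict[str, Sequence[float]],
    enrolled: Sequence[float],
    threshold: float = 0.8,
) -> Optional[str]:
    """Return the band whose features best match the enrolled user, if any.

    Returns None when no band reaches the (assumed) similarity threshold.
    """
    best_band: Optional[str] = None
    best_sim = threshold
    for band, features in band_features.items():
        sim = cosine_similarity(features, enrolled)
        if sim >= best_sim:
            best_band, best_sim = band, sim
    return best_band


if __name__ == "__main__":
    bands = {"low": [1.0, 0.0], "mid": [0.9, 0.1], "high": [0.0, 1.0]}
    print(select_user_band(bands, enrolled=[1.0, 0.0]))
```

The user utterance would then be separated from the audio in the selected band; that signal-processing step is outside the scope of this sketch.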
According to the method provided by the invention, semantic analysis is performed on the generated text information. The invention fuses heterogeneous information by fusing the entity features, the lexical features and the syntactic features: the semantic information from these three different vector spaces is combined to identify the semantics, so that the accuracy of semantic understanding is improved and the control accuracy of the intelligent device is further improved.
The above description is only an embodiment of the present invention, but the design concept of the present invention is not limited thereto, and any insubstantial modification made using this design concept shall be regarded as infringing the protection scope of the present invention.

Claims (8)

1. An intelligent home device management and control method is characterized by comprising the following steps:
receiving user voice information;
decomposing the voice information, extracting the characteristics of the decomposed voice information through a neural network model, identifying the voice characteristics of the user, and separating the user utterance;
decomposing the voice information, and performing feature extraction on the decomposed voice information through a neural network model to separate user utterances, which specifically comprises the following steps:
firstly, decomposing voice information according to different frequency bands;
extracting the characteristics of the decomposed voice information through a neural network model, comparing the user voice characteristics input in advance to determine the user voice characteristics, and identifying the user voice characteristics;
separating user words according to a voice frequency band corresponding to the voice characteristics of the user;
inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; the voice recognition model comprises a sampling layer and a voice recognition layer, and the sampling layer performs convolution and down-sampling processing to obtain a local information characteristic vector; inputting the obtained local information characteristic vector into a voice recognition layer of a voice recognition model for processing, and obtaining text information corresponding to the user utterance;
performing semantic analysis according to the generated text information to acquire an entity in the text information to be analyzed; acquiring a structured entity vector corresponding to the entity according to the entity, wherein the structured entity vector is used for indicating the identification of the entity and the attribute of the entity; carrying out feature extraction on the structured entity vector to obtain entity features; fusing the entity features, the lexical features of the text and the syntactic features of the text to obtain semantic features of the text, wherein the semantic features are used for acquiring semantic information of the text, and the semantic information comprises intelligent equipment, operation and function states;
and identifying the intelligent device, the operation and the function state according to the semantic information, sending a control instruction to the corresponding intelligent device, and causing the corresponding intelligent device to execute the corresponding operation and be set to the corresponding function state.
2. The intelligent home device management and control method according to claim 1, further comprising:
receiving feedback information sent by the intelligent equipment according to the control instruction;
and when the feedback information shows that the control command operation is successful, replying the information and finishing the operation.
3. The intelligent home device management and control method according to claim 1, wherein all the steps are preceded by:
and inputting the voice information of the user, and identifying the voice characteristics of the user according to the input voice information.
4. An intelligent home device management and control system, characterized by comprising:
a voice input unit: receiving user voice information;
a voice separation unit: decomposing the voice information, extracting the characteristics of the decomposed voice information through a neural network model, identifying the voice characteristics of the user, and separating the user utterance;
in the voice separation unit, decomposing the voice information, and performing feature extraction on the decomposed voice information through a neural network model to separate the user utterance, specifically comprises:
firstly, decomposing voice information according to different frequency bands;
extracting the characteristics of the decomposed voice information through a neural network model, comparing the user voice characteristics input in advance to determine the user voice characteristics, and identifying the user voice characteristics;
separating user words according to a voice frequency band corresponding to the voice characteristics of the user;
a voice recognition unit: inputting the user utterance into a speech recognition model for speech recognition to generate corresponding text information; the voice recognition model comprises a sampling layer and a voice recognition layer, and the sampling layer performs convolution and down-sampling processing to obtain a local information characteristic vector; inputting the obtained local information characteristic vector into a voice recognition layer of a voice recognition model for processing, and obtaining text information corresponding to the user utterance;
a semantic analysis unit: performing semantic analysis according to the generated text information to acquire an entity in the text information to be analyzed; acquiring a structured entity vector corresponding to the entity according to the entity, wherein the structured entity vector is used for indicating the identification of the entity and the attribute of the entity; carrying out feature extraction on the structured entity vector to obtain entity features; fusing the entity features, the lexical features of the text and the syntactic features of the text to obtain semantic features of the text, wherein the semantic features are used for acquiring semantic information of the text, and the semantic information comprises intelligent equipment, operation and function states;
an instruction operation unit: identifying the intelligent device, the operation and the function state according to the semantic information, sending a control instruction to the corresponding intelligent device, and causing the corresponding intelligent device to execute the corresponding operation and be set to the corresponding function state.
5. The intelligent home device management and control system according to claim 4, further comprising:
a feedback unit: receiving feedback information sent by the intelligent equipment according to the control instruction;
and when the feedback information shows that the control command operation is successful, replying the information and finishing the operation.
6. The intelligent home device management and control system according to claim 4, further comprising:
a presetting unit: and inputting the voice information of the user, and identifying the voice characteristics of the user according to the input voice information.
7. An electronic device comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, the processor implementing the intelligent home device management and control method according to any one of claims 1 to 3 when executing the computer program.
8. A computer-readable storage medium on which a computer program is stored, the program implementing the intelligent home device management and control method according to any one of claims 1 to 3 when executed by a processor.
CN202111147183.6A 2021-09-29 2021-09-29 Intelligent home device management and control method and system Active CN113593565B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111147183.6A CN113593565B (en) 2021-09-29 2021-09-29 Intelligent home device management and control method and system

Publications (2)

Publication Number Publication Date
CN113593565A CN113593565A (en) 2021-11-02
CN113593565B true CN113593565B (en) 2021-12-17

Family

ID=78242520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111147183.6A Active CN113593565B (en) 2021-09-29 2021-09-29 Intelligent home device management and control method and system

Country Status (1)

Country Link
CN (1) CN113593565B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114925158A (en) * 2022-03-15 2022-08-19 青岛海尔科技有限公司 Sentence text intention recognition method and device, storage medium and electronic device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6219645B1 (en) * 1999-12-02 2001-04-17 Lucent Technologies, Inc. Enhanced automatic speech recognition using multiple directional microphones
CN101944359A (en) * 2010-07-23 2011-01-12 杭州网豆数字技术有限公司 Voice recognition method facing specific crowd
CN108320742A (en) * 2018-01-31 2018-07-24 广东美的制冷设备有限公司 Voice interactive method, smart machine and storage medium
CN109410927A (en) * 2018-11-29 2019-03-01 北京蓦然认知科技有限公司 Offline order word parses the audio recognition method combined, device and system with cloud
CN109754804A (en) * 2019-02-21 2019-05-14 珠海格力电器股份有限公司 A kind of sound control method, device, storage medium and smart home system
CN112823341A (en) * 2018-10-05 2021-05-18 三菱电机株式会社 Voice operation support system, voice operation system, voice processing device, voice operation support method, and program
CN113205817A (en) * 2021-07-06 2021-08-03 明品云(北京)数据科技有限公司 Speech semantic recognition method, system, device and medium

Also Published As

Publication number Publication date
CN113593565A (en) 2021-11-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant