CN113778226A - Infrared AI smart glasses for controlling a smart home based on speech recognition technology - Google Patents

Infrared AI smart glasses for controlling a smart home based on speech recognition technology

Info

Publication number
CN113778226A
Authority
CN
China
Prior art keywords
module
voice signal
sub
signal
infrared
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110987518.9A
Other languages
Chinese (zh)
Inventor
雷鸣
刘建曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Hengbida Industrial Co ltd
Original Assignee
Jiangxi Hengbida Industrial Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Hengbida Industrial Co ltd filed Critical Jiangxi Hengbida Industrial Co ltd
Priority to CN202110987518.9A priority Critical patent/CN113778226A/en
Publication of CN113778226A publication Critical patent/CN113778226A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses infrared AI smart glasses for controlling a smart home based on speech recognition technology, comprising: an acquisition module for acquiring a first voice signal of a user; a processing module for preprocessing the first voice signal to obtain a second voice signal; an analysis module for extracting text information from the second voice signal, performing semantic analysis on the text information to obtain an analysis result, and generating an infrared control instruction for the smart home according to the analysis result; and a first control module for regulating the smart home based on the infrared control instruction. Beneficial effects: by recognizing the user's voice, the glasses control the smart home devices in the room, improving the user experience and achieving intelligent control.

Description

Infrared AI smart glasses for controlling a smart home based on speech recognition technology
Technical Field
The invention relates to the technical field of smart glasses, and in particular to infrared AI smart glasses for controlling a smart home based on speech recognition technology.
Background
Wearable devices are gradually entering people's daily lives, and smart glasses in particular can bring many conveniences to users. In the related art, smart glasses offer some notification functions, for example displaying an incoming-call or short-message notification from a bound terminal, or reminding a user to relieve fatigue after wearing the glasses for a long time. However, these notifications are very simple and provide limited convenience, so how to further extend the functions of smart glasses has become a technical problem to be solved.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art described above. Therefore, the invention aims to provide infrared AI smart glasses for controlling a smart home based on speech recognition technology, which can control indoor smart home devices by recognizing the user's voice, improve the user experience, and achieve intelligent control.
In order to achieve the above object, an embodiment of the present invention provides infrared AI intelligent glasses for controlling an intelligent home based on a voice recognition technology, including:
the acquisition module is used for acquiring a first voice signal of a user;
the processing module is used for preprocessing the first voice signal to obtain a second voice signal;
the analysis module is used for extracting text information in the second voice signal, performing semantic analysis on the text information to obtain an analysis result, and generating an infrared control instruction for the smart home according to the analysis result;
and the first control module is used for regulating and controlling the smart home based on the infrared control instruction.
Further, the infrared AI smart glasses for controlling a smart home based on speech recognition technology also include:
the judging module is used for judging whether the first voice signal meets a preset condition or not after the acquiring module acquires the first voice signal, and sending the first voice signal to the processing module when the first voice signal is determined to meet the preset condition; otherwise, the first voice signal is subjected to elimination processing.
Further, the judging module comprises:
the extraction unit is used for extracting the characteristics of the first voice signal to obtain the voiceprint characteristics of the first voice signal;
the matching unit is used for matching the voiceprint features with preset voiceprint features, calculating to obtain a matching degree, and when the matching degree is determined to be larger than the preset matching degree, indicating that the first voice signal meets a preset condition; otherwise, the first voice signal does not meet the preset condition.
Further, the generating of the infrared control instruction for the smart home according to the analysis result includes:
obtaining type information of the smart home device according to the analysis result, querying a preset library of smart-home infrared code values according to the type information to obtain an infrared code value corresponding to the type information, and generating an infrared control instruction according to the infrared code value.
Further, the infrared AI smart glasses for controlling a smart home based on speech recognition technology also include:
the Bluetooth module is used for establishing Bluetooth pairing connection with the user terminal and receiving music data sent by the user terminal;
and the playing module is used for playing the music data.
Further, the processing module comprises:
a suppression module to:
performing fast Fourier transform on the first voice signal to obtain a first frequency domain voice signal, performing nonlinear processing on the first frequency domain voice signal, and performing signal segmentation processing on the nonlinear-processed first frequency domain voice signal to obtain a plurality of sub-first frequency domain voice signals;
obtaining the frequency of each sub-first frequency domain voice signal, and screening out the sub-first frequency domain signals of which the frequency is less than a preset frequency to obtain a first set;
screening out the sub first frequency domain signals with the frequency greater than or equal to the preset frequency to obtain a second set;
acquiring and sequencing the first power of each sub-first frequency domain signal in the first set, screening out the maximum first power, and taking the maximum first power as a target power;
obtaining a second power of each sub-first frequency domain signal in the second set, comparing the second power with the target power, and performing power suppression processing on the sub-first frequency domain signals of which the second power is greater than the target power;
generating a second frequency domain signal according to the sub first frequency domain signals in the first set, the sub first frequency domain signals which are not subjected to power suppression processing in the second set and the sub first frequency domain signals which are subjected to power suppression processing, and performing inverse fast Fourier transform to obtain voice signals to be enhanced;
the enhancement module is connected with the suppression module and used for performing framing processing on the voice signal to be enhanced to obtain a plurality of frames of sub voice signals to be enhanced and respectively obtain the type of each frame of sub voice signals to be enhanced; the types include unvoiced frames and voiced frames;
taking the sub-to-be-enhanced voice signal with the type of unvoiced frame as a reserved signal;
taking the sub-to-be-enhanced voice signal with the type of the voiced frame as a signal to be processed;
calculating the spectrum envelope of each signal to be processed, extracting a formant on each spectrum envelope, and acquiring the amplitude of each formant; wherein, one signal to be processed corresponds to one spectrum envelope line;
determining an enhancement coefficient of a Hanning window filter according to the amplitude, and inputting a corresponding signal to be processed into the Hanning window filter for enhancement processing according to the enhancement coefficient;
and generating a second voice signal according to the reserved signal and the to-be-processed signal after the enhancement processing.
Further, the respectively obtaining the type of each frame of the sub voice signal to be enhanced includes:
and respectively inputting each frame of sub voice signals to be enhanced into a classification model trained in advance, and outputting the type corresponding to each frame of sub voice signals to be enhanced.
Further, the infrared AI smart glasses for controlling a smart home based on speech recognition technology also include:
the power adjusting module is used for adjusting the receiving power of the Bluetooth module;
and the second control module is used for calculating the target receiving power of the Bluetooth module when the user terminal transmits the music data to the Bluetooth module, and controlling the power regulation module to regulate the receiving power of the Bluetooth module according to the target receiving power.
Further, the analysis module includes:
the text generation module is used for recognizing the second voice signal based on a voice recognition technology to obtain a first text;
the text processing module is connected with the text generation module and is used for:
judging whether the first text contains a modal particle (filler word), and deleting the modal particle when it is determined that the first text contains one;
performing word segmentation on the first text with the modal particles removed to obtain a plurality of first participles, respectively judging whether each first participle exists in a preset correct word bank, and screening out the first participles that exist in the preset correct word bank as effective participles; screening out the first participles that do not exist in the preset correct word bank as participles to be processed;
inputting each word to be processed into a pre-trained error correction model, outputting a correct word corresponding to each word to be processed, and performing replacement processing on the corresponding word to be processed according to the correct word;
generating a second text according to the effective word segmentation and the word segmentation to be processed after the replacement processing;
the keyword extraction module is connected with the text processing module and is used for:
performing word segmentation processing on the second text to obtain a plurality of second words, and respectively obtaining attribute information of each second word; the attribute information comprises the part of speech, the length and the position of the participle in the second text;
evaluating the corresponding second participles according to the attribute information to obtain an evaluation value of each second participle;
respectively comparing the evaluation value of each second participle with a preset evaluation value, and screening out the second participles with the evaluation values larger than the preset evaluation value as keywords;
and the semantic analysis module is connected with the keyword extraction module and used for performing semantic analysis on the keywords to obtain an analysis result.
Further, the playing module comprises a loudspeaker.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a block diagram of infrared AI smart glasses for controlling smart home based on voice recognition technology according to the present invention;
FIG. 2 is a diagram of the main body of infrared AI smart glasses;
fig. 3 is a block diagram of infrared AI smart glasses for controlling smart home based on voice recognition technology according to an embodiment of the present invention;
fig. 4 is a block diagram of infrared AI smart glasses for controlling smart home based on voice recognition technology according to an embodiment of the present invention.
Reference numerals:
the system comprises an acquisition module 1, a processing module 2, an analysis module 3, a first control module 4, a lens 5, a spectacle frame 6, a nose support 7, spectacle legs 8, a suppression module 9, an enhancement module 10, a text generation module 11, a text processing module 12, a keyword extraction module 13 and a semantic analysis module 14.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
The following describes infrared AI smart glasses for controlling a smart home based on a voice recognition technology according to an embodiment of the present invention with reference to fig. 1 to 4.
As shown in fig. 1, the infrared AI smart glasses for controlling a smart home based on speech recognition technology include:
the acquisition module 1 is used for acquiring a first voice signal of a user;
the processing module 2 is used for preprocessing the first voice signal to obtain a second voice signal;
the analysis module 3 is used for extracting text information in the second voice signal, performing semantic analysis on the text information to obtain an analysis result, and generating an infrared control instruction for the smart home according to the analysis result;
and the first control module 4 is used for regulating and controlling the smart home based on the infrared control instruction.
The working principle of the scheme is as follows: the acquisition module 1 is used for acquiring a first voice signal of a user; the processing module 2 is used for preprocessing the first voice signal to obtain a second voice signal; the analysis module 3 is used for extracting text information in the second voice signal, performing semantic analysis on the text information to obtain an analysis result, and generating an infrared control instruction for the smart home according to the analysis result; the first control module 4 is used for regulating and controlling the smart home based on the infrared control instruction.
The beneficial effects of the above scheme are as follows: the processing module 2 preprocesses the first voice signal so that the resulting second voice signal is clearer, which improves the accuracy of the final recognition result; the analysis module 3 extracts text information from the second voice signal, performs semantic analysis on the text information to obtain an analysis result, and generates an infrared control instruction for the smart home according to the analysis result; the first control module 4 regulates the smart home based on the infrared control instruction, so that the smart home devices in the room can be controlled, the user experience is improved, and intelligent control is achieved.
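To make the data flow between the four modules concrete, the following is a minimal Python sketch of the pipeline. All class and method names (SmartGlassesPipeline, capture, preprocess, to_ir_command, send) are illustrative assumptions and do not come from the patent.

```python
class SmartGlassesPipeline:
    """Chains the four modules: acquisition (1), processing (2), analysis (3), control (4)."""

    def __init__(self, acquisition, processing, analysis, controller):
        self.acquisition = acquisition   # acquisition module 1
        self.processing = processing     # processing module 2
        self.analysis = analysis         # analysis module 3
        self.controller = controller     # first control module 4

    def handle_voice_command(self):
        first_signal = self.acquisition.capture()                    # first voice signal of the user
        second_signal = self.processing.preprocess(first_signal)     # preprocessed second voice signal
        ir_instruction = self.analysis.to_ir_command(second_signal)  # infrared control instruction
        self.controller.send(ir_instruction)                         # regulate the smart home device
```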
As shown in fig. 2, the infrared AI smart glasses for controlling a smart home based on speech recognition technology include an infrared AI smart glasses main body, where the main body includes a lens 5, a frame 6, a nose pad 7 and temples 8; the lens 5 is embedded in the frame 6; the nose pad 7 is arranged on the frame 6; and the temples 8 are hinged to the frame 6.
The working principle and beneficial effects of this scheme are as follows: the infrared AI smart glasses main body consists of the lens 5, the frame 6, the nose pad 7 and the temples 8; the lens 5 is embedded in the frame 6; the nose pad 7 is arranged on the frame 6; and the temples 8 are hinged to the frame 6.
According to some embodiments of the present invention, the infrared AI smart glasses for controlling smart home based on the voice recognition technology further include:
the judging module is used for judging whether the first voice signal meets a preset condition or not after the acquiring module 1 acquires the first voice signal, and sending the first voice signal to the processing module 2 when the first voice signal is determined to meet the preset condition; otherwise, the first voice signal is subjected to elimination processing.
The working principle of the scheme is as follows: the judging module judges whether the first voice signal meets a preset condition after the acquiring module 1 acquires the first voice signal, and sends the first voice signal to the processing module 2 when the first voice signal is determined to meet the preset condition; otherwise, the first voice signal is subjected to elimination processing.
The beneficial effects of the above scheme are as follows: setting a preset condition for the first voice signal ties the control function to the user's own voice and further improves the user experience.
According to some embodiments of the present invention, the judging module of the infrared AI smart glasses for controlling a smart home based on speech recognition technology includes:
the extraction unit is used for extracting the characteristics of the first voice signal to obtain the voiceprint characteristics of the first voice signal;
the matching unit is used for matching the voiceprint features with preset voiceprint features, calculating to obtain a matching degree, and when the matching degree is determined to be larger than the preset matching degree, indicating that the first voice signal meets a preset condition; otherwise, the first voice signal does not meet the preset condition.
The working principle of the scheme is as follows: the extraction unit is used for extracting the characteristics of the first voice signal to obtain the voiceprint characteristics of the first voice signal; the matching unit is used for matching the voiceprint features with preset voiceprint features, calculating to obtain a matching degree, and when the matching degree is determined to be larger than the preset matching degree, indicating that the first voice signal meets a preset condition; otherwise, the first voice signal does not meet the preset condition.
The beneficial effects of the above scheme are as follows: every person's voiceprint features are different, so calculating the matching degree between the extracted voiceprint features and the preset voiceprint features increases the accuracy of screening the first voice signal.
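A minimal sketch of the extraction and matching units follows, assuming MFCC statistics as the voiceprint feature and cosine similarity as the matching degree; the patent does not specify either choice, and the 0.85 threshold merely stands in for the preset matching degree.

```python
import numpy as np
import librosa


def extract_voiceprint(signal: np.ndarray, sr: int) -> np.ndarray:
    """Extraction unit: derive a fixed-length voiceprint feature from the first voice signal."""
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=20)
    return mfcc.mean(axis=1)  # average over time to get one vector per utterance


def matches_preset(voiceprint: np.ndarray, preset: np.ndarray, threshold: float = 0.85) -> bool:
    """Matching unit: compare against the preset voiceprint and apply the preset matching degree."""
    similarity = float(np.dot(voiceprint, preset) /
                       (np.linalg.norm(voiceprint) * np.linalg.norm(preset) + 1e-12))
    return similarity > threshold  # preset condition met only above the preset matching degree
```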
According to some embodiments of the present invention, generating the infrared control instruction for the smart home according to the analysis result includes:
obtaining type information of the smart home device according to the analysis result, querying a preset library of smart-home infrared code values according to the type information to obtain an infrared code value corresponding to the type information, and generating an infrared control instruction according to the infrared code value.
The working principle of the scheme is as follows: the type information of the smart home device is obtained according to the analysis result, the preset library of smart-home infrared code values is queried according to the type information to obtain the infrared code value corresponding to the type information, and the infrared control instruction is generated according to the infrared code value.
The beneficial effects of the above scheme are as follows: obtaining the type information of the smart home device from the analysis result and querying the preset infrared code value library by that type makes the obtained infrared code value more accurate, which improves the accuracy of the finally generated infrared control instruction.
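A minimal sketch of the preset infrared code value library and the instruction generation step is given below; the device types, actions and hexadecimal code values are made-up placeholders, not codes taken from the patent.

```python
# Hypothetical preset smart-home infrared code value library, keyed by (device type, action).
IR_CODE_LIBRARY = {
    ("air_conditioner", "power_on"): 0x20DF10EF,
    ("air_conditioner", "power_off"): 0x20DF906F,
    ("curtain", "open"): 0x00FF629D,
    ("curtain", "close"): 0x00FFA857,
}


def build_ir_instruction(device_type: str, action: str) -> dict:
    """Query the preset IR code value library by device type and wrap the code as a control instruction."""
    try:
        code = IR_CODE_LIBRARY[(device_type, action)]
    except KeyError:
        raise ValueError(f"No IR code registered for {device_type}/{action}")
    return {"device": device_type, "action": action, "ir_code": code}
```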
According to some embodiments of the present invention, the infrared AI smart glasses for controlling smart home based on the voice recognition technology further include:
the Bluetooth module is used for establishing Bluetooth pairing connection with the user terminal and receiving music data sent by the user terminal;
and the playing module is used for playing the music data.
The working principle of the scheme is as follows: the Bluetooth module is used for establishing Bluetooth pairing connection with the user terminal and receiving music data sent by the user terminal; the playing module is used for playing the music data.
The beneficial effects of the above scheme are as follows: playing music data makes the glasses more versatile and intelligent.
As shown in fig. 3, according to some embodiments of the invention, the processing module 2 comprises:
a suppression module 9 for:
performing fast Fourier transform on the first voice signal to obtain a first frequency domain voice signal, performing nonlinear processing on the first frequency domain voice signal, and performing signal segmentation processing on the nonlinear-processed first frequency domain voice signal to obtain a plurality of sub-first frequency domain voice signals;
obtaining the frequency of each sub-first frequency domain voice signal, and screening out the sub-first frequency domain signals of which the frequency is less than a preset frequency to obtain a first set;
screening out the sub first frequency domain signals with the frequency greater than or equal to the preset frequency to obtain a second set;
acquiring and sequencing the first power of each sub-first frequency domain signal in the first set, screening out the maximum first power, and taking the maximum first power as a target power;
obtaining a second power of each sub-first frequency domain signal in the second set, comparing the second power with the target power, and performing power suppression processing on the sub-first frequency domain signals of which the second power is greater than the target power;
generating a second frequency domain signal according to the sub first frequency domain signals in the first set, the sub first frequency domain signals which are not subjected to power suppression processing in the second set and the sub first frequency domain signals which are subjected to power suppression processing, and performing inverse fast Fourier transform to obtain voice signals to be enhanced;
the enhancement module 10 is connected with the suppression module 9 and is used for performing framing processing on the voice signal to be enhanced to obtain a plurality of frames of sub voice signals to be enhanced and respectively obtain the type of each frame of sub voice signals to be enhanced; the types include unvoiced frames and voiced frames;
taking the sub-to-be-enhanced voice signal with the type of unvoiced frame as a reserved signal;
taking the sub-to-be-enhanced voice signal with the type of the voiced frame as a signal to be processed;
calculating the spectrum envelope of each signal to be processed, extracting a formant on each spectrum envelope, and acquiring the amplitude of each formant; wherein, one signal to be processed corresponds to one spectrum envelope line;
determining an enhancement coefficient of a Hanning window filter according to the amplitude, and inputting a corresponding signal to be processed into the Hanning window filter for enhancement processing according to the enhancement coefficient;
and generating a second voice signal according to the reserved signal and the to-be-processed signal after the enhancement processing.
The working principle of the scheme is as follows: the suppression module 9 is configured to perform fast fourier transform on the first voice signal to obtain a first frequency-domain voice signal, perform nonlinear processing on the first frequency-domain voice signal, and perform signal segmentation processing on the nonlinear-processed first frequency-domain voice signal to obtain a plurality of sub-first frequency-domain voice signals; obtaining the frequency of each sub-first frequency domain voice signal, and screening out the sub-first frequency domain signals of which the frequency is less than a preset frequency to obtain a first set; screening out the sub first frequency domain signals with the frequency greater than or equal to the preset frequency to obtain a second set; acquiring and sequencing the first power of each sub-first frequency domain signal in the first set, screening out the maximum first power, and taking the maximum first power as a target power; obtaining a second power of each sub-first frequency domain signal in the second set, comparing the second power with the target power, and performing power suppression processing on the sub-first frequency domain signals of which the second power is greater than the target power; generating a second frequency domain signal according to the sub first frequency domain signals in the first set, the sub first frequency domain signals which are not subjected to power suppression processing in the second set and the sub first frequency domain signals which are subjected to power suppression processing, and performing inverse fast Fourier transform to obtain voice signals to be enhanced; the enhancement module 10 is connected with the suppression module 9 and is used for performing framing processing on the voice signal to be enhanced to obtain a plurality of frames of sub voice signals to be enhanced and respectively obtain the type of each frame of sub voice signals to be enhanced; the types include unvoiced frames and voiced frames; taking the sub-to-be-enhanced voice signal with the type of unvoiced frame as a reserved signal; taking the sub-to-be-enhanced voice signal with the type of the voiced frame as a signal to be processed; calculating the spectrum envelope of each signal to be processed, extracting a formant on each spectrum envelope, and acquiring the amplitude of each formant; wherein, one signal to be processed corresponds to one spectrum envelope line; determining an enhancement coefficient of a Hanning window filter according to the amplitude, and inputting a corresponding signal to be processed into the Hanning window filter for enhancement processing according to the enhancement coefficient; and generating a second voice signal according to the reserved signal and the to-be-processed signal after the enhancement processing.
The beneficial effects of the above scheme are as follows: while the first voice signal is being collected, its quality is affected by factors such as ambient noise and aging of the collecting device, so this scheme provides a method for improving the quality of the voice signal that is fully automatic, fast and accurate. The suppression module 9 performs a fast Fourier transform on the first voice signal to obtain a first frequency-domain voice signal, applies nonlinear processing to it, and segments the nonlinearly processed signal into a plurality of sub first frequency-domain voice signals. The frequency of each sub signal is obtained; the sub signals whose frequency is below the preset frequency form the first set, i.e. the low-frequency components of the first frequency-domain voice signal, while the sub signals whose frequency is greater than or equal to the preset frequency form the second set, i.e. the high-frequency components, which usually carry the noise. The first powers of the sub signals in the first set are obtained and sorted, and the largest first power is taken as the target power, which serves as the noise-suppression power. The second power of each sub signal in the second set is then compared with the target power, and any sub signal whose second power exceeds the target power has its power suppressed down to the target power. In this way the noise can be suppressed effectively without affecting the speech. A second frequency-domain signal is generated from the sub signals in the first set, the unsuppressed sub signals in the second set and the suppressed sub signals, and an inverse fast Fourier transform yields the voice signal to be enhanced. After the noise has been handled, the details of the voice signal are enhanced: the enhancement module 10 frames the voice signal to be enhanced into multiple frames of sub voice signals to be enhanced and obtains the type of each frame, the types being unvoiced frames and voiced frames. Frames of the unvoiced type are kept as reserved signals, and frames of the voiced type are treated as signals to be processed. The spectral envelope of each signal to be processed is calculated (one signal to be processed corresponds to one spectral envelope), the formants on each envelope are extracted and their amplitudes obtained, the enhancement coefficient of a Hanning window filter is determined from the amplitudes, and the corresponding signal to be processed is fed into the Hanning window filter for enhancement according to that coefficient. The second voice signal is then generated from the reserved signals and the enhanced signals to be processed. Determining the enhancement coefficient of the Hanning window filter from the formant amplitudes on the spectral envelope enhances the speech and improves its intelligibility and sound quality; the method is also computationally simple and robust, which improves the accuracy of speech recognition and of the final semantic recognition, makes the finally generated smart home control instruction more accurate, and improves the user experience.
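A minimal sketch of the suppression stage described above is given below. It assumes the spectrum is split into equal-width sub-bands, the low-band peak power is taken as the target power, and any high-band sub-band exceeding it is scaled down to that power; the unspecified nonlinear-processing step is omitted, and the preset frequency and band count are illustrative values.

```python
import numpy as np


def suppress_high_band_noise(signal: np.ndarray, sr: int,
                             preset_freq: float = 2000.0,
                             n_bands: int = 32) -> np.ndarray:
    spectrum = np.fft.rfft(signal)                        # first frequency-domain voice signal
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)

    bands = np.array_split(np.arange(len(spectrum)), n_bands)        # sub first frequency-domain signals
    band_freqs = [freqs[idx].mean() for idx in bands]
    band_powers = [float(np.sum(np.abs(spectrum[idx]) ** 2)) for idx in bands]

    low = [i for i, f in enumerate(band_freqs) if f < preset_freq]    # first set (low frequency)
    high = [i for i, f in enumerate(band_freqs) if f >= preset_freq]  # second set (high frequency)
    target_power = max(band_powers[i] for i in low)       # largest first power, used as target power

    for i in high:
        if band_powers[i] > target_power:                 # power suppression processing
            scale = np.sqrt(target_power / band_powers[i])
            spectrum[bands[i]] *= scale

    return np.fft.irfft(spectrum, n=len(signal))          # voice signal to be enhanced
```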
According to some embodiments of the present invention, obtaining the type of each frame of the sub voice signal to be enhanced includes:
and respectively inputting each frame of sub voice signals to be enhanced into a classification model trained in advance, and outputting the type corresponding to each frame of sub voice signals to be enhanced.
The working principle and beneficial effects of this scheme are as follows: inputting each frame of the sub voice signal to be enhanced into a pre-trained classification model makes the type assigned to each frame more accurate; the classification model is a neural network model trained on sample voice signals and their corresponding types.
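The patent relies on a pre-trained neural network classifier; as a stand-in, the sketch below labels frames with a simple zero-crossing-rate and energy heuristic. This rule-based proxy and its thresholds are assumptions for illustration only, not the patent's model.

```python
import numpy as np


def classify_frame(frame: np.ndarray, zcr_threshold: float = 0.25,
                   energy_threshold: float = 1e-3) -> str:
    zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2.0   # zero-crossing rate per sample
    energy = float(np.mean(frame ** 2))                    # short-time frame energy
    if energy > energy_threshold and zcr < zcr_threshold:
        return "voiced"      # voiced frame -> signal to be processed (formant enhancement)
    return "unvoiced"        # unvoiced frame -> reserved signal
```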
According to some embodiments of the present invention, the infrared AI smart glasses for controlling smart home based on the voice recognition technology further include:
the power adjusting module is used for adjusting the receiving power of the Bluetooth module;
the second control module is used for calculating the target receiving power of the Bluetooth module when the user terminal transmits the music data to the Bluetooth module, and controlling the power adjusting module to adjust the receiving power of the Bluetooth module according to the target receiving power;
the calculating the target receiving power of the Bluetooth module comprises the following steps:
calculating the transmission loss γ of the music data during transmission, as shown in formula (1):
[Formula (1) appears only as an image in the original publication (Figure BDA0003231268170000121); it expresses γ as a function of v, f and L.]
wherein v is the transmission speed of the music data, f is the transmission frequency of the music data, and L is the distance between the user terminal and the Bluetooth module;
calculating the target receiving power P1 of the Bluetooth module according to the transmission loss γ of the music data during transmission, as shown in formula (2):
[Formula (2) appears only as an image in the original publication (Figure BDA0003231268170000122); it gives P1 in terms of P2 and γ.]
wherein P2 is the transmit power of the user terminal.
The working principle of the scheme is as follows: the power adjusting module is used for adjusting the receiving power of the Bluetooth module; the second control module is used for calculating the target receiving power of the Bluetooth module when the user terminal transmits the music data to the Bluetooth module, and controlling the power adjusting module to adjust the receiving power of the Bluetooth module according to the target receiving power.
The beneficial effects of the above scheme are as follows: the receiving power of the Bluetooth module and the transmission loss of the music data determine how quickly the music data are received. If the transmission loss is too large and the receiving power of the Bluetooth module is too small, the data cannot be received in time, the music cannot be played in time, and the user experience suffers greatly. Calculating the target receiving power of the Bluetooth module is therefore necessary. When the second control module calculates the target receiving power, it takes into account the transmit power of the user terminal, the transmission loss of the music data during transmission and the transmission speed of the music data, which makes the calculated target receiving power more accurate. The power adjusting module is then controlled to adjust the receiving power of the Bluetooth module according to the target receiving power, which guarantees the receiving speed of the Bluetooth module, the timeliness of music playback, and a better user experience.
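Because formulas (1) and (2) survive only as images in the source, their exact expressions cannot be recovered here; the sketch below assumes a standard free-space path-loss form for γ and a simple link budget P1 = P2 - γ in decibels. Both are assumptions, not the patent's own formulas.

```python
import math


def transmission_loss_db(distance_m: float, freq_hz: float, speed_mps: float = 3.0e8) -> float:
    """Assumed free-space path loss gamma as a function of distance L, frequency f and speed v."""
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / speed_mps)


def target_receive_power_dbm(tx_power_dbm: float, distance_m: float, freq_hz: float) -> float:
    """Assumed target receiving power P1 of the Bluetooth module given terminal transmit power P2."""
    gamma = transmission_loss_db(distance_m, freq_hz)
    return tx_power_dbm - gamma


# Example: a phone transmitting at 4 dBm on the 2.4 GHz band from 2 m away gives
# gamma of about 46.1 dB and hence a target receiving power of roughly -42 dBm.
```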
As shown in fig. 4, according to some embodiments of the invention, the analysis module 3 comprises:
the text generation module 11 is configured to recognize the second speech signal based on a speech recognition technology to obtain a first text;
a text processing module 12, connected to the text generating module 11, configured to:
judging whether the first text contains a modal particle (filler word), and deleting the modal particle when it is determined that the first text contains one;
performing word segmentation on the first text with the modal particles removed to obtain a plurality of first participles, respectively judging whether each first participle exists in a preset correct word bank, and screening out the first participles that exist in the preset correct word bank as effective participles; screening out the first participles that do not exist in the preset correct word bank as participles to be processed;
inputting each word to be processed into a pre-trained error correction model, outputting a correct word corresponding to each word to be processed, and performing replacement processing on the corresponding word to be processed according to the correct word;
generating a second text according to the effective word segmentation and the word segmentation to be processed after the replacement processing;
a keyword extraction module 13, connected to the text processing module 12, configured to:
performing word segmentation processing on the second text to obtain a plurality of second words, and respectively obtaining attribute information of each second word; the attribute information comprises the part of speech, the length and the position of the participle in the second text;
evaluating the corresponding second participles according to the attribute information to obtain an evaluation value of each second participle;
respectively comparing the evaluation value of each second participle with a preset evaluation value, and screening out the second participles with the evaluation values larger than the preset evaluation value as keywords;
and the semantic analysis module 14 is connected with the keyword extraction module 13 and is used for performing semantic analysis on the keywords to obtain an analysis result.
The working principle of the scheme is as follows: the text generation module 11 recognizes the second voice signal based on speech recognition technology to obtain a first text; the text processing module 12 judges whether the first text contains modal particles and deletes them when they are present; it then performs word segmentation on the first text with the modal particles removed to obtain a plurality of first participles, judges whether each first participle exists in the preset correct word bank, screens out the first participles that exist in the preset correct word bank as effective participles and those that do not as participles to be processed, inputs each participle to be processed into a pre-trained error correction model, outputs the correct participle corresponding to each participle to be processed, replaces the corresponding participle to be processed with the correct participle, and generates a second text from the effective participles and the replaced participles to be processed; the keyword extraction module 13 performs word segmentation on the second text to obtain a plurality of second participles and obtains the attribute information of each second participle, where the attribute information includes the part of speech and length of the participle and its position in the second text; it evaluates each second participle according to the attribute information to obtain an evaluation value, compares the evaluation value of each second participle with a preset evaluation value, and screens out the second participles whose evaluation value is greater than the preset evaluation value as keywords; the semantic analysis module 14 performs semantic analysis on the keywords to obtain the analysis result.
The beneficial effects of the above scheme are as follows: once the second voice signal has been acquired, the most important task is to generate the infrared control instruction from it accurately, and this scheme provides a concrete implementation. The text generation module 11 recognizes the second voice signal based on speech recognition technology to obtain the first text. The text processing module 12 judges whether the first text contains modal particles, i.e. sentence-final filler words, and deletes them when present, which simplifies the subsequent recognition steps and speeds up recognition. It then segments the first text with the modal particles removed into a plurality of first participles, judges whether each first participle exists in the preset correct word bank, keeps the first participles found in the word bank as effective participles and treats those not found as participles to be processed, inputs each participle to be processed into a pre-trained error correction model, outputs the corresponding correct participle and replaces the participle to be processed with it; the error correction model is a neural network model trained on sample participles to be processed and their correct counterparts. A second text is generated from the effective participles and the replaced participles to be processed; this second text is more accurate, which improves the accuracy of the keywords extracted later. The keyword extraction module 13 segments the second text into a plurality of second participles and obtains the attribute information of each one, namely the part of speech, the length and the position of the participle in the second text. Parts of speech include proper nouns, common nouns, idioms, verbs, adjectives and adverbs; the length of a participle is its number of characters; the position indicates where the participle appears in the second text. Each second participle is evaluated according to its attribute information to obtain an evaluation value, with the part-of-speech scores ranked from high to low as: proper nouns, common nouns, adjectives, verbs, idioms, adverbs, and finally other words. The evaluation value of each second participle is compared with the preset evaluation value, and the second participles whose evaluation value exceeds the preset value are screened out as keywords; keywords screened according to these evaluation values are more accurate. The semantic analysis module 14 then performs semantic analysis on the keywords to obtain the analysis result, which not only reduces the complexity of semantic recognition but also improves its accuracy.
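A minimal sketch of the keyword-scoring step follows, assuming the second text has already been segmented into (word, part_of_speech, position) tuples by an external tokenizer. The numeric scores and weights are illustrative assumptions, since the patent only fixes the ordering of the part-of-speech scores.

```python
from typing import List, Tuple

# Ordering follows the patent: proper noun > common noun > adjective > verb > idiom > adverb > other.
POS_SCORES = {
    "proper_noun": 6, "common_noun": 5, "adjective": 4,
    "verb": 3, "idiom": 2, "adverb": 1,   # any other part of speech scores 0
}


def score_token(word: str, pos: str, position: int, text_len: int) -> float:
    """Evaluation value of a second participle from its part of speech, length and position."""
    pos_score = POS_SCORES.get(pos, 0)
    length_score = len(word)                              # longer participles carry more content
    position_score = 1.0 - position / max(text_len, 1)    # earlier participles weighted slightly higher
    return pos_score + 0.5 * length_score + position_score


def extract_keywords(tokens: List[Tuple[str, str, int]],
                     preset_evaluation: float = 5.0) -> List[str]:
    """Keep the second participles whose evaluation value exceeds the preset evaluation value."""
    text_len = len(tokens)
    return [word for word, pos, position in tokens
            if score_token(word, pos, position, text_len) > preset_evaluation]
```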
According to some embodiments of the invention, the playback module comprises a speaker.
The working principle and the beneficial effects of the scheme are as follows: the music played through the loudspeaker is clearer.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. Infrared AI smart glasses for controlling a smart home based on speech recognition technology, characterized by comprising:
the acquisition module is used for acquiring a first voice signal of a user;
the processing module is used for preprocessing the first voice signal to obtain a second voice signal;
the analysis module is used for extracting text information in the second voice signal, performing semantic analysis on the text information to obtain an analysis result, and generating an infrared control instruction for the smart home according to the analysis result;
and the first control module is used for regulating and controlling the smart home based on the infrared control instruction.
2. The infrared AI intelligent glasses for controlling smart home based on voice recognition technology according to claim 1, further comprising:
the judging module is used for judging whether the first voice signal meets a preset condition or not after the acquiring module acquires the first voice signal, and sending the first voice signal to the processing module when the first voice signal is determined to meet the preset condition; otherwise, the first voice signal is subjected to elimination processing.
3. The infrared AI smart glasses for controlling a smart home based on speech recognition technology according to claim 2, wherein the judging module comprises:
the extraction unit is used for extracting the characteristics of the first voice signal to obtain the voiceprint characteristics of the first voice signal;
the matching unit is used for matching the voiceprint features with preset voiceprint features, calculating to obtain a matching degree, and when the matching degree is determined to be larger than the preset matching degree, indicating that the first voice signal meets a preset condition; otherwise, the first voice signal does not meet the preset condition.
4. The infrared AI intelligent glasses for controlling an intelligent home based on a voice recognition technology according to claim 1, wherein the generating an infrared control instruction for the intelligent home according to the analysis result comprises:
obtaining type information of the smart home device according to the analysis result, querying a preset library of smart-home infrared code values according to the type information to obtain an infrared code value corresponding to the type information, and generating an infrared control instruction according to the infrared code value.
5. The infrared AI intelligent glasses for controlling smart home based on voice recognition technology according to claim 1, further comprising:
the Bluetooth module is used for establishing Bluetooth pairing connection with the user terminal and receiving music data sent by the user terminal;
and the playing module is used for playing the music data.
6. The infrared AI smart glasses based on speech recognition technology for smart home as claimed in claim 1, wherein said processing module comprises:
a suppression module to:
performing fast Fourier transform on the first voice signal to obtain a first frequency domain voice signal, performing nonlinear processing on the first frequency domain voice signal, and performing signal segmentation processing on the nonlinear-processed first frequency domain voice signal to obtain a plurality of sub-first frequency domain voice signals;
obtaining the frequency of each sub-first frequency domain voice signal, and screening out the sub-first frequency domain signals of which the frequency is less than a preset frequency to obtain a first set;
screening out the sub first frequency domain signals with the frequency greater than or equal to the preset frequency to obtain a second set;
acquiring and sequencing the first power of each sub-first frequency domain signal in the first set, screening out the maximum first power, and taking the maximum first power as a target power;
obtaining a second power of each sub-first frequency domain signal in the second set, comparing the second power with the target power, and performing power suppression processing on the sub-first frequency domain signals of which the second power is greater than the target power;
generating a second frequency domain signal according to the sub first frequency domain signals in the first set, the sub first frequency domain signals which are not subjected to power suppression processing in the second set and the sub first frequency domain signals which are subjected to power suppression processing, and performing inverse fast Fourier transform to obtain voice signals to be enhanced;
the enhancement module is connected with the suppression module and used for performing framing processing on the voice signal to be enhanced to obtain a plurality of frames of sub voice signals to be enhanced and respectively obtain the type of each frame of sub voice signals to be enhanced; the types include unvoiced frames and voiced frames;
taking the sub-to-be-enhanced voice signal with the type of unvoiced frame as a reserved signal;
taking the sub-to-be-enhanced voice signal with the type of the voiced frame as a signal to be processed;
calculating the spectrum envelope of each signal to be processed, extracting a formant on each spectrum envelope, and acquiring the amplitude of each formant; wherein, one signal to be processed corresponds to one spectrum envelope line;
determining an enhancement coefficient of a Hanning window filter according to the amplitude, and inputting a corresponding signal to be processed into the Hanning window filter for enhancement processing according to the enhancement coefficient;
and generating a second voice signal according to the reserved signal and the to-be-processed signal after the enhancement processing.
7. The infrared AI smart glasses for controlling a smart home based on speech recognition technology according to claim 6, wherein the respectively obtaining the type of each frame of the sub voice signal to be enhanced comprises:
and respectively inputting each frame of sub voice signals to be enhanced into a classification model trained in advance, and outputting the type corresponding to each frame of sub voice signals to be enhanced.
8. The infrared AI smart glasses for controlling a smart home based on speech recognition technology according to claim 5, further comprising:
the power adjusting module is used for adjusting the receiving power of the Bluetooth module;
and the second control module is used for calculating the target receiving power of the Bluetooth module when the user terminal transmits the music data to the Bluetooth module, and controlling the power regulation module to regulate the receiving power of the Bluetooth module according to the target receiving power.
9. The infrared AI smart glasses based on speech recognition technology for smart home as claimed in claim 1, wherein the analysis module comprises:
the text generation module is used for recognizing the second voice signal based on a voice recognition technology to obtain a first text;
the text processing module is connected with the text generation module and is used for:
judging whether the first text contains a modal particle (filler word), and deleting the modal particle when it is determined that the first text contains one;
performing word segmentation on the first text with the modal particles removed to obtain a plurality of first participles, respectively judging whether each first participle exists in a preset correct word bank, and screening out the first participles that exist in the preset correct word bank as effective participles; screening out the first participles that do not exist in the preset correct word bank as participles to be processed;
inputting each word to be processed into a pre-trained error correction model, outputting a correct word corresponding to each word to be processed, and performing replacement processing on the corresponding word to be processed according to the correct word;
generating a second text according to the effective word segmentation and the word segmentation to be processed after the replacement processing;
the keyword extraction module is connected with the text processing module and is used for:
performing word segmentation processing on the second text to obtain a plurality of second words, and respectively obtaining attribute information of each second word; the attribute information comprises the part of speech, the length and the position of the participle in the second text;
evaluating the corresponding second participles according to the attribute information to obtain an evaluation value of each second participle;
respectively comparing the evaluation value of each second participle with a preset evaluation value, and screening out the second participles with the evaluation values larger than the preset evaluation value as keywords;
and the semantic analysis module is connected with the keyword extraction module and used for performing semantic analysis on the keywords to obtain an analysis result.
10. The infrared AI smart glasses based on voice recognition technology for controlling smart homes according to claim 1, wherein the playing module comprises a speaker.
CN202110987518.9A 2021-08-26 2021-08-26 Infrared AI intelligent glasses based on speech recognition technology control intelligence house Pending CN113778226A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110987518.9A CN113778226A (en) 2021-08-26 2021-08-26 Infrared AI intelligent glasses based on speech recognition technology control intelligence house

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110987518.9A CN113778226A (en) 2021-08-26 2021-08-26 Infrared AI intelligent glasses based on speech recognition technology control intelligence house

Publications (1)

Publication Number Publication Date
CN113778226A true CN113778226A (en) 2021-12-10

Family

ID=78839472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110987518.9A Pending CN113778226A (en) 2021-08-26 2021-08-26 Infrared AI intelligent glasses based on speech recognition technology control intelligence house

Country Status (1)

Country Link
CN (1) CN113778226A (en)

Citations (16)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779527A (en) * 2012-08-07 2012-11-14 无锡成电科大科技发展有限公司 Speech enhancement method on basis of enhancement of formants of window function
CN103680514A (en) * 2013-12-13 2014-03-26 广州华多网络科技有限公司 Method and system for processing signals in network voice communication
WO2017156893A1 (en) * 2016-03-18 2017-09-21 深圳Tcl数字技术有限公司 Voice control method and smart television
CN107997314A (en) * 2018-01-03 2018-05-08 潘荣兰 A kind of intelligent medical bracelet
CN108171951A (en) * 2018-01-03 2018-06-15 李文清 A kind of Intelligent home remote controller based on bluetooth
CN108917283A (en) * 2018-07-12 2018-11-30 四川虹美智能科技有限公司 A kind of intelligent refrigerator control method, system, intelligent refrigerator and cloud server
CN109410919A (en) * 2018-11-28 2019-03-01 深圳朗昇贸易有限公司 A kind of intelligent home control system
CN110675870A (en) * 2019-08-30 2020-01-10 深圳绿米联创科技有限公司 Voice recognition method and device, electronic equipment and storage medium
CN111128213A (en) * 2019-12-10 2020-05-08 展讯通信(上海)有限公司 Noise suppression method and system for processing in different frequency bands
CN111276141A (en) * 2020-01-19 2020-06-12 珠海格力电器股份有限公司 Voice interaction method and device, storage medium, processor and electronic equipment
CN111681655A (en) * 2020-05-21 2020-09-18 北京声智科技有限公司 Voice control method and device, electronic equipment and storage medium
CN112382296A (en) * 2020-10-12 2021-02-19 绍兴市爱丘科技有限公司 Method and device for voiceprint remote control of wireless audio equipment
CN112230877A (en) * 2020-10-16 2021-01-15 惠州Tcl移动通信有限公司 Voice operation method and device, storage medium and electronic equipment
CN112016275A (en) * 2020-10-30 2020-12-01 北京淇瑀信息科技有限公司 Intelligent error correction method and system for voice recognition text and electronic equipment
CN112201246A (en) * 2020-11-19 2021-01-08 深圳市欧瑞博科技股份有限公司 Intelligent control method and device based on voice, electronic equipment and storage medium
CN113050445A (en) * 2021-03-23 2021-06-29 安徽阜南县向发工艺品有限公司 Voice control system for smart home

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Weigang: "Communication Principles and Communication Technology" (《通信原理与通信技术》), vol. 1, Xidian University Press, pages 336-339 *

Similar Documents

Publication Publication Date Title
CN109326302B (en) Voice enhancement method based on voiceprint comparison and generation of confrontation network
US20110161082A1 (en) Methods and systems for assessing and improving the performance of a speech recognition system
CN107945790A (en) A kind of emotion identification method and emotion recognition system
CN111862934B (en) Method for improving speech synthesis model and speech synthesis method and device
KR20080023030A (en) On-line speaker recognition method and apparatus for thereof
CN110070865A (en) A kind of guidance robot with voice and image identification function
CN111370030A (en) Voice emotion detection method and device, storage medium and electronic equipment
US5995924A (en) Computer-based method and apparatus for classifying statement types based on intonation analysis
CN116665669A (en) Voice interaction method and system based on artificial intelligence
CN111489763B (en) GMM model-based speaker recognition self-adaption method in complex environment
WO2023185006A1 (en) Working mode setting method and apparatus
CN114783464A (en) Cognitive detection method and related device, electronic equipment and storage medium
CN111798846A (en) Voice command word recognition method and device, conference terminal and conference terminal system
CN112017690B (en) Audio processing method, device, equipment and medium
CN110853669A (en) Audio identification method, device and equipment
Saba et al. The effects of Lombard perturbation on speech intelligibility in noise for normal hearing and cochlear implant listeners
CN112185357A (en) Device and method for simultaneously recognizing human voice and non-human voice
CN115104151A (en) Offline voice recognition method and device, electronic equipment and readable storage medium
WO2023185004A1 (en) Tone switching method and apparatus
CN113778226A (en) Infrared AI intelligent glasses based on speech recognition technology control intelligence house
Shufang Design of an automatic english pronunciation error correction system based on radio magnetic pronunciation recording devices
CN112767961B (en) Accent correction method based on cloud computing
CN112466335B (en) English pronunciation quality evaluation method based on accent prominence
Salim et al. Automatic Speaker Verification System for Dysarthria Patients.
KR102512570B1 (en) Method for screening psychiatric disorder based on voice and apparatus therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination