WO2015180430A1 - Voice control method and system - Google Patents

Voice control method and system

Info

Publication number
WO2015180430A1
WO2015180430A1 PCT/CN2014/091948 CN2014091948W WO2015180430A1 WO 2015180430 A1 WO2015180430 A1 WO 2015180430A1 CN 2014091948 W CN2014091948 W CN 2014091948W WO 2015180430 A1 WO2015180430 A1 WO 2015180430A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio signal
controlled terminal
voice data
noise device
noise
Prior art date
Application number
PCT/CN2014/091948
Other languages
English (en)
French (fr)
Inventor
程德凯
吕艳红
Original Assignee
广东美的制冷设备有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 广东美的制冷设备有限公司
Publication of WO2015180430A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech

Definitions

  • the present invention relates to the field of voice control technologies, and in particular, to a voice control method and system.
  • in existing voice control, the controlled device mainly relies on a built-in voice pickup device, which picks up the voice control command issued by the user and recognizes it. Based on the preset mapping relationship between voice control commands and control codes, the control code corresponding to the received voice control command is determined, and voice control of the controlled terminal is implemented in response to that control code.
  • however, when a sound device such as a television set or a radio is playing near the controlled terminal, the voice control command received by the controlled terminal may include noise data played by the sound device, which causes errors in recognizing the voice control command and results in low voice control accuracy.
  • the invention provides a voice control method, comprising:
  • the controlled terminal detects the voice data sent by the noise device in real time or periodically, and acquires the playing time point of the detected voice data;
  • when a first audio signal is detected, the controlled terminal determines the voice data corresponding to the playing time point that matches the current time point, and converts the determined voice data into the second audio signal;
  • the controlled terminal culls a portion of the first audio signal that matches the second audio signal to generate a voice control instruction
  • the controlled terminal is responsive to the generated voice control instruction.
  • the step of the controlled terminal culling the portion of the first audio signal that matches the second audio signal to generate a voice control command comprises:
  • the controlled terminal adjusts the second audio signal according to preset attenuation information
  • the controlled terminal compares the adjusted second audio signal with the first audio signal
  • the controlled terminal culls a portion of the first audio signal that matches the adjusted second audio signal, and generates the voice control command.
  • the step of the controlled terminal adjusting the second audio signal according to preset attenuation information comprises:
  • the controlled terminal determines a corresponding noise device identifier according to the received voice data
  • the controlled terminal acquires the attenuation information corresponding to the determined noise device identifier according to the mapping relationship between the preset attenuation information and the noise device identifier;
  • the controlled terminal adjusts the corresponding second audio signal according to the acquired attenuation information.
  • before the step in which the controlled terminal detects, in real time or periodically, the voice data sent by the noise device and acquires the playing time point of the detected voice data, the method further includes:
  • when detecting an audio play command sent by the noise device, the controlled terminal determines the play time and intensity information of the third audio signal to be played based on the received audio play command;
  • the generated attenuation information is saved in association with the identification of the noise device or the identification of the environmental noise pickup device.
  • the method includes:
  • the controlled terminal responds to the first audio signal when the first audio signal is detected and the playback time point corresponding to the received voice data does not match the current time point.
  • the step of the controlled terminal responding to the first audio signal includes:
  • the present invention also provides a voice control method, including:
  • the controlled terminal sends a voice data acquisition request to the noise device, so that the noise device, on receiving the request, feeds back to the controlled terminal the voice data whose playing time point matches the current time point;
  • on receiving the voice data fed back by the noise device, the controlled terminal converts the voice data into a second audio signal;
  • the controlled terminal culls a portion of the first audio signal that matches the second audio signal to generate a voice control instruction
  • the controlled terminal is responsive to the generated voice control instruction.
  • the step of the controlled terminal culling the portion of the first audio signal that matches the second audio signal to generate a voice control command comprises:
  • the controlled terminal adjusts the second audio signal according to preset attenuation information
  • the controlled terminal compares the adjusted second audio signal with the first audio signal
  • the controlled terminal culls a portion of the first audio signal that matches the second audio signal, and generates the voice control command.
  • the step of the controlled terminal adjusting the second audio signal according to preset attenuation information comprises:
  • the controlled terminal determines a corresponding noise device identifier according to the received voice data
  • the controlled terminal acquires the attenuation information corresponding to the determined noise device identifier according to the mapping relationship between the preset attenuation information and the noise device identifier;
  • the controlled terminal adjusts the corresponding second audio signal according to the acquired attenuation information.
  • before the step in which the controlled terminal sends a voice data acquisition request to the noise device, so that the noise device, on receiving the request, feeds back the voice data whose playing time point matches the current time point, the method further includes:
  • when detecting an audio play command sent by the noise device, the controlled terminal determines the play time and intensity information of the third audio signal to be played based on the received audio play command;
  • the generated attenuation information is saved in association with the identification of the noise device or the identification of the environmental noise pickup device.
  • after the step in which the controlled terminal sends a voice data acquisition request to the noise device, so that the noise device, on receiving the request, feeds back the voice data whose playing time point matches the current time point, the method further includes:
  • the first audio signal is responsive to the voice data fed back by the noise device.
  • the step of responding to the first audio signal includes:
  • the present invention also provides a voice control system, including:
  • a detecting module configured to detect voice data sent by the noise device in real time or periodically
  • An obtaining module configured to acquire a playback time point of the detected voice data
  • a determining module configured to determine, when the first audio signal is detected, the voice data corresponding to the playing time point that matches the current time point;
  • a conversion module configured to convert the determined voice data into a second audio signal
  • a processing module configured to cancel a portion of the first audio signal that matches the adjusted second audio signal to generate a voice control instruction
  • the processing module comprises:
  • An adjusting unit configured to adjust the second audio signal according to preset attenuation information
  • a comparison unit configured to compare the adjusted second audio signal with the first audio signal
  • a processing unit configured to cull the portion of the first audio signal that matches the second audio signal, and generate the voice control instruction.
  • the adjusting unit comprises:
  • a determining subunit configured to determine a corresponding noise device identifier according to the received voice data
  • an obtaining subunit configured to acquire, according to a mapping relationship between the preset attenuation information and the noise device identifier, the attenuation information corresponding to the determined noise device identifier;
  • an adjusting subunit configured to adjust the corresponding second audio signal according to the obtained attenuation information.
  • the determining module is further configured to: when an audio play instruction sent by the noise device is detected, determine, according to the received audio play instruction, the play time and intensity information of the third audio signal to be played;
  • the obtaining module is further configured to: when receiving the third audio signal played by the noise device, acquire the intensity information of the received third audio signal, the receiving time of the third audio signal, and the identifier of the noise device or the identifier of the environmental noise pickup device that receives the third audio signal of the noise device.
  • the system further includes a generation module and a storage module, wherein the generation module is configured to generate corresponding attenuation information based on the intensity information and the receiving time of the received third audio signal and on the determined play time and intensity information of the third audio signal to be played; the storage module is configured to save the generated attenuation information in association with the identifier of the noise device or the identifier of the environmental noise pickup device.
  • the response module is further configured to respond to the first audio signal when the first audio signal is detected and the playback time point corresponding to the received voice data does not match the current time point.
  • the response module is further configured to extract a voiceprint feature of the first audio signal, compare the extracted voiceprint feature with the preset voiceprint feature, and respond to the received first audio signal when the extracted voiceprint feature matches the preset voiceprint feature.
  • in the voice control method and system provided by the present invention, the noise device transmits voice data to the controlled terminal in real time or periodically and adds a play time point to the voice data, so that when the controlled terminal detects the first audio signal, it determines the voice data corresponding to the playing time point that matches the current time point and converts the determined voice data into a second audio signal. The controlled terminal then culls the portion of the first audio signal that matches the second audio signal to generate a new voice control command and responds to the generated voice control command. Rejecting the second audio signal produced by the noise device from the received first audio signal improves the accuracy of voice control.
  • FIG. 1 is a schematic diagram of a hardware structure of a preferred embodiment of a controlled terminal that implements voice control according to the present invention
  • FIG. 2 is a schematic diagram of functional modules of the voice control system of FIG. 1;
  • FIG. 3 is a schematic flowchart of a first embodiment of a voice control method according to the present invention.
  • FIG. 4 is a schematic flow chart of a second embodiment of a voice control method according to the present invention.
  • FIG. 1 is a schematic diagram of a hardware structure of a preferred embodiment of a controlled terminal for implementing voice control according to the present invention.
  • the controlled terminal 1 includes a processing unit 11, a storage unit 12, a transceiver unit 13, a voice pickup device 14, and a voice control system 15.
  • the controlled terminal 1 can be a terminal that can implement voice control, such as an air conditioner and a television.
  • the voice pickup device 14 is configured to convert an electrical signal generated by the vibration into an audio signal when receiving the vibration of the sound wave.
  • the storage unit 12 is configured to store the voice control system 15 and its operation data, and a mapping relationship between the voice control instruction and the control code. It should be emphasized that the storage unit 12 may be a single storage device or a collective name of a plurality of different storage devices, and details are not described herein.
  • the transceiver unit 13 is configured to receive audio data sent by the noise device under the control of the processing unit 11. The transceiver unit 13 may be a WIFI module, an infrared signal sending unit, a Bluetooth module, a wireless signal transmitter with a transmitting antenna, or any other suitable wireless signal transmitting unit (a WIFI module is preferred in this embodiment).
  • the processing unit 11 is configured to invoke and execute the voice control system 15 so as to: control the transceiver unit 13 to detect, in real time or periodically, the voice data sent by the noise device; acquire the playing time point of the voice data when the transceiver unit 13 detects it; when the first audio signal is detected, determine the voice data corresponding to the playing time point that matches the current time point and convert the determined voice data into the second audio signal; cull the portion of the first audio signal that matches the second audio signal to generate a new voice control command; and invoke the mapping relationship between voice control commands and control codes stored in the storage unit 12 to determine the control code corresponding to the generated voice control instruction and execute that control code.
  • the processing unit 11 and the storage unit 12 may be separate units, or may be integrated to form a controller, which is not described herein.
  • FIG. 2 is a schematic diagram of functional modules of the voice control system of FIG. 1.
  • the functional block diagram shown in FIG. 2 is merely an exemplary diagram of a preferred embodiment, and those skilled in the art can easily supplement new functional modules around the functional modules of the voice control system 15 shown in FIG. 2. The names of the functional modules are custom names that only assist in understanding the program function blocks of the voice control system 15 and do not limit the technical solution of the present invention; the core of the technical solution is the function that each named functional module is to achieve.
  • This embodiment provides a voice control system 15 including:
  • the detecting module 151 is configured to detect voice data sent by the noise device in real time or periodically;
  • in this embodiment, the noise device encodes the second audio signal to be played, or the second audio signal currently being played, according to a preset communication protocol to generate the corresponding voice data, adds the play time to the voice data, and sends the encoded voice data to the controlled device. For example, the second audio signal may be encoded using the WIFI communication protocol.
  • the obtaining module 152 is configured to acquire a playback time point of the detected voice data.
  • when the controlled device receives the voice data sent by the noise device, it can decode the voice data directly according to a preset communication protocol to obtain the corresponding playing time point, or it can obtain the corresponding playing time point from the message header of the voice data.
  • the determining module 153 is configured to determine, when the first audio signal is detected, the voice data corresponding to the playing time point that matches the current time point;
  • there is a certain time difference between the time point at which the detection module 151 detects the second audio signal played by the noise device and the time point at which the noise device plays that second audio signal. Therefore, the current time point matching the playing time point means that the difference between the current time point and the playing time point is less than or equal to a preset threshold.
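  • the matching rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `VoiceData` record, the `select_matching` helper, and the 0.2-second default threshold are hypothetical names and values introduced for the example, not taken from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoiceData:
    """Hypothetical record for one chunk of voice data sent by a noise device."""
    device_id: str
    play_time: float  # playing time point, in seconds
    payload: bytes


def select_matching(records: List[VoiceData], now: float,
                    threshold: float = 0.2) -> Optional[VoiceData]:
    """Return the record whose playing time point is closest to `now`,
    provided the difference is less than or equal to `threshold`."""
    candidates = [r for r in records if abs(now - r.play_time) <= threshold]
    if not candidates:
        return None  # nothing matches the current time point
    return min(candidates, key=lambda r: abs(now - r.play_time))
```

  • returning `None` when nothing falls inside the threshold leaves the caller free to handle the first audio signal without any noise reference.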
  • a conversion module 154 configured to convert the determined voice data into a second audio signal
  • a plurality of detection modules 151, such as wireless detection modules (for example a WIFI module or an infrared module) or wired interfaces (for example an RS425 interface or a serial interface), can be provided in the controlled terminal to receive the voice data sent by an environmental noise pickup device.
  • when the detection module 151 receives voice data, the conversion module 154 may determine the interface or module that received the voice data and decode the received voice data using the communication protocol corresponding to that interface or module, thereby converting the received voice data into a second audio signal.
  • the processing module 155 is configured to: cull the portion of the first audio signal that matches the second audio signal to generate a voice control instruction
  • the first audio signal received by the voice pickup device in the controlled terminal includes a voice control command sent by the user and an environmental noise (such as a second audio signal).
  • the culling may be implemented by waveform comparison, for example by comparing the waveform of the first audio signal with the waveform of the converted second audio signal and adjusting the waveform corresponding to the first audio signal according to the amplitude of the matching waveform.
  • the response module 156 is configured to respond to the generated voice control instruction.
  • the generated voice control command may be compared with the voice control commands in the pre-stored mapping relationship between voice control commands and control codes to find the matching entry; the control code corresponding to the generated voice control command is then determined from the matched entry and executed.
  • when the generated voice control command is compared with a pre-stored voice control command, the two are considered to match if the key tones match or if the number of matching key tones is greater than a preset threshold.
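  • as a rough sketch of this lookup, the following Python snippet represents each stored voice control command as a tuple of key tones and returns the control code of the best match. The table contents, the control code strings, and the `find_control_code` helper are hypothetical, and treating key tones as plain strings is an assumption made only for the example.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical mapping: key tones of a stored command -> control code.
COMMAND_TABLE: Dict[Tuple[str, ...], str] = {
    ("turn", "on", "cooling"): "0x01",
    ("turn", "off"): "0x02",
    ("raise", "temperature"): "0x03",
}


def find_control_code(key_tones: Sequence[str],
                      min_matches: int = 1) -> Optional[str]:
    """Return the control code of the stored command that shares the most
    key tones with the generated command. A stored command qualifies when
    all of its key tones are present, or when the number of matching key
    tones is greater than `min_matches`."""
    heard = set(key_tones)
    best_code, best_count = None, 0
    for stored, code in COMMAND_TABLE.items():
        count = len(heard & set(stored))
        full_match = count == len(stored)
        if (full_match or count > min_matches) and count > best_count:
            best_code, best_count = code, count
    return best_code
```

  • for instance, `find_control_code(["turn", "off"])` selects the entry whose key tones all match, while an unrelated utterance yields `None` and no control code is executed.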
  • in the voice control system provided in this embodiment, the noise device sends voice data to the controlled terminal in real time or periodically and adds a play time point to the voice data. When the first audio signal is detected, the determining module determines the voice data corresponding to the playing time point that matches the current time point, the conversion module converts the determined voice data into a second audio signal, the processing module culls the portion of the first audio signal that matches the second audio signal to generate a voice control instruction, and the response module responds to the generated voice control instruction. Culling the second audio signal produced by the noise device from the received first audio signal improves the accuracy of voice control.
  • the processing module 155 includes:
  • the adjusting unit 1551 is configured to adjust the second audio signal according to preset attenuation information
  • the comparing unit 1552 is configured to compare the adjusted second audio signal with the first audio signal
  • the processing unit 1553 is configured to cull the portion of the first audio signal that matches the second audio signal, and generate a voice control instruction.
  • the attenuation information includes the attenuation amplitude and the delay duration of the second audio signal. Since the position of the noise device is unchanged, the attenuation amplitude and the delay duration are constant, so they can be preset. The waveform of the second audio signal is adjusted according to the preset attenuation amplitude and delay duration, and the adjusted waveform is compared with the waveform of the received first audio signal.
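  • one simple way to picture this adjustment is shown below, assuming both signals are available as lists of samples at the same sample rate. The helpers `adjust_reference` and `cull`, the use of a multiplicative gain for the attenuation amplitude, and the sample-by-sample subtraction are assumptions of this sketch; the text itself only speaks of waveform comparison.

```python
from typing import List


def adjust_reference(reference: List[float], gain: float,
                     delay_samples: int) -> List[float]:
    """Scale the second audio signal by `gain` and delay it by
    `delay_samples`, zero-filling the start and keeping the length."""
    scaled = [gain * s for s in reference]
    delayed = [0.0] * delay_samples + scaled
    return delayed[:len(reference)]


def cull(first: List[float], adjusted: List[float]) -> List[float]:
    """Subtract the adjusted reference from the first audio signal,
    sample by sample; samples beyond the reference are left unchanged."""
    n = min(len(first), len(adjusted))
    return [first[i] - adjusted[i] for i in range(n)] + first[n:]
```

  • with an exact gain and delay the residual is the user's speech alone; in practice the preset values are only approximate, so some noise would remain.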
  • the adjusting unit 1551 includes:
  • a determining subunit configured to determine a corresponding noise device identifier according to the received voice data
  • an obtaining subunit configured to acquire, according to a mapping relationship between the preset attenuation information and the noise device identifier, the attenuation information corresponding to the determined noise device identifier;
  • an adjusting subunit configured to adjust the corresponding second audio signal according to the obtained attenuation information.
  • a plurality of noise devices may exist in the environment in which the controlled device is located. For example, when the controlled device is an air conditioner, an indoor television and a radio may both act as noise devices that interfere with voice control of the air conditioner.
  • it is therefore necessary to save, in the controlled terminal, the mapping relationship between the preset attenuation information and the environmental noise pickup device identifier or the noise device identifier.
  • the detecting module 151 may receive voice data sent by multiple noise devices at the same time. To distinguish the voice data sent by different noise devices, each noise device may add its noise device identifier to the voice data it sends.
  • the determining subunit determines the corresponding noise device identifier according to the voice data received by the detecting module 151; the obtaining subunit obtains the attenuation information corresponding to the determined identifier according to the mapping relationship between the preset attenuation information and the noise device identifier; and the adjusting subunit adjusts the corresponding second audio signal according to the obtained attenuation information. This keeps the attenuation adjustment of the second audio signal accurate, which in turn improves the accuracy of voice control.
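  • the per-device lookup can be illustrated with a small table keyed by noise device identifier. Everything named here is hypothetical: the `AttenuationInfo` fields, the identifiers, the numeric values, and the assumption that decoded voice data is a dictionary carrying a `device_id` entry.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AttenuationInfo:
    gain: float     # attenuation amplitude as a multiplicative factor
    delay_s: float  # delay duration in seconds


# Hypothetical preset mapping: noise device identifier -> attenuation info.
ATTENUATION_BY_DEVICE: Dict[str, AttenuationInfo] = {
    "tv-livingroom": AttenuationInfo(gain=0.4, delay_s=0.012),
    "radio-kitchen": AttenuationInfo(gain=0.15, delay_s=0.030),
}


def attenuation_for(voice_data: dict) -> Optional[AttenuationInfo]:
    """Read the noise device identifier carried in the voice data and
    return the attenuation information saved for it, if any."""
    device_id = voice_data.get("device_id")
    return ATTENUATION_BY_DEVICE.get(device_id)
```

  • an unknown identifier returns `None`, which a caller could treat as "no preset attenuation available" instead of applying another device's values.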
  • the determining module 153 is further configured to: when an audio play instruction sent by the noise device is detected, determine, according to the received audio play instruction, the play time and intensity information of the third audio signal to be played.
  • the obtaining module 152 is further configured to: when receiving the third audio signal played by the noise device, acquire the intensity information of the received third audio signal, the receiving time of the third audio signal, and an identifier of the noise device or an identifier of the environmental noise pickup device that receives the third audio signal of the noise device;
  • the system further includes a generation module and a storage module. The generation module is configured to generate corresponding attenuation information based on the intensity information and the receiving time of the received third audio signal and on the determined play time and intensity information of the third audio signal to be played;
  • the storage module is configured to save the generated attenuation information in association with the identifier of the noise device or the identifier of the environmental noise pickup device.
  • the attenuation information of the third audio signal may be determined when the controlled terminal receives only the third audio signal played by the noise device.
  • the noise device may send the play time and the intensity information of the third audio signal to the controlled terminal before playing the third audio signal, so that the controlled terminal can determine the play time and intensity information of the third audio signal to be received. The play time may be a time point, such as playing at 8:00, or a time interval, such as playing after 5 minutes.
  • when the received play time is a time interval, the controlled terminal determines the play time of the third audio signal based on the time point at which the interval was received and the interval itself.
  • the noise device can also send the play time and the intensity information to the controlled terminal after playing the third audio signal. The generating module generates corresponding attenuation information based on the intensity information and the receiving time of the received third audio signal and on the determined play time and intensity information of the third audio signal, and the storage module saves the generated attenuation information in association with the identifier of the noise device or the identifier of the environmental noise pickup device.
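  • the text does not give a formula for the attenuation information, so the following Python sketch makes two explicit assumptions: the attenuation amplitude is taken as the ratio of received to played intensity, and the delay duration as the difference between receiving time and play time. The names `Attenuation`, `resolve_play_time`, and `make_attenuation` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Attenuation:
    gain: float     # received intensity divided by played intensity (assumed)
    delay_s: float  # receiving time minus play time, in seconds (assumed)


def resolve_play_time(announced_at: float, play_time: float,
                      is_interval: bool) -> float:
    """If the noise device announced an interval ("play after 5 minutes"),
    add it to the time the announcement was received; otherwise the
    announced value is already a time point."""
    return announced_at + play_time if is_interval else play_time


def make_attenuation(played_intensity: float, play_time: float,
                     received_intensity: float,
                     receive_time: float) -> Attenuation:
    """Derive attenuation information from one calibration playback."""
    if played_intensity <= 0:
        raise ValueError("played intensity must be positive")
    return Attenuation(gain=received_intensity / played_intensity,
                       delay_s=receive_time - play_time)
```

  • the resulting record would then be saved under the noise device identifier or the environmental noise pickup device identifier, as the paragraph above describes.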
  • the response module 156 is further configured to respond to the first audio signal when the first audio signal is detected and the playback time point corresponding to the received voice data does not match the current time point.
  • in this case, where no second audio signal played by the noise device matches, the response module 156 can extract the voiceprint feature of the first audio signal, compare the extracted voiceprint feature with the preset voiceprint feature, and respond to the received first audio signal when the extracted voiceprint feature matches the preset voiceprint feature.
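  • the text does not say how voiceprint features are extracted or compared. Purely as an illustration, the sketch below assumes each voiceprint is already a fixed-length feature vector and uses cosine similarity with a hypothetical threshold of 0.9 to decide whether to respond; `cosine_similarity` and `should_respond` are names introduced here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_respond(extracted: Sequence[float], preset: Sequence[float],
                   threshold: float = 0.9) -> bool:
    """Respond to the first audio signal only when the extracted voiceprint
    is close enough to the preset voiceprint."""
    return cosine_similarity(extracted, preset) >= threshold
```

  • a real system would need a proper speaker-verification front end to produce the vectors; only the final matching decision is shown here.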
  • FIG. 3 is a schematic flowchart diagram of a voice control method according to a first embodiment of the present invention.
  • This embodiment provides a voice control method, including:
  • Step S10 the controlled terminal detects the voice data sent by the noise device in real time or periodically, and acquires the playing time point of the detected voice data;
  • in this embodiment, the noise device encodes the second audio signal to be played, or the second audio signal currently being played, according to a preset communication protocol to generate the corresponding voice data, adds the play time to the voice data, and sends the encoded voice data to the controlled device. For example, the second audio signal may be encoded using the WIFI communication protocol.
  • Step S20 when the first audio signal is detected, the controlled terminal determines the voice data corresponding to the playing time point that matches the current time point, and converts the determined voice data into the second audio signal;
  • when the controlled device receives the voice data sent by the noise device, it can decode the voice data directly according to a preset communication protocol to obtain the corresponding playing time point, or it can obtain the corresponding playing time point from the message header of the voice data.
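  • to make the idea of a play time point carried in a message header concrete, here is a minimal sketch using Python's standard `struct` module. The header layout (an 8-byte play time followed by a 1-byte identifier length) and the `encode` and `decode` helpers are invented for the example; the patent does not define a message format.

```python
import struct
from typing import Tuple

# Hypothetical header: play time (float64) and device id length (uint8),
# in network byte order, followed by the device id and the audio payload.
HEADER = struct.Struct("!dB")


def encode(play_time: float, device_id: str, payload: bytes) -> bytes:
    """Build one voice data message on the noise device side."""
    ident = device_id.encode("utf-8")
    return HEADER.pack(play_time, len(ident)) + ident + payload


def decode(message: bytes) -> Tuple[float, str, bytes]:
    """Recover the play time point, device id, and payload on the
    controlled terminal side."""
    play_time, id_len = HEADER.unpack_from(message, 0)
    start = HEADER.size
    device_id = message[start:start + id_len].decode("utf-8")
    return play_time, device_id, message[start + id_len:]
```

  • reading the play time point from a fixed-size header lets the controlled terminal decide whether a message is relevant before decoding the audio payload.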
  • there is a certain time difference between the time point at which the controlled terminal detects the second audio signal played by the noise device and the time point at which the noise device plays that second audio signal. Therefore, the current time point matching the playing time point means that the difference between the current time point and the playing time point is less than or equal to a preset threshold.
  • a plurality of receiving modules, such as wireless detection modules (for example a WIFI module or an infrared module) or wired interfaces (for example an RS425 interface or a serial interface), can be provided in the controlled terminal to receive the voice data sent by the environmental noise pickup device.
  • when receiving voice data, the controlled terminal may determine the interface or module that received the voice data and decode the received voice data using the communication protocol corresponding to that interface or module, so as to convert the received voice data into a second audio signal.
  • Step S30 the controlled terminal culls a portion of the first audio signal that matches the second audio signal to generate a voice control instruction.
  • the first audio signal received by the voice pickup device in the controlled terminal includes a voice control command sent by the user and an environmental noise (such as a second audio signal).
  • the controlled terminal culls the portion of the first audio signal that matches the second audio signal by means of waveform comparison, for example by comparing the waveform of the first audio signal with the waveform of the converted second audio signal and adjusting the waveform corresponding to the first audio signal according to the amplitude of the matching waveform.
  • Step S40 the controlled terminal responds to the generated voice control instruction.
  • the generated voice control command may be compared with the voice control commands in the pre-stored mapping relationship between voice control commands and control codes to find the matching entry; the control code corresponding to the generated voice control command is then determined from the matched entry and executed.
  • when the generated voice control command is compared with a pre-stored voice control command, the two are considered to match if the key tones match or if the number of matching key tones is greater than a preset threshold.
  • in the voice control method provided in this embodiment, the noise device sends voice data to the controlled terminal in real time or periodically and adds a play time point to the voice data, so that when the first audio signal is detected, the controlled terminal determines the voice data corresponding to the playing time point that matches the current time point and converts the determined voice data into a second audio signal. The controlled terminal culls the portion of the first audio signal that matches the second audio signal to generate a voice control command and responds to the generated voice control command. Culling the second audio signal produced by the noise device from the received first audio signal improves the accuracy of voice control.
  • the step S30 includes:
  • Step S31 the controlled terminal adjusts the second audio signal according to preset attenuation information
  • Step S32 the controlled terminal compares the adjusted second audio signal with the first audio signal
  • Step S33 the controlled terminal culls a portion of the first audio signal that matches the adjusted second audio signal, and generates a voice control instruction.
  • The second audio signal is attenuated on its way from the noise device to the controlled terminal. The attenuation information includes the attenuation amplitude and the delay duration of the second audio signal. Because the position of the noise device does not change, both values are constant, so they can be preset. The waveform of the second audio signal is adjusted according to the preset attenuation amplitude and delay duration, and the adjusted waveform is compared with the waveform of the received first audio signal.
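Steps S31 to S33 can be illustrated with a short sketch. It assumes that both signals are lists of samples at the same sample rate, that the attenuation amplitude is a linear gain, and that the delay is a whole number of samples. Plain time-domain subtraction is used here as a stand-in for the waveform comparison described in the source, which does not specify the culling algorithm; the function names are hypothetical.

```python
from typing import List


def adjust_reference(reference: List[float], gain: float, delay_samples: int) -> List[float]:
    """Step S31: scale the second audio signal and shift it later in time."""
    if delay_samples < 0:
        raise ValueError("delay_samples must be non-negative")
    delayed = [0.0] * delay_samples + list(reference)
    # Keep the original length so the result stays aligned with the input.
    return [gain * sample for sample in delayed[: len(reference)]]


def cull_reference(first_signal: List[float], reference: List[float],
                   gain: float, delay_samples: int) -> List[float]:
    """Steps S32 and S33: align the adjusted reference and subtract it."""
    adjusted = adjust_reference(reference, gain, delay_samples)
    # Pad or trim so both waveforms cover the same number of samples.
    adjusted = (adjusted + [0.0] * len(first_signal))[: len(first_signal)]
    return [x - r for x, r in zip(first_signal, adjusted)]
```

In practice the acoustic path is rarely a pure gain and delay, so a real system would more likely use an adaptive filter; the fixed-parameter version mirrors the preset attenuation information described in the text.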
  • the step S31 includes:
  • the controlled terminal determines a corresponding noise device identifier according to the received voice data
  • the controlled terminal acquires the attenuation information corresponding to the determined noise device identifier according to the mapping relationship between the preset attenuation information and the noise device identifier;
  • the controlled terminal adjusts the corresponding second audio signal according to the acquired attenuation information.
  • In this embodiment, a plurality of noise devices may exist in the environment in which the controlled device is located. For example, when the controlled device is an air conditioner, an indoor television and a radio both act as noise devices and interfere with voice control of the air conditioner. The controlled terminal therefore needs to save the mapping relationship between the preset attenuation information and the environmental noise pickup device identifier or the noise device identifier.
  • Those skilled in the art will understand that, when a plurality of noise devices are present, the controlled terminal may receive voice data sent by several noise devices at the same time. To distinguish the voice data sent by different noise devices, each noise device may add its noise device identifier to the voice data it sends. The controlled terminal determines the corresponding noise device identifier from the received voice data, acquires the attenuation information corresponding to the determined identifier according to the mapping relationship between the preset attenuation information and the noise device identifier, and adjusts the corresponding second audio signal according to the acquired attenuation information. This ensures that the attenuation adjustment of the second audio signal is accurate, which in turn improves the accuracy of voice control.
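The per-device lookup described above might look like the sketch below. The `VoiceData` and `AttenuationInfo` types, the field names, and the table entries are hypothetical; the source only states that the voice data carries a noise device identifier and that attenuation information is stored per identifier. To keep the example short, only the amplitude part of the attenuation information is applied.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AttenuationInfo:
    gain: float      # attenuation amplitude, assumed to be a linear factor
    delay_s: float   # delay duration in seconds


@dataclass(frozen=True)
class VoiceData:
    device_id: str               # noise device identifier added by the sender
    play_time: float             # playing time point carried in the voice data
    samples: Tuple[float, ...]   # decoded second audio signal


# Hypothetical preset mapping: noise device identifier -> attenuation information.
ATTENUATION_TABLE: Dict[str, AttenuationInfo] = {
    "tv": AttenuationInfo(gain=0.5, delay_s=0.010),
    "radio": AttenuationInfo(gain=0.25, delay_s=0.004),
}


def attenuation_for(voice_data: VoiceData,
                    table: Dict[str, AttenuationInfo] = ATTENUATION_TABLE
                    ) -> Optional[AttenuationInfo]:
    """Look up the preset attenuation information for the sending device."""
    return table.get(voice_data.device_id)


def scale_by_device(packets: List[VoiceData],
                    table: Dict[str, AttenuationInfo] = ATTENUATION_TABLE
                    ) -> Dict[str, List[float]]:
    """Apply each device's own gain to its second audio signal."""
    adjusted: Dict[str, List[float]] = {}
    for packet in packets:
        info = attenuation_for(packet, table)
        if info is None:
            continue  # unknown device: no preset attenuation information
        adjusted[packet.device_id] = [info.gain * s for s in packet.samples]
    return adjusted
```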
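  • Before step S10, the method includes:
  • when detecting an audio play command sent by the noise device, the controlled terminal determines the playing time and intensity information of the third audio signal to be played based on the received audio play command;
  • when receiving the third audio signal played by the noise device, the controlled terminal acquires the intensity information of the received third audio signal, the receiving time of the third audio signal, and the identifier of the noise device or of the environmental noise pickup device that received the third audio signal;
  • corresponding attenuation information is generated based on the intensity information and receiving time of the received third audio signal and on the determined playing time and intensity information of the third audio signal to be played;
  • the generated attenuation information is saved in association with the identifier of the noise device or the identifier of the environmental noise pickup device.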
  • In this embodiment, the attenuation information of the third audio signal may be determined when the controlled terminal receives only the third audio signal played by the noise device.
  • The noise device may send the playing time and intensity information of the third audio signal to the controlled terminal before playing it, so that the controlled terminal can determine the playing time and intensity information of the third audio signal it receives. The playing time may be a time point, such as "play at 8:00", or a time interval, such as "play in 5 minutes". When the received playing time is a time interval, the controlled terminal determines the playing time of the third audio signal from the time point at which the interval was received and the interval itself.
  • Those skilled in the art will understand that the noise device may also send the playing time and intensity information to the controlled terminal after playing the third audio signal. The generating module generates the corresponding attenuation information based on the intensity information and receiving time of the received third audio signal and on the determined playing time and intensity information of the third audio signal, and the storage module saves the generated attenuation information in association with the identifier of the noise device or the identifier of the environmental noise pickup device.
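The calibration described above can be sketched as follows. The source does not give formulas, so this assumes the attenuation amplitude is the ratio of received to played intensity and the delay duration is the difference between receiving time and playing time. The tuple format for the announced playing time and all function names are hypothetical.

```python
from typing import Dict, Tuple


def resolve_play_time(announced: Tuple[str, float], received_at: float) -> float:
    """Turn the announced playing time into an absolute time in seconds.

    announced is ("at", t) for a time point or ("after", dt) for an interval
    counted from the moment the announcement was received.
    """
    kind, value = announced
    if kind == "at":
        return value
    if kind == "after":
        return received_at + value
    raise ValueError(f"unknown playing time kind: {kind!r}")


def generate_attenuation(played_intensity: float, play_time: float,
                         received_intensity: float, receive_time: float) -> Dict[str, float]:
    """Derive attenuation information from one calibration playback."""
    if played_intensity <= 0:
        raise ValueError("played_intensity must be positive")
    return {
        "gain": received_intensity / played_intensity,  # assumed definition
        "delay_s": receive_time - play_time,            # assumed definition
    }


# Attenuation information saved in association with the device identifier.
attenuation_store: Dict[str, Dict[str, float]] = {}


def save_attenuation(device_id: str, info: Dict[str, float]) -> None:
    attenuation_store[device_id] = info
```

For example, an announcement of "play in 300 seconds" received at t = 1000 s resolves to a playing time of 1300 s; if the signal then arrives at 1300.5 s with a quarter of the announced intensity, the stored entry is a gain of 0.25 and a delay of 0.5 s.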
  • Further, to improve voice control efficiency, after step S10 the method includes the step of:
  • the controlled terminal responds to the first audio signal when none of the playing time points corresponding to the received voice data matches the current time point.
  • Those skilled in the art will understand that, when the first audio signal is detected and none of the playing time points corresponding to the received voice data matches the current time point, the detected first audio signal does not include a second audio signal played by the noise device. To improve voice control efficiency, the voiceprint feature of the first audio signal may be extracted and compared with a preset voiceprint feature, and the received first audio signal is responded to when the extracted voiceprint feature matches the preset voiceprint feature.
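The voiceprint check can be sketched as below. The source does not say how voiceprint features are extracted or compared, so this assumes they are fixed-length numeric vectors compared by cosine similarity against one or more enrolled vectors; the similarity measure, the threshold value, and the function names are all assumptions.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_respond(extracted: Sequence[float],
                   enrolled: List[Sequence[float]],
                   threshold: float = 0.9) -> bool:
    """Respond only if the extracted voiceprint matches a preset voiceprint."""
    return any(cosine_similarity(extracted, ref) >= threshold for ref in enrolled)
```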
  • FIG. 4 is a schematic flowchart diagram of a second embodiment of a voice control method according to the present invention.
  • the invention provides a voice control method, comprising:
  • Step S50: when detecting the first audio signal, the controlled terminal sends a voice data acquisition request to the noise device, so that the noise device, on receiving the request, feeds back to the controlled terminal the voice data whose playing time point matches the current time point.
  • In this embodiment, the noise device may, before or while playing the second audio signal, save the second audio signal to be played or currently being played in association with its playing time point. On receiving the voice data acquisition request sent by the controlled terminal, the noise device obtains the time point at which the request was received and compares it with the pre-stored associations between second audio signals and playing time points. If one of the stored playing time points matches the receiving time point of the request, the noise device may encode the second audio signal corresponding to that playing time point into voice data and send the generated voice data to the controlled terminal.
  • When encoding the second audio signal, the noise device may encode it according to a preset communication protocol to generate the corresponding voice data, add the playing time to the voice data during encoding, and send the encoded voice data to the controlled device. For example, when the noise device and the controlled terminal communicate over WIFI, the corresponding communication protocol is the WIFI communication protocol, and the picked-up second audio signal is encoded using the WIFI communication protocol.
  • Because there is a certain distance between the controlled terminal and the noise device, the time point at which the controlled terminal detects the second audio signal played by the noise device differs by a certain amount from the time point at which the noise device plays it. Likewise, the time point at which the noise device receives the voice data acquisition request sent by the controlled device is offset by a certain amount. The receiving time point of the data acquisition request is therefore considered to match a playing time point when the difference between the two is less than or equal to a preset threshold.
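On the noise device side, the lookup described above amounts to a tolerance comparison between the request's receiving time and the stored playing time points. A minimal sketch follows, assuming times are seconds on a shared clock and that the store is a list of `(play_time, audio)` pairs; both the storage layout and the function name are hypothetical.

```python
from typing import List, Tuple, TypeVar

Audio = TypeVar("Audio")


def find_matching_audio(store: List[Tuple[float, Audio]],
                        request_received_at: float,
                        threshold_s: float) -> List[Audio]:
    """Return the stored audio whose playing time point matches the request.

    A playing time point matches when its difference from the receiving time
    of the request is less than or equal to the preset threshold.
    """
    return [audio for play_time, audio in store
            if abs(request_received_at - play_time) <= threshold_s]
```

An empty result corresponds to the case in which the noise device has nothing to feed back.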
  • Step S60 when receiving the voice data fed back by the noise device, the controlled terminal converts the voice data into a second audio signal
  • Those skilled in the art will understand that a plurality of receiving modules may be provided in the controlled terminal, such as wireless detection modules (for example a WIFI module or an infrared module) or wired interfaces (for example an RS425 interface or a serial interface), to receive the voice data sent by the environmental noise pickup device. On receiving voice data, the controlled terminal may determine the interface or module that received it and decode the received voice data using the communication protocol corresponding to that interface or module, so as to convert the received voice data into the second audio signal.
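Selecting a decoder by receiving interface can be sketched with a dispatch table. The frame layouts below are invented for illustration only (an 8-byte playing time followed by 16-bit samples, big-endian on one interface and little-endian on the other); the source does not define any frame format, and the interface names are hypothetical.

```python
import struct
from typing import Callable, Dict, List, Tuple

Decoded = Tuple[float, List[int]]  # (playing time point, PCM samples)


def _decode(payload: bytes, byte_order: str) -> Decoded:
    """Decode a hypothetical frame: float64 playing time, then int16 samples."""
    if len(payload) < 8:
        raise ValueError("payload too short for the playing time header")
    (play_time,) = struct.unpack_from(byte_order + "d", payload, 0)
    sample_count = (len(payload) - 8) // 2
    samples = list(struct.unpack_from(f"{byte_order}{sample_count}h", payload, 8))
    return play_time, samples


def decode_wifi(payload: bytes) -> Decoded:
    return _decode(payload, ">")   # assumed big-endian framing


def decode_serial(payload: bytes) -> Decoded:
    return _decode(payload, "<")   # assumed little-endian framing


# One decoder per receiving interface or module.
DECODERS: Dict[str, Callable[[bytes], Decoded]] = {
    "wifi": decode_wifi,
    "serial": decode_serial,
}


def to_second_audio_signal(interface: str, payload: bytes) -> Decoded:
    """Decode voice data with the protocol of the interface that received it."""
    try:
        decoder = DECODERS[interface]
    except KeyError:
        raise ValueError(f"no decoder registered for interface {interface!r}") from None
    return decoder(payload)
```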
  • Step S70 the controlled terminal culls a portion of the first audio signal that matches the second audio signal to generate a voice control instruction.
  • the first audio signal received by the voice pickup device in the controlled terminal includes a voice control command sent by the user and an environmental noise (such as a second audio signal).
  • The controlled terminal may cull the portion of the first audio signal that matches the second audio signal by means of waveform comparison, for example by comparing the waveform trends of the first audio signal and the converted second audio signal and then adjusting the waveform of the first audio signal according to the amplitude of the waveform corresponding to the voice control signal.
  • Step S80 the controlled terminal responds to the generated voice control instruction.
  • In this embodiment, when responding to the generated first audio signal, the controlled terminal may compare it with the first audio signals in the pre-stored mapping relationships between first audio signals and control codes, determine the mapping relationship whose first audio signal matches the generated one, determine from that mapping relationship the control code corresponding to the generated first audio signal, and execute the control code.
  • In the voice control method of this embodiment, when the first audio signal is detected, the controlled terminal sends a voice data acquisition request to the noise device so that the noise device, on receiving the request, sends the controlled terminal the voice data whose playing time point matches the current time point. On receiving the voice data sent by the noise device, the controlled terminal converts it into a second audio signal, the processing module culls the portion of the first audio signal that matches the second audio signal to generate a voice control instruction, and the controlled terminal responds to the generated instruction. Culling the second audio signal produced by the noise device from the received first audio signal improves the accuracy of voice control.
  • the step S70 includes:
  • Step S71 the controlled terminal adjusts the second audio signal according to preset attenuation information
  • Step S72 the controlled terminal compares the adjusted second audio signal with the first audio signal
  • Step S73 the controlled terminal culls a portion of the first audio signal that matches the second audio signal, and generates the voice control instruction.
  • The second audio signal is attenuated on its way from the noise device to the controlled terminal. The attenuation information includes the attenuation amplitude and the delay duration of the second audio signal. Because the position of the noise device does not change, both values are constant, so they can be preset. The waveform of the second audio signal is adjusted according to the preset attenuation amplitude and delay duration, and the adjusted waveform is compared with the waveform of the received first audio signal.
  • step S71 includes:
  • the controlled terminal determines a corresponding noise device identifier according to the received voice data
  • the controlled terminal acquires the attenuation information corresponding to the determined noise device identifier according to the mapping relationship between the preset attenuation information and the noise device identifier;
  • the controlled terminal adjusts the corresponding second audio signal according to the acquired attenuation information.
  • In this embodiment, a plurality of noise devices may exist in the environment in which the controlled device is located. For example, when the controlled device is an air conditioner, an indoor television and a radio both act as noise devices and interfere with voice control of the air conditioner. The controlled terminal therefore needs to save the mapping relationship between the preset attenuation information and the environmental noise pickup device identifier or the noise device identifier.
  • Those skilled in the art will understand that, when a plurality of noise devices are present, the controlled terminal may receive voice data sent by several noise devices at the same time. To distinguish the voice data sent by different noise devices, each noise device may add its noise device identifier to the voice data it sends. The controlled terminal determines the corresponding noise device identifier from the received voice data, acquires the attenuation information corresponding to the determined identifier according to the mapping relationship between the preset attenuation information and the noise device identifier, and adjusts the corresponding second audio signal according to the acquired attenuation information. This ensures that the attenuation adjustment of the second audio signal is accurate, which in turn improves the accuracy of voice control.
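  • Before step S50, the method further includes:
  • when detecting an audio play command sent by the noise device, the controlled terminal determines the playing time and intensity information of the third audio signal to be played based on the received audio play command;
  • when receiving the third audio signal played by the noise device, the controlled terminal acquires the intensity information of the received third audio signal, the receiving time of the third audio signal, and the identifier of the noise device or of the environmental noise pickup device that received the third audio signal;
  • corresponding attenuation information is generated based on the intensity information and receiving time of the received third audio signal and on the determined playing time and intensity information of the third audio signal to be played;
  • the generated attenuation information is saved in association with the identifier of the noise device or the identifier of the environmental noise pickup device.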
  • In this embodiment, the attenuation information of the third audio signal may be determined when the controlled terminal receives only the third audio signal played by the noise device.
  • The noise device may send the playing time and intensity information of the third audio signal to the controlled terminal before playing it, so that the controlled terminal can determine the playing time and intensity information of the third audio signal it receives. The playing time may be a time point, such as "play at 8:00", or a time interval, such as "play in 5 minutes". When the received playing time is a time interval, the controlled terminal determines the playing time of the third audio signal from the time point at which the interval was received and the interval itself.
  • Those skilled in the art will understand that the noise device may also send the playing time and intensity information to the controlled terminal after playing the third audio signal. The controlled terminal generates the corresponding attenuation information based on the intensity information and receiving time of the received third audio signal and on the determined playing time and intensity information of the third audio signal, and saves the generated attenuation information in association with the identifier of the noise device.
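  • Further, to improve voice control efficiency, after step S50 the method further includes:
  • Step S60: responding to the first audio signal when no voice data fed back by the noise device is received.
  • Those skilled in the art will understand that, when no voice data fed back by the noise device is received, the detected first audio signal does not include a second audio signal played by the noise device. To improve voice control efficiency, the voiceprint feature of the first audio signal may be extracted and compared with a preset voiceprint feature, and the received first audio signal is responded to when the extracted voiceprint feature matches the preset voiceprint feature.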
  • The technical solution of the present invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc), which includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to perform the methods described in the various embodiments of the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Selective Calling Equipment (AREA)

Abstract

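A voice control method and system. The method comprises: a controlled terminal detects, in real time or at timed intervals, voice data sent by a noise device, and acquires the playing time point of the detected voice data (S10); when a first audio signal is detected, the controlled terminal determines the voice data whose playing time point matches the current time point, and converts the determined voice data into a second audio signal (S20); the controlled terminal culls the portion of the first audio signal that matches the second audio signal to generate a voice control instruction (S30); and the controlled terminal responds to the generated voice control instruction (S40). Culling the second audio signal produced by the noise device from the received first audio signal improves the accuracy of voice control.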

Description

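Voice control method and system
Technical Field
The present invention relates to the field of voice control technologies, and in particular to a voice control method and system.
Background
With the development of speech recognition technology, more and more devices are controlled by voice. At present, the controlled device mainly uses a built-in voice pickup device, which picks up and recognizes the voice control instruction issued by the user. During recognition, the control code corresponding to the received voice control instruction is determined according to a preset mapping relationship between voice control instructions and control codes, and voice control of the controlled terminal is implemented in response to that control code.
In the prior art, sound-producing devices (such as a television or a radio) may exist in the environment of the controlled terminal. When the user issues a voice control instruction, the instruction received by the controlled terminal may then include noise data played by those devices, which causes errors in recognizing the voice control instruction and results in low voice control accuracy.
Summary of the Invention
The main object of the present invention is to provide a voice control method and system that aim to improve the accuracy of voice control.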
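The present invention provides a voice control method, comprising:
a controlled terminal detects, in real time or at timed intervals, voice data sent by a noise device, and acquires the playing time point of the detected voice data;
when a first audio signal is detected, the controlled terminal determines the voice data whose playing time point matches the current time point, and converts the determined voice data into a second audio signal;
the controlled terminal culls the portion of the first audio signal that matches the second audio signal to generate a voice control instruction;
the controlled terminal responds to the generated voice control instruction.
Preferably, the step in which the controlled terminal culls the portion of the first audio signal that matches the second audio signal to generate a voice control instruction comprises:
the controlled terminal adjusts the second audio signal according to preset attenuation information;
the controlled terminal compares the adjusted second audio signal with the first audio signal;
the controlled terminal culls the portion of the first audio signal that matches the adjusted second audio signal, and generates the voice control instruction.
Preferably, the step in which the controlled terminal adjusts the second audio signal according to preset attenuation information comprises:
the controlled terminal determines the corresponding noise device identifier according to the received voice data;
the controlled terminal acquires the attenuation information corresponding to the determined noise device identifier according to the mapping relationship between the preset attenuation information and the noise device identifier;
the controlled terminal adjusts the corresponding second audio signal according to the acquired attenuation information.
Preferably, before the step in which the controlled terminal detects, in real time or at timed intervals, voice data sent by the noise device and acquires the playing time of the detected voice data, the method further comprises:
when detecting an audio play command sent by the noise device, the controlled terminal determines the playing time and intensity information of the third audio signal to be played based on the received audio play command;
when receiving the third audio signal played by the noise device, acquiring the intensity information of the received third audio signal, the receiving time of the third audio signal, and the identifier of the noise device or of the environmental noise pickup device that received the third audio signal;
generating corresponding attenuation information based on the intensity information and receiving time of the received third audio signal and on the determined playing time and intensity information of the third audio signal to be played;
saving the generated attenuation information in association with the identifier of the noise device or the identifier of the environmental noise pickup device.
Preferably, after the step in which the controlled terminal detects, in real time or at timed intervals, voice data sent by the noise device and acquires the playing time of the detected voice data, the method comprises:
when the first audio signal is detected and none of the playing time points corresponding to the received voice data matches the current time point, the controlled terminal responds to the first audio signal.
Preferably, the step in which the controlled terminal responds to the first audio signal when the first audio signal is detected and none of the playing time points corresponding to the received voice data matches the current time point comprises:
when the first audio signal is detected and none of the playing time points corresponding to the received voice data matches the current time point, extracting the voiceprint feature of the first audio signal and comparing the extracted voiceprint feature with a preset voiceprint feature;
when the extracted voiceprint feature matches the preset voiceprint feature, responding to the received first audio signal.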
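In addition, to achieve the above object, the present invention further provides a voice control method, comprising:
when a first audio signal is detected, a controlled terminal sends a voice data acquisition request to a noise device, so that the noise device, on receiving the request, feeds back to the controlled terminal the voice data whose playing time point matches the current time point;
when receiving the voice data fed back by the noise device, the controlled terminal converts the voice data into a second audio signal;
the controlled terminal culls the portion of the first audio signal that matches the second audio signal to generate a voice control instruction;
the controlled terminal responds to the generated voice control instruction.
Preferably, the step in which the controlled terminal culls the portion of the first audio signal that matches the second audio signal to generate a voice control instruction comprises:
the controlled terminal adjusts the second audio signal according to preset attenuation information;
the controlled terminal compares the adjusted second audio signal with the first audio signal;
the controlled terminal culls the portion of the first audio signal that matches the second audio signal, and generates the voice control instruction.
Preferably, the step in which the controlled terminal adjusts the second audio signal according to preset attenuation information comprises:
the controlled terminal determines the corresponding noise device identifier according to the received voice data;
the controlled terminal acquires the attenuation information corresponding to the determined noise device identifier according to the mapping relationship between the preset attenuation information and the noise device identifier;
the controlled terminal adjusts the corresponding second audio signal according to the acquired attenuation information.
Preferably, before the step in which, when the first audio signal is detected, the controlled terminal sends a voice data acquisition request to the noise device so that the noise device feeds back the voice data whose playing time point matches the current time point, the method comprises:
when detecting an audio play command sent by the noise device, the controlled terminal determines the playing time and intensity information of the third audio signal to be played based on the received audio play command;
when receiving the third audio signal played by the noise device, acquiring the intensity information of the received third audio signal, the receiving time of the third audio signal, and the identifier of the noise device or of the environmental noise pickup device that received the third audio signal of the noise device;
generating corresponding attenuation information based on the intensity information and receiving time of the received third audio signal and on the determined playing time and intensity information of the third audio signal to be played;
saving the generated attenuation information in association with the identifier of the noise device or the identifier of the environmental noise pickup device.
Preferably, after the step in which, when the first audio signal is detected, the controlled terminal sends a voice data acquisition request to the noise device so that the noise device feeds back the voice data whose playing time point matches the current time point, the method comprises:
when no voice data fed back by the noise device is received, responding to the first audio signal.
Preferably, the step of responding to the first audio signal when no voice data fed back by the noise device is received comprises:
when no voice data fed back by the noise device is received, extracting the voiceprint feature of the first audio signal and comparing the extracted voiceprint feature with a preset voiceprint feature;
when the extracted voiceprint feature matches the preset voiceprint feature, responding to the received first audio signal.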
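In addition, to achieve the above object, the present invention further provides a voice control system, comprising:
a detection module, configured to detect, in real time or at timed intervals, voice data sent by a noise device;
an acquisition module, configured to acquire the playing time point of the detected voice data;
a determination module, configured to determine, when a first audio signal is detected, the voice data whose playing time point matches the current time point;
a conversion module, configured to convert the determined voice data into a second audio signal;
a processing module, configured to cull the portion of the first audio signal that matches the adjusted second audio signal to generate a voice control instruction;
a response module, configured to respond to the generated voice control instruction.
Preferably, the processing module comprises:
an adjustment unit, configured to adjust the second audio signal according to preset attenuation information;
a comparison unit, configured to compare the adjusted second audio signal with the first audio signal;
a processing unit, configured to cull the portion of the first audio signal that matches the second audio signal and generate the voice control instruction.
Preferably, the adjustment unit comprises:
a determination subunit, configured to determine the corresponding noise device identifier according to the received voice data;
an acquisition subunit, configured to acquire the attenuation information corresponding to the determined noise device identifier according to the mapping relationship between the preset attenuation information and the noise device identifier;
an adjustment subunit, configured to adjust the corresponding second audio signal according to the acquired attenuation information.
Preferably, the determination module is further configured to determine, when an audio play command sent by the noise device is detected, the playing time and intensity information of the third audio signal to be played based on the received audio play command; the acquisition module is further configured to acquire, when the third audio signal played by the noise device is received, the intensity information of the received third audio signal, the receiving time of the third audio signal, and the identifier of the noise device or of the environmental noise pickup device that received the third audio signal of the noise device; the system further comprises a generation module and a storage module, the generation module being configured to generate corresponding attenuation information based on the intensity information and receiving time of the received third audio signal and on the determined playing time and intensity information of the third audio signal to be played; and the storage module being configured to save the generated attenuation information in association with the identifier of the noise device or the identifier of the environmental noise pickup device.
Preferably, the response module is further configured to respond to the first audio signal when the first audio signal is detected and none of the playing time points corresponding to the received voice data matches the current time point.
Preferably, the response module is further configured to extract the voiceprint feature of the first audio signal, compare the extracted voiceprint feature with a preset voiceprint feature, and respond to the received first audio signal when the extracted voiceprint feature matches the preset voiceprint feature.
In the voice control method and system provided by the present invention, the noise device sends voice data to the controlled terminal in real time or at timed intervals and adds a playing time point to the voice data, so that when the controlled terminal detects the first audio signal it determines the voice data whose playing time point matches the current time point and converts the determined voice data into a second audio signal. The controlled terminal culls the portion of the first audio signal that matches the second audio signal to generate a new voice control instruction and responds to the generated instruction. Culling the second audio signal produced by the noise device from the received first audio signal improves the accuracy of voice control.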
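Brief Description of the Drawings
FIG. 1 is a schematic diagram of the hardware structure of a preferred embodiment of a controlled terminal implementing voice control according to the present invention;
FIG. 2 is a schematic diagram of the functional modules of a preferred embodiment of the voice control system in FIG. 1;
FIG. 3 is a schematic flowchart of a first embodiment of the voice control method according to the present invention;
FIG. 4 is a schematic flowchart of a second embodiment of the voice control method according to the present invention.
The implementation, functional features, and advantages of the present invention will be further described with reference to the embodiments and the accompanying drawings.
Detailed Description
The technical solutions of the present invention are further described below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not intended to limit it.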
参照图1,图1为本发明实现语音控制的被控终端的较佳实施例的硬件结构示意图。
该被控终端1包括处理单元11、存储单元12、接发单元13、语音拾取装置14及语音控制系统15。该被控终端1可为空调器以及电视机等可实现语音控制的终端。
语音拾取装置14,用于在接收到声波的震动时,将震动产生的电信号转换为音频信号。
存储单元12,用于存储该语音控制系统15及其运行数据,以及语音控制指令以及控制代码之间的映射关系。需要强调的是,该存储单元12既可以是一个单独的存储装置,也可以是多个不同存储装置的统称,在此不作赘述。
接发单元13,用于在处理单元11的控制下,接收噪音设备发送的音频数据,该接发单元13可以为WIFI模块、红外信号发送单元、蓝牙模块、带发射天线的无线信号发射器或者其他任意适用的无线信号接发单元13(本实施例优选WIFI模块)。
该处理单元11,用于调用并执行该语音控制系统15,控制接发单元13实时或定时侦测噪音设备发送的语音数据,并在接发单元13侦测到语音数据时,并获取侦测到的语音数据的播放时间点,同时在接发单元13侦测到第一音频信号时,确定与当前时间点匹配的播放时间点所对应的语音数据,并将确定的语音数据转换为第二音频信号,并剔除所述第一音频信号中与所述第二音频信号匹配的部分,以生成新的语音控制指令,调用存储单元12中存储的语音控制指令与控制代码之间的映射关系,确定生成的语音控制指令所对应的控制代码,执行该控制代码。该处理单元11与存储单元12既可以分别是单独的单元,也可以集成在一起,构成一个控制器,在此不作赘述。
参照图2,图2为图1中语音控制系统较佳实施例的功能模块示意图。
需要强调的是,对本领域的技术人员来说,图2所示功能模块图仅仅是一个较佳实施例的示例图,本领域的技术人员围绕图2所示的语音控制系统15的功能模块,可轻易进行新的功能模块的补充;各功能模块的名称是自定义名称,仅用于辅助理解该语音控制系统15的各个程序功能块,不用于限定本发明的技术方案,本发明技术方案的核心是,各自定义名称的功能模块所要达成的功能。
本实施例提出一种语音控制系统15,包括:
侦测模块151,用于实时或定时侦测噪音设备发送的语音数据;
在本实施例中,噪音设备可在播放第二音频信号之前或者播放第二音频信号时,对待播放的第二音频信号或者当前播放的第二音频信号按照预设的通信协议进行编码生成对应的语音数据,编码时在语音数据中添加播放时间,并将编码生成的语音数据发送至被控设备。例如,噪音设备与被控终端之间的通信方式为WIFI通信时,所对应的通信协议为WIFI通信协议,则采用WIFI通信协议对拾取到的第二音频信号进行编码。
获取模块152,用于获取侦测到的语音数据的播放时间点;
同理,被控设备在接收到的噪音设备发送的语音数据时,可直接按照预设的通信协议对将语音数据解码以获取对应的播放时间点,也可由语音数据的报文头中获取对应的播放时间点。
确定模块153,用于在侦测到第一音频信号时,确定与当前时间点匹配的播放时间点所对应的语音数据;
在本实施例中,由于被控终端与噪音设备之间有一定的距离,故侦测模块151侦测到的噪音设备播放的第二音频信号的时间点,与噪音设备播放的第二音频信号的时间点有一定的时间差,故,当前时间点与播放时间点匹配是指当前时间点与播放时间点之间的差值小于等于预设的阀值。
转换模块154,用于将确定的语音数据转换为第二音频信号;
本领域技术人员可以理解的是,可在被控终端中设置多种侦测模块151如WIFI模块以及红外模块等无线侦测模块,或者RS425接口以及串行接口等有线接口接收环境噪音拾取装置发送的语音数据,转换模块154可在侦测模块151接收到语音数据时,确定接收到语音数据的接口或者模块,采用该确定的接口或者模块所对应的通信协议对接收到的语音数据进行解码,以将接收到的语音数据转换为第二音频信号。
处理模块155,用于剔除所述第一音频信号中与所述第二音频信号匹配的部分,以生成语音控制指令;
被控终端中的语音拾取装置接收到的第一音频信号包括用户发送的语音控制指令以及环境噪音(如第二音频信号)。在本实施例中,处理模块155剔除所述第一音频信号中与所述第二音频信号匹配的部分时可通过波形比对的方式实现,如比对第一音频信号以及转换后的第二音频信号的波形走向,根据语音控制信号所对应的波形的幅度对第一音频信号所对应的波形进行调节。
响应模块156,用于响应该生成的语音控制指令。
在本实施例中,响应模块156在响应该生成的语音控制指令时,可将预存的语音控制指令与控制代码之间的映射关系中的语音控制指令与该生成的语音控制指令进行比对,确定与该生成的语音控制指令匹配的语音控制指令与控制代码之间的映射关系,根据该匹配的语音控制指令与控制代码之间的映射关系,确定该生成的语音控制指令所对应的控制代码,执行该控制代码。在生成的语音控制指令与预存的语音控制指令进行比对时,若关键音匹配或者匹配的关键音的数量大于预设的阀值,则认为生成的语音控制指令与预存的语音控制指令匹配。
本实施例提出的语音控制系统,该系统通过噪音设备实时或定时向被控终端发送语音数据,并在语音数据中添加播放时间点,使得侦测模块在侦测到第一音频信号时,确定模块确定与当前时间点匹配的播放时间点所对应的语音数据,转换模块将确定的语音数据转换为第二音频信号,处理模块剔除所述第一音频信号中与所述第二音频信号匹配的部分,以生成语音控制指令,响应模块响应该生成的语音控制指令,通过将接收到第一音频信号中的噪音设备产生的第二音频信号剔除,提高语音控制的准确性。
进一步地,为提高语音控制的准确性,所述处理模块155包括:
调节单元1551,用于根据预设的衰减信息调节所述第二音频信号;
比对单元1552,用于将调节后的第二音频信号与所述第一音频信号进行比对;
处理单元1553,用于剔除所述第一音频信号中与所述第二音频信号匹配的部分,并生成语音控制指令。
由于第二音频信号由噪音设备发送至被控终端的过程中会出现衰减,该衰减信息包括第二音频信号的衰减幅度以及延时时长,由于噪音设备所处的位置不变,故衰减幅度以及延时时长不变,故预设衰减幅度以及延时时长,并根据预设的衰减幅度以及延时时长对第二音频信号的波形进行调整,并将调整后的波形与接收到的第一音频信号的波形进行比对。
进一步地,为提高语音控制的准确性,所述调节单元1551包括:
确定子单元,用于根据接收到的语音数据确定对应的噪音设备标识;
获取子单元,用于根据预设的衰减信息与噪音设备标识之间的映射关系,获取确定者噪音设备标识所对应的衰减信息;
调节子单元,用于根据获取到的衰减信息调节对应的所述第二音频信号。
在本实施例中,被控设备所处的环境中可能存在多个噪音设备,例如在被控设备为空调器时,室内的电视机以及收音机等均作为噪音设备会对空调器的语音控制造成干扰,故需要在被控终端中保存预设的衰减信息与环境噪音拾取装置标识或噪音设备标识之间的映射关系。
本领域技术人员可以理解的是,在设置有多个噪音设备时,侦测模块151可能同时接收到多个噪音设备发送的语音数据,故,为识别不同的噪音设备发送的语音数据,噪音设备在发送语音数据时,可在语音数据中添加噪音设备标识,确定子单元根据侦测模块151接收到的语音数据确定对应的噪音设备标识,获取子单元根据预设的衰减信息与噪音设备标识之间的映射关系,获取确定的噪音设备标识所对应的衰减信息,调节子单元根据获取到的衰减信息调节对应的第二音频信号,保证对第二音频信号进行衰减调节的准确性,即提高语音控制的准确性。
进一步地,为提高语音控制的准确性,所述确定模块153还用于在侦测到噪音设备发送的音频播放指令时,所述被控终端基于接收到的音频播放指令确定待播放第三音频信号的播放时间以及强度信息;所述获取模块152还用于在接收到噪音设备播放的第三音频信号时,获取接收到的第三音频信号的强度信息、该第三音频信号的接收时间以及噪音设备的标识或者接收该噪音设备第三音频信号的环境噪音拾取装置的标识;该系统还包括生成模块和存储模块,所述生成模块还用于基于该接收到的第三音频信号的强度信息以及该第三音频信号的接收时间,以及确定的待播放第三音频信号的播放时间以及强度信息,生成对应的衰减信息;所述存储模块还用于将生成的衰减信息与所述噪音设备的标识或者环境噪音拾取装置的标识关联保存
在本实施例中,可在被控终端仅接收到噪音设备播放的第三音频信号时,确定第三音频信号的衰减信息。噪音设备可通过在播放第三音频信号之前向被控终端发送第三音频信号播放时间以及强度信息,以供被控终端确定接收到的第三音频信号的播放时间和强度信息,该播放时间可为一个时间点如8:00播放,也可为一个时间间隔,如5min之后播放,在接收到的播放时间为时间间隔时,被控终端基于接收到该播放时间间隔的时间点以及时间间隔,确定第三音频信号的播放时间。
本领域技术人员可以理解的是,噪音设备也可在播放第三音频信号后,向被控终端发送播放时间以及强度信息,生成模块基于该接收到的第三音频信号的强度信息以及该第三音频信号的接收时间,以及确定的待播放第三音频信号的播放时间以及强度信息,生成对应的衰减信息,存储模块将生成的衰减信息与所述噪音设备的标识或者环境噪音拾取装置的标识关联保存。
进一步地,为提高语音控制效率,所述响应模块156还用于在侦测到第一音频信号,且接收到语音数据所对应的播放时间点均与当前时间点不匹配时,响应所述第一音频信号。
本领域技术人员可以理解的是,在侦测到第一音频信号,且接收到语音数据所对应的播放时间点均与当前时间点不匹配时,说明侦测到的第一音频信号中不包括噪音设备播放的第二音频信号,为提高语音控制效率,可提取第一音频信号的声纹特征,并将提取的声纹特征与预设的声纹特征进行比对,在提取的声纹特征与预设的声纹特征匹配时,响应该接收到的第一音频信号。
参照图3,图3为本发明语音控制方法第一实施例的流程示意图。
本实施例提出一种语音控制方法,包括:
步骤S10,被控终端实时或定时侦测噪音设备发送的语音数据,并获取侦测到的语音数据的播放时间点;
在本实施例中,噪音设备可在播放第二音频信号之前或者播放第二音频信号时,对待播放的第二音频信号或者当前播放的第二音频信号按照预设的通信协议进行编码生成对应的语音数据,编码时在语音数据中添加播放时间,并将编码生成的语音数据发送至被控设备。例如,噪音设备与被控终端之间的通信方式为WIFI通信时,所对应的通信协议为WIFI通信协议,则采用WIFI通信协议对拾取到的第二音频信号进行编码。
步骤S20,在侦测到第一音频信号时,所述被控终端确定与当前时间点匹配的播放时间点所对应的语音数据,并将确定的语音数据转换为第二音频信号;
同理,被控设备在接收到的噪音设备发送的语音数据时,可直接按照预设的通信协议对将语音数据解码以获取对应的播放时间点,也可由语音数据的报文头中获取对应的播放时间点。在本实施例中,由于被控终端与噪音设备之间有一定的距离,被控终端侦测到的噪音设备播放的第二音频信号的时间点,与噪音设备播放的第二音频信号的时间点有一定的时间差,故,当前时间点与播放时间点匹配是指当前时间点与播放时间点之间的差值小于等于预设的阀值。
本领域技术人员可以理解的是,可在被控终端中设置多种接收模块如WIFI模块以及红外模块等无线侦测模块,或者RS425接口以及串行接口等有线接口接收环境噪音拾取装置发送的语音数据,被控终端可在接收到语音数据时,确定接收到语音数据的接口或者模块,采用该确定的接口或者模块所对应的通信协议对接收到的语音数据进行解码,以将接收到的语音数据转换为第二音频信号。
步骤S30,所述被控终端剔除所述第一音频信号中与所述第二音频信号匹配的部分,以生成语音控制指令;
被控终端中的语音拾取装置接收到的第一音频信号包括用户发送的语音控制指令以及环境噪音(如第二音频信号)。在本实施例中,被控终端剔除所述第一音频信号中与所述第二音频信号匹配的部分时可通过波形比对的方式实现,如比对第一音频信号以及转换后的第二音频信号的波形走向,根据语音控制信号所对应的波形的幅度对第一音频信号所对应的波形进行调节。
步骤S40,所述被控终端响应该生成的语音控制指令。
在本实施例中,被控终端在响应该生成的语音控制指令时,可将预存的语音控制指令与控制代码之间的映射关系中的语音控制指令与该生成的语音控制指令进行比对,确定与该生成的语音控制指令匹配的语音控制指令与控制代码之间的映射关系,根据该匹配的语音控制指令与控制代码之间的映射关系,确定该生成的语音控制指令所对应的控制代码,执行该控制代码。在生成的语音控制指令与预存的语音控制指令进行比对时,若关键音匹配或者匹配的关键音的数量大于预设的阀值,则认为生成的语音控制指令与预存的语音控制指令匹配。
本实施例提出的语音控制方法,该系统通过噪音设备实时或定时向被控终端发送语音数据,并在语音数据中添加播放时间点,使得被控终端在侦测到第一音频信号时,确定与当前时间点匹配的播放时间点所对应的语音数据,并将确定的语音数据转换为第二音频信号;被控终端剔除所述第一音频信号中与所述第二音频信号匹配的部分,以生成语音控制指令,并响应模块响应该生成的语音控制指令,通过将接收到第一音频信号中的噪音设备产生的第二音频信号剔除,提高语音控制的准确性。
进一步地,为提高语音控制的准确性,所述步骤S30包括:
步骤S31,所述被控终端根据预设的衰减信息调节所述第二音频信号;
步骤S32,所述被控终端将调节后的第二音频信号与所述第一音频信号进行比对;
步骤S33,所述被控终端剔除所述第一音频信号中与调节后的所述第二音频信号匹配的部分,并生成语音控制指令。
由于第二音频信号由噪音设备发送至被控终端的过程中会出现衰减,该衰减信息包括第二音频信号的衰减幅度以及延时时长,由于噪音设备所处的位置不变,故衰减幅度以及延时时长不变,故预设衰减幅度以及延时时长,并根据预设的衰减幅度以及延时时长对第二音频信号的波形进行调整,并将调整后的波形与接收到的第一音频信号的波形进行比对。
进一步地,为提高语音控制的准确性,所述步骤S31包括:
所述被控终端根据接收到的语音数据确定对应的噪音设备标识;
所述被控终端根据预设的衰减信息与噪音设备标识之间的映射关系,获取确定者噪音设备标识所对应的衰减信息;
所述被控终端根据获取到的衰减信息调节对应的所述第二音频信号。
在本实施例中,被控设备所处的环境中可能存在多个噪音设备,例如在被控设备为空调器时,室内的电视机以及收音机等均作为噪音设备会对空调器的语音控制造成干扰,故需要在被控终端中保存预设的衰减信息与环境噪音拾取装置标识或噪音设备标识之间的映射关系。
本领域技术人员可以理解的是,在设置有多个噪音设备时,被控终端可能同时接收到多个噪音设备发送的语音数据,故,为识别不同的噪音设备发送的语音数据,噪音设备在发送语音数据时,可在语音数据中添加噪音设备标识,被控终端根据接收到的语音数据确定对应的噪音设备标识,并根据预设的衰减信息与噪音设备标识之间的映射关系,获取确定的噪音设备标识所对应的衰减信息,被控终端根据获取到的衰减信息调节对应的第二音频信号,保证对第二音频信号进行衰减调节的准确性,即提高语音控制的准确性。
进一步地,为提高语音控制的准确性,步骤S10之前,该方法包括:
在侦测到噪音设备发送的音频播放指令时,所述被控终端基于接收到的音频播放指令确定待播放第三音频信号的播放时间以及强度信息;
在接收到噪音设备播放的第三音频信号时,获取接收到的第三音频信号的强度信息、该第三音频信号的接收时间以及噪音设备的标识或者接收该噪音设备第三音频信号的环境噪音拾取装置的标识;
基于该接收到的第三音频信号的强度信息以及该第三音频信号的接收时间,以及确定的待播放第三音频信号的播放时间以及强度信息,生成对应的衰减信息;
将生成的衰减信息与所述噪音设备的标识或者环境噪音拾取装置的标识关联保存。
在本实施例中,可在被控终端仅接收到噪音设备播放的第三音频信号时,确定第三音频信号的衰减信息。噪音设备可通过在播放第三音频信号之前向被控终端发送第三音频信号播放时间以及强度信息,以供被控终端确定接收到的第三音频信号的播放时间和强度信息,该播放时间可为一个时间点如8:00播放,也可为一个时间间隔,如5min之后播放,在接收到的播放时间为时间间隔时,被控终端基于接收到该播放时间间隔的时间点以及时间间隔,确定第三音频信号的播放时间。
本领域技术人员可以理解的是,噪音设备也可在播放第三音频信号后,向被控终端发送播放时间以及强度信息,生成模块基于该接收到的第三音频信号的强度信息以及该第三音频信号的接收时间,以及确定的待播放第三音频信号的播放时间以及强度信息,生成对应的衰减信息,存储模块将生成的衰减信息与所述噪音设备的标识或者环境噪音拾取装置的标识关联保存。
进一步地,为提高语音控制效率,步骤S10之后,该方法包括步骤:
在接收到语音数据所对应的播放时间点均与当前时间点不匹配时,所述被控终端响应所述第一音频信号。
本领域技术人员可以理解的是,在侦测到第一音频信号,且接收到语音数据所对应的播放时间点均与当前时间点不匹配时,说明侦测到的第一音频信号中不包括噪音设备播放的第二音频信号,为提高语音控制效率,可提取第一音频信号的声纹特征,并将提取的声纹特征与预设的声纹特征进行比对,在提取的声纹特征与预设的声纹特征匹配时,响应该接收到的第一音频信号。
参照图4,图4为本发明语音控制方法第二实施例的流程示意图。
本发明提出一种语音控制方法,包括:
步骤S50,在侦测到第一音频信号时,被控终端向噪音设备发送语音数据获取请求,以供噪音设备在接收语音数据时,将播放时间点与当前时间点匹配的语音数据反馈给被控终端;
在本实施例中,噪音设备可在播放第二音频信号之前或者播放第二音频信号时,将待播放的第二音频信号或者当前播放的第二音频信号与播放时间点关联保存,噪音设备在接收到被控终端发送的语音数据获取请求时,获取接收到该语音数据获取请求的接收时间点,并将预存的待播放的第二音频信号或者当前播放的第二音频信号与播放时间点之间的关联关系,与数据获取请求的接收时间点进行比对,在预存的待播放的第二音频信号或者当前播放的第二音频信号与播放时间点之间的关联关系中,有播放时间点与数据获取请求的接收时间点匹配时,可将该匹配的播放时间点所对应的第二音频信号编码为语音数据,并将该生成的语音数据发送给被控终端。噪音设备对第二音频信号进行编码时,可按照预设的通信协议进行编码生成对应的语音数据,编码时在语音数据中添加播放时间,并将编码生成的语音数据发送至被控设备。例如,噪音设备与被控终端之间的通信方式为WIFI通信时,所对应的通信协议为WIFI通信协议,则采用WIFI通信协议对拾取到的第二音频信号进行编码。
由于被控终端与噪音设备之间有一定的距离,故被控终端侦测到的噪音设备播放的第二音频信号的时间点,与噪音设备播放的第二音频信号的时间点有一定的时间差。即噪音设备接收到被控设备发送的语音数据获取请求的时间点有一定的时间差,则数据获取请求的接收时间点与播放时间点匹配是指,数据获取请求的接收时间点与播放时间点之间的差值小于等于预设的阀值。
步骤S60,在接收到噪音设备反馈的语音数据时,所述被控终端将所述语音数据转换为第二音频信号;
本领域技术人员可以理解的是,可在被控终端中设置多种接收模块,如WIFI模块以及红外模块等无线侦测模块,或者RS425接口以及串行接口等有线接口接收环境噪音拾取装置发送的语音数据,被控终端可在接收到语音数据时,确定接收到语音数据的接口或者模块,采用该确定的接口或者模块所对应的通信协议对接收到的语音数据进行解码,以将接收到的语音数据转换为第二音频信号。
步骤S70,所述被控终端剔除所述第一音频信号中与所述第二音频信号匹配的部分,以生成语音控制指令;
被控终端中的语音拾取装置接收到的第一音频信号包括用户发送的语音控制指令以及环境噪音(如第二音频信号)。在本实施例中,被控终端剔除所述第一音频信号中与所述第二音频信号匹配的部分时可通过波形比对的方式实现,如比对第一音频信号以及转换后的第二音频信号的波形走向,根据语音控制信号所对应的波形的幅度对第一音频信号所对应的波形进行调节。
步骤S80,所述被控终端响应该生成的语音控制指令。
在本实施例中,被控终端在响应该生成的第一音频信号时,可将预存的第一音频信号与控制代码之间的映射关系中的第一音频信号与该生成的第一音频信号进行比对,确定与该生成的第一音频信号匹配的第一音频信号与控制代码之间的映射关系,根据该匹配的第一音频信号与控制代码之间的映射关系,确定该生成的第一音频信号所对应的控制代码,执行该控制代码。在生成的第一音频信号与预存的第一音频信号进行比对时,若关键音匹配或者匹配的关键音的数量大于预设的阀值,则认为生成的第一音频信号与预存的第一音频信号匹配。
本实施例提出的语音控制方法,该方法中在侦测到第一音频信号时,被控终端向噪音设备发送语音数据获取请求,以供噪音设备在接收语音数据时,将播放时间点与当前时间点匹配的语音数据发送给被控终端,在接收到噪音设备发送的语音数据时,被控终端将所述语音数据转换为第二音频信号,处理模块剔除所述第一音频信号中与所述第二音频信号匹配的部分,以生成语音控制指令,被控终端响应该生成的语音控制指令,通过将接收到第一音频信号中的噪音设备产生的第二音频信号剔除,提高语音控制的准确性。
进一步地,为提高语音控制的准确性,所述步骤S70包括:
步骤S71,所述被控终端根据预设的衰减信息调节所述第二音频信号;
步骤S72,所述被控终端将调节后的第二音频信号与所述第一音频信号进行比对;
步骤S73,所述被控终端剔除所述第一音频信号中与所述第二音频信号匹配的部分,并生成所述语音控制指令。
由于第二音频信号由噪音设备发送至被控终端的过程中会出现衰减,该衰减信息包括第二音频信号的衰减幅度以及延时时长,由于噪音设备所处的位置不变,故衰减幅度以及延时时长不变,故预设衰减幅度以及延时时长,并根据预设的衰减幅度以及延时时长对第二音频信号的波形进行调整,并将调整后的波形与接收到的第一音频信号的波形进行比对。
进一步地,为提高语音控制的准确性,步骤S71包括:
所述被控终端根据接收到的语音数据确定对应的噪音设备标识;
所述被控终端根据预设的衰减信息与噪音设备标识之间的映射关系,获取确定者噪音设备标识所对应的衰减信息;
所述被控终端根据获取到的衰减信息调节对应的所述第二音频信号。
在本实施例中,被控设备所处的环境中可能存在多个噪音设备,例如在被控设备为空调器时,室内的电视机以及收音机等均作为噪音设备会对空调器的语音控制造成干扰,故需要在被控终端中保存预设的衰减信息与环境噪音拾取装置标识或噪音设备标识之间的映射关系。
本领域技术人员可以理解的是,在设置有多个噪音设备时,被控终端可能同时接收到多个噪音设备发送的语音数据,故,为识别不同的噪音设备发送的语音数据,噪音设备在发送语音数据时,可在语音数据中添加噪音设备标识,被控终端根据接收到的语音数据确定对应的噪音设备标识,并根据预设的衰减信息与噪音设备标识之间的映射关系,获取确定的噪音设备标识所对应的衰减信息,被控终端根据获取到的衰减信息调节对应的第二音频信号,保证对第二音频信号进行衰减调节的准确性,即提高语音控制的准确性。
进一步地,为提高语音控制的准确性,步骤S50之前还包括:
在侦测到噪音设备发送的音频播放指令时,所述被控终端基于接收到的音频播放指令确定待播放第三音频信号的播放时间以及强度信息;
在接收到噪音设备播放的第三音频信号时,获取接收到的第三音频信号的强度信息、该第三音频信号的接收时间以及噪音设备的标识或者接收该噪音设备第三音频信号的环境噪音拾取装置的标识;
基于该接收到的第三音频信号的强度信息以及该第三音频信号的接收时间,以及确定的待播放第三音频信号的播放时间以及强度信息,生成对应的衰减信息;
将生成的衰减信息与所述噪音设备的标识或者环境噪音拾取装置的标识关联保存。
在本实施例中,可在被控终端仅接收到噪音设备播放的第三音频信号时,确定第三音频信号的衰减信息。噪音设备可通过在播放第三音频信号之前向被控终端发送第三音频信号播放时间以及强度信息,以供被控终端确定接收到的第三音频信号的播放时间和强度信息,该播放时间可为一个时间点如8:00播放,也可为一个时间间隔,如5min之后播放,在接收到的播放时间为时间间隔时,被控终端基于接收到该播放时间间隔的时间点以及时间间隔,确定第三音频信号的播放时间。
本领域技术人员可以理解的是,噪音设备也可在播放第三音频信号后,向被控终端发送播放时间以及强度信息,被控终端基于该接收到的第三音频信号的强度信息以及该第三音频信号的接收时间,以及确定的待播放第三音频信号的播放时间以及强度信息,生成对应的衰减信息,被控终端将生成的衰减信息与所述噪音设备的标识关联保存。
进一步地,为提高语音控制效率,步骤S50之后还包括:
步骤S60,在接收到噪音设备反馈的语音数据时,响应所述第一音频信号。
本领域技术人员可以理解的是,在未接收到噪音设备反馈的语音数据时,说明侦测到的第一音频信号中不包括噪音设备播放的第二音频信号,为提高语音控制效率,可提取第一音频信号的声纹特征,并将提取的声纹特征与预设的声纹特征进行比对,在提取的声纹特征与预设的声纹特征匹配时,响应该接收到的第一音频信号。
上述本发明实施例序号仅仅为了描述,不代表实施例的优劣。通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。基于这样的理解,本发明的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质(如ROM/RAM、磁碟、光盘)中,包括若干指令用以使得一台终端设备(可以是手机,计算机,服务器,或者网络设备等)执行本发明各个实施例所述的方法。
以上所述仅为本发明的优选实施例,并非因此限制本发明的专利范围,凡是利用本发明说明书及附图内容所作的等效结构变换,或直接或间接运用在其他相关的技术领域,均同理包括在本发明的专利保护范围内。

Claims (18)

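  1. A voice control method, characterized by comprising:
    a controlled terminal detecting, in real time or at regular intervals, voice data sent by a noise device, and obtaining the playback time point of the detected voice data;
    upon detecting a first audio signal, the controlled terminal determining the voice data corresponding to the playback time point that matches the current time point, and converting the determined voice data into a second audio signal;
    the controlled terminal removing the portion of the first audio signal that matches the second audio signal, so as to generate a voice control command;
    the controlled terminal responding to the generated voice control command.
  2. The method according to claim 1, characterized in that the step of the controlled terminal removing the portion of the first audio signal that matches the second audio signal, so as to generate a voice control command, comprises:
    the controlled terminal adjusting the second audio signal according to preset attenuation information;
    the controlled terminal comparing the adjusted second audio signal with the first audio signal;
    the controlled terminal removing the portion of the first audio signal that matches the adjusted second audio signal, and generating the voice control command.
  3. The method according to claim 2, characterized in that the step of the controlled terminal adjusting the second audio signal according to preset attenuation information comprises:
    the controlled terminal determining the corresponding noise device identifier from the received voice data;
    the controlled terminal obtaining, from a preset mapping between attenuation information and noise device identifiers, the attenuation information corresponding to the determined noise device identifier;
    the controlled terminal adjusting the corresponding second audio signal according to the obtained attenuation information.
  4. The method according to claim 1, characterized in that, before the step of the controlled terminal detecting, in real time or at regular intervals, voice data sent by a noise device, and obtaining the playback time point of the detected voice data, the method further comprises:
    upon detecting an audio playback instruction sent by the noise device, the controlled terminal determining, from the received audio playback instruction, the playback time and intensity information of a third audio signal to be played;
    upon receiving the third audio signal played by the noise device, obtaining the intensity information of the received third audio signal, the reception time of the third audio signal, and the identifier of the noise device or the identifier of the ambient noise pickup device that received the third audio signal;
    generating the corresponding attenuation information from the intensity information and reception time of the received third audio signal, together with the determined playback time and intensity information of the third audio signal to be played;
    storing the generated attenuation information in association with the identifier of the noise device or the identifier of the ambient noise pickup device.
  5. The method according to claim 1, characterized in that, after the step of the controlled terminal detecting, in real time or at regular intervals, voice data sent by a noise device, and obtaining the playback time point of the detected voice data, the method comprises:
    when a first audio signal is detected and none of the playback time points corresponding to the received voice data matches the current time point, the controlled terminal responding to the first audio signal.
  6. The method according to claim 5, characterized in that the step of the controlled terminal responding to the first audio signal when a first audio signal is detected and none of the playback time points corresponding to the received voice data matches the current time point comprises:
    when a first audio signal is detected and none of the playback time points corresponding to the received voice data matches the current time point, extracting the voiceprint feature of the first audio signal and comparing the extracted voiceprint feature with a preset voiceprint feature;
    when the extracted voiceprint feature matches the preset voiceprint feature, responding to the received first audio signal.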
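  7. A voice control method, characterized by comprising:
    upon detecting a first audio signal, a controlled terminal sending a voice data acquisition request to a noise device, so that the noise device, when receiving voice data, feeds back to the controlled terminal the voice data whose playback time point matches the current time point;
    upon receiving the voice data fed back by the noise device, the controlled terminal converting the voice data into a second audio signal;
    the controlled terminal removing the portion of the first audio signal that matches the second audio signal, so as to generate a voice control command;
    the controlled terminal responding to the generated voice control command.
  8. The method according to claim 7, characterized in that the step of the controlled terminal removing the portion of the first audio signal that matches the second audio signal, so as to generate a voice control command, comprises:
    the controlled terminal adjusting the second audio signal according to preset attenuation information;
    the controlled terminal comparing the adjusted second audio signal with the first audio signal;
    the controlled terminal removing the portion of the first audio signal that matches the second audio signal, and generating the voice control command.
  9. The method according to claim 8, characterized in that the step of the controlled terminal adjusting the second audio signal according to preset attenuation information comprises:
    the controlled terminal determining the corresponding noise device identifier from the received voice data;
    the controlled terminal obtaining, from a preset mapping between attenuation information and noise device identifiers, the attenuation information corresponding to the determined noise device identifier;
    the controlled terminal adjusting the corresponding second audio signal according to the obtained attenuation information.
  10. The method according to claim 7, characterized in that, before the step of the controlled terminal, upon detecting a first audio signal, sending a voice data acquisition request to a noise device, so that the noise device, when receiving voice data, feeds back to the controlled terminal the voice data whose playback time point matches the current time point, the method comprises:
    upon detecting an audio playback instruction sent by the noise device, the controlled terminal determining, from the received audio playback instruction, the playback time and intensity information of a third audio signal to be played;
    upon receiving the third audio signal played by the noise device, obtaining the intensity information of the received third audio signal, the reception time of the third audio signal, and the identifier of the noise device or the identifier of the ambient noise pickup device that received the third audio signal from the noise device;
    generating the corresponding attenuation information from the intensity information and reception time of the received third audio signal, together with the determined playback time and intensity information of the third audio signal to be played;
    storing the generated attenuation information in association with the identifier of the noise device or the identifier of the ambient noise pickup device.
  11. The method according to claim 7, characterized in that, after the step of the controlled terminal, upon detecting a first audio signal, sending a voice data acquisition request to a noise device, so that the noise device, when receiving voice data, feeds back to the controlled terminal the voice data whose playback time point matches the current time point, the method comprises:
    when no voice data fed back by the noise device is received, responding to the first audio signal.
  12. The method according to claim 11, characterized in that the step of responding to the first audio signal when no voice data fed back by the noise device is received comprises:
    when no voice data fed back by the noise device is received, extracting the voiceprint feature of the first audio signal and comparing the extracted voiceprint feature with a preset voiceprint feature;
    when the extracted voiceprint feature matches the preset voiceprint feature, responding to the received first audio signal.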
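  13. A voice control system, characterized by comprising:
    a detection module, configured to detect, in real time or at regular intervals, voice data sent by a noise device;
    an acquisition module, configured to obtain the playback time point of the detected voice data;
    a determination module, configured to determine, upon detection of a first audio signal, the voice data corresponding to the playback time point that matches the current time point;
    a conversion module, configured to convert the determined voice data into a second audio signal;
    a processing module, configured to remove the portion of the first audio signal that matches the adjusted second audio signal, so as to generate a voice control command;
    a response module, configured to respond to the generated voice control command.
  14. The system according to claim 13, characterized in that the processing module comprises:
    an adjustment unit, configured to adjust the second audio signal according to preset attenuation information;
    a comparison unit, configured to compare the adjusted second audio signal with the first audio signal;
    a processing unit, configured to remove the portion of the first audio signal that matches the second audio signal, and to generate the voice control command.
  15. The system according to claim 14, characterized in that the adjustment unit comprises:
    a determination subunit, configured to determine the corresponding noise device identifier from the received voice data;
    an acquisition subunit, configured to obtain, from a preset mapping between attenuation information and noise device identifiers, the attenuation information corresponding to the determined noise device identifier;
    an adjustment subunit, configured to adjust the corresponding second audio signal according to the obtained attenuation information.
  16. The system according to claim 13, characterized in that the determination module is further configured so that, upon detection of an audio playback instruction sent by the noise device, the controlled terminal determines, from the received audio playback instruction, the playback time and intensity information of a third audio signal to be played; the acquisition module is further configured to obtain, upon receipt of the third audio signal played by the noise device, the intensity information of the received third audio signal, the reception time of the third audio signal, and the identifier of the noise device or the identifier of the ambient noise pickup device that received the third audio signal from the noise device; the system further comprises a generation module and a storage module, the generation module being further configured to generate the corresponding attenuation information from the intensity information and reception time of the received third audio signal, together with the determined playback time and intensity information of the third audio signal to be played; the storage module being further configured to store the generated attenuation information in association with the identifier of the noise device or the identifier of the ambient noise pickup device.
  17. The system according to claim 13, characterized in that the response module is further configured to respond to the first audio signal when a first audio signal is detected and none of the playback time points corresponding to the received voice data matches the current time point.
  18. The system according to claim 17, characterized in that the response module is further configured to extract the voiceprint feature of the first audio signal, compare the extracted voiceprint feature with a preset voiceprint feature, and respond to the received first audio signal when the extracted voiceprint feature matches the preset voiceprint feature.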
PCT/CN2014/091948 2014-05-29 2014-11-21 Voice control method and system WO2015180430A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410234257.3 2014-05-29
CN201410234257.3A CN105280184A (zh) 2014-05-29 2014-05-29 Voice control method and system

Publications (1)

Publication Number Publication Date
WO2015180430A1 true WO2015180430A1 (zh) 2015-12-03

Family

ID=54698012

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/091948 WO2015180430A1 (zh) 2014-05-29 2014-11-21 Voice control method and system

Country Status (2)

Country Link
CN (1) CN105280184A (zh)
WO (1) WO2015180430A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109389976A (zh) * 2018-09-27 2019-02-26 珠海格力电器股份有限公司 智能家电设备控制方法、装置、智能家电设备及存储介质
CN113504756B (zh) * 2021-07-23 2023-03-21 巨翊科技(上海)有限公司 一种基于语音电路的延时方法、系统和电路保护法及装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1322348A (zh) * 1999-09-23 2001-11-14 皇家菲利浦电子有限公司 语音识别设备和消费者电子系统
CN1397062A (zh) * 2000-12-29 2003-02-12 祖美和 声音控制电视接收设备以及声音控制方法
CN102890936A (zh) * 2011-07-19 2013-01-23 联想(北京)有限公司 一种音频处理方法、终端设备及系统
CN102915732A (zh) * 2012-10-31 2013-02-06 黑龙江省电力有限公司信息通信分公司 抑制背景广播的语音指令识别方法与装置
CN103050116A (zh) * 2012-12-25 2013-04-17 安徽科大讯飞信息科技股份有限公司 语音命令识别方法及系统
CN103559878A (zh) * 2013-09-04 2014-02-05 张家港保税区润桐电子技术研发有限公司 一种消除音频信息中的噪声的方法及装置

Also Published As

Publication number Publication date
CN105280184A (zh) 2016-01-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14893423

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14893423

Country of ref document: EP

Kind code of ref document: A1