WO2023206686A1 - Control method for smart device, and storage medium and electronic apparatus - Google Patents

Control method for smart device, and storage medium and electronic apparatus

Info

Publication number
WO2023206686A1
Authority
WO
WIPO (PCT)
Prior art keywords
delay
target
initial
function value
sound signal
Prior art date
Application number
PCT/CN2022/095335
Other languages
French (fr)
Chinese (zh)
Inventor
郝斌
Original Assignee
青岛海尔科技有限公司
海尔智家股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 青岛海尔科技有限公司, 海尔智家股份有限公司
Publication of WO2023206686A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803 Home automation networks
    • H04L12/2816 Controlling appliance services of a home automation network by calling their functionalities
    • H04L12/282 Controlling appliance services of a home automation network by calling their functionalities based on user interaction within the home
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0852 Delays

Definitions

  • the present disclosure relates to the field of smart homes, and specifically to a control method for a smart device, a storage medium, and an electronic device.
  • a voice wake-up signal is used to wake up the voice interaction function of the smart device
  • voice control instructions are used to control and trigger the voice interaction function of the smart device.
  • the same scene may include multiple devices configured with voice interaction functions, and users usually only need one device to respond to voice control instructions at the same time. For example, only the device close to the user needs to execute voice control instructions.
  • the nearest wake-up determination algorithm can be used to wake up the device. Nearby wake-up means that devices closest to the speaker respond first.
  • the control method of the above-mentioned smart device works well in scenarios with a high signal-to-noise ratio.
  • in scenarios with a low signal-to-noise ratio and high reverberation, however, the energy received by the microphone includes not only the sound source but also noise and reverberation, such as earlier sound-source signals reflected from objects in the environment, and other noise.
  • in such scenarios, the control methods of the above-mentioned smart devices perform poorly.
  • the control method of smart devices in related technologies therefore suffers from poor device control accuracy in scenarios with a low signal-to-noise ratio and high reverberation, due to the presence of noise and reverberation.
  • Embodiments of the present disclosure provide a control method for a smart device, a storage medium, and an electronic device, to at least solve the problem that the control method of smart devices in related technologies has poor device control accuracy in scenarios with a low signal-to-noise ratio and high reverberation, due to the presence of noise and reverberation.
  • a method for controlling a smart device, including: acquiring a first sound signal received by a first device and a second sound signal received by a second device, wherein the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object; determining a target cross power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal; determining a target delay between the first device and the second device according to the target cross power spectrum, wherein the target delay is the difference between the times at which the operation execution signal reaches the first device and the second device; and determining a target device from the first device and the second device according to the target delay, and controlling the target device to perform the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
  • a control device for a smart device, including: an acquisition unit configured to acquire a first sound signal received by a first device and a second sound signal received by a second device, wherein the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object; a first determination unit configured to determine a target cross power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal; a second determination unit configured to determine a target delay between the first device and the second device according to the target cross power spectrum, wherein the target delay is the difference between the times at which the operation execution signal reaches the first device and the second device; and an execution unit configured to determine a target device from the first device and the second device according to the target delay, and to control the target device to perform the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
  • a computer-readable storage medium storing a computer program, wherein the computer program is configured to execute the above-mentioned control method for a smart device when run.
  • an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor executes the above-mentioned control method for a smart device through the computer program.
  • in the embodiments of the present disclosure, the smart device to be controlled is determined based on the cross-correlation function between the signals received by the two devices. The first sound signal received by the first device and the second sound signal received by the second device are obtained, wherein the first sound signal and the second sound signal are sound signals corresponding to the operation execution signal issued by the target object; the target cross power spectrum of the first sound signal and the second sound signal is determined according to the first sound signal and the second sound signal; the target delay between the first device and the second device is determined according to the target cross power spectrum, where the target delay is the difference between the times at which the operation execution signal arrives at the first device and at the second device; and the target device is determined from the first device and the second device according to the target delay and is controlled to perform the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
  • because the time difference between the two devices receiving the sound signal is determined from the cross power spectrum of the sound signals received by the two devices, the device closest to the user can be selected as the smart device to be controlled based on that time difference.
  • determining the device closest to the user from the arrival time difference of the operation execution signal reduces device miscontrol caused by the effect of noise, reverberation, and the like on signal energy, and improves the accuracy of device control. This solves the problem that the control method of smart devices in related technologies has poor device control accuracy in scenarios with a low signal-to-noise ratio and high reverberation, due to the presence of noise and reverberation.
  • Figure 1 is a schematic diagram of the hardware environment of an optional smart device control method according to an embodiment of the present disclosure
  • Figure 2 is a schematic flowchart of an optional smart device control method according to an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of another optional smart device control method according to an embodiment of the present disclosure.
  • Figure 4 is a structural block diagram of an optional intelligent device control device according to an embodiment of the present disclosure.
  • FIG. 5 is a structural block diagram of an optional electronic device according to an embodiment of the present disclosure.
  • a method for controlling a smart device is provided.
  • the control method of this smart device is widely used in whole-house intelligent digital control application scenarios such as smart homes, smart home device ecosystems, and smart residence (Intelligence House) ecosystems.
  • the above intelligent device control method can be applied to a hardware environment composed of a terminal device 102 and a server 104 as shown in FIG. 1 .
  • the server 104 is connected to the terminal device 102 through the network and can be used to provide services (such as application services, etc.) for the terminal or the client installed on the terminal.
  • a database can be set up on the server or independently from the server.
  • cloud computing and/or edge computing services can be configured on the server or independently of the server to provide data computing services for the server 104.
  • the above-mentioned network may include but is not limited to at least one of the following: wired network, wireless network.
  • the above-mentioned wired network may include but is not limited to at least one of the following: wide area network, metropolitan area network, and local area network.
  • the above-mentioned wireless network may include at least one of the following: WIFI (Wireless Fidelity), Bluetooth.
  • the terminal device 102 may be, but is not limited to, a PC, a mobile phone, a tablet, a smart air conditioner, a smart hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, a smart projection device, a smart TV, a smart clothes drying rack, smart curtains, smart audio and video equipment, smart sockets, smart audio equipment, smart speakers, smart fresh air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, a smart sweeping robot, a smart window cleaning robot, a smart mopping robot, smart air purification equipment, smart steamers, smart microwave ovens, smart kitchen appliances, smart purifiers, smart water dispensers, smart door locks, etc.
  • the control method of the smart device in the embodiment of the present disclosure may be executed by the server 104, may be executed by the terminal device 102, or may be executed jointly by the server 104 and the terminal device 102.
  • the terminal device 102 may also execute the control method of the smart device according to the embodiment of the present disclosure by a client installed thereon.
  • FIG. 2 is a schematic flowchart of an optional smart device control method according to an embodiment of the present disclosure. As shown in FIG. 2, the process of this method can include the following steps:
  • Step S202 Obtain the first sound signal received by the first device and the second sound signal received by the second device, where the first sound signal and the second sound signal are sound signals corresponding to the operation execution signal sent by the target object.
  • the control method of smart devices in this embodiment can be applied to scenarios where there are multiple smart devices that are allowed to use the same operation execution signal to control the execution of corresponding device operations.
  • the operation execution signal can be a device wake-up signal, or a signal that controls the device to perform other device operations.
  • the device wake-up signal can contain the wake-up word of the smart device, and the smart device can respond to the received wake-up signal to wake it up.
  • Smart devices may be smart home devices, which may include but are not limited to smart home appliances, such as the above-mentioned smart air conditioners, smart refrigerators, smart ovens, etc. In this embodiment, the type of smart device is not limited.
  • the target object can send an operation execution signal.
  • the operation execution signal can be used to instruct the execution of the corresponding device operation, and can be responded to by both the first device and the second device.
  • for example, the operation execution signal may be a voice control signal such as a target wake-up signal, and the wake-up word carried by the target wake-up signal can wake up the first device and the second device at the same time.
  • the first device and the second device can respectively collect sounds through the sound collecting components on them to obtain the first sound signal and the second sound signal.
  • the sound collecting component can be a microphone or a microphone array.
  • the first sound signal is a sound signal collected by the first microphone in the first microphone array on the first device, and the second sound signal is a sound signal collected by the second microphone in the second microphone array on the second device.
  • the first device and the second device can respectively send the collected sound signals to the server.
  • the server may receive the sent sound signal from the first device and the second device respectively, thereby acquiring the first sound signal and the second sound signal.
  • what the first device and the second device send to the server may be sound signals collected by their microphone arrays.
  • the server can select the sound signal collected by the first microphone from the sound signals collected by the first microphone array to obtain the first sound signal (which may be the sound signal of the first channel), and select the sound signal collected by the second microphone from the sound signals collected by the second microphone array to obtain the second sound signal (which may be the sound signal of the second channel).
  • the method of selecting the first microphone and the second microphone may be random selection or selection based on a sequence of microphones, which is not limited in this embodiment.
  • the correspondence between the sound signals collected by the first device and the second device can be obtained by matching on the collection time of the sound signals, the signal characteristics of the sound signals, and so on. That is, the collection times and signal characteristics of the sound signals of the first device and the second device are matched to determine the sound signal of the second device that matches the sound signal collected by the first device.
  • Step S204 Determine the target cross power spectrum of the first sound signal and the second sound signal based on the first sound signal and the second sound signal.
  • the server can select the smart device to be controlled from the multiple devices.
  • the method of selecting the smart device to be controlled can be: using energy to select the nearest device.
  • using energy to select nearby devices requires ensuring that the selected audio signal (sound signal) must be similar to the sound source signal to ensure the accuracy of device selection. Under low signal-to-noise ratio and high reverberation conditions, the accuracy of selecting the nearest device among multiple devices is poor.
  • the time difference between the sound signals received by the two devices can be determined based on the cross power spectrum of the sound signals received by the two devices, so that the device closest to the user can be selected based on the time difference.
  • determining the device closest to the user based on the arrival time difference of the operation execution signal can reduce device miscontrol caused by the effect of noise, reverberation, and the like on signal energy, thereby improving the accuracy of device control.
  • the server may first calculate the cross power spectrum of the first sound signal and the second sound signal to obtain the target cross power spectrum.
  • the cross power spectrum of the two sound signals may be calculated in one or more ways.
  • for example, the first sound signal and the second sound signal may be converted into the frequency domain, and their cross power spectrum determined from the converted frequency domain signals.
  • the cross power spectrum of the sound signals can also be determined by other methods, which is not limited in this embodiment.
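
The frequency-domain route described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `cross_power_spectrum` is hypothetical, and a single FFT over the whole signal is assumed instead of any framing the embodiment may use.

```python
import numpy as np

def cross_power_spectrum(x1, x2):
    """Cross power spectrum of two equal-length real signals: X1(f) * conj(X2(f))."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    if x1.shape != x2.shape:
        raise ValueError("signals must have the same length")
    X1 = np.fft.rfft(x1)  # one-sided spectrum of the first sound signal
    X2 = np.fft.rfft(x2)  # one-sided spectrum of the second sound signal
    return X1 * np.conj(X2)
```

When both inputs are the same signal, the result reduces to the (real, non-negative) power spectrum, which is a quick sanity check.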
  • Step S206 Determine the target delay between the first device and the second device according to the target cross power spectrum, where the target delay is the time difference between the operation execution signal arriving at the first device and the second device.
  • the server can determine the target delay between the first device and the second device.
  • the target delay here is the time difference between the operation execution signal arriving at the first device and at the second device, that is, the difference between the time the operation execution signal reaches the first device and the time it reaches the second device.
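
This passage does not say how the delay is derived from the cross power spectrum. One common approach, shown here only as an assumed sketch, is to inverse-transform the cross power spectrum into a cross-correlation and take the lag of its peak. The function name `estimate_delay`, the sampling rate `fs`, and the sign convention are assumptions for illustration, and both signals are assumed to be on a common time base.

```python
import numpy as np

def estimate_delay(x1, x2, fs):
    """Delay of x1 relative to x2 in seconds; positive means x1 arrived later."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n = len(x1) + len(x2)            # zero-pad so circular correlation acts as linear
    X1 = np.fft.rfft(x1, n)
    X2 = np.fft.rfft(x2, n)
    cps = X1 * np.conj(X2)           # cross power spectrum
    cc = np.fft.irfft(cps, n)        # cross-correlation r[k] = sum_m x1[m + k] * x2[m]
    max_lag = n // 2
    # Reorder so that index 0 corresponds to lag -max_lag and the centre to lag 0.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lag = int(np.argmax(cc)) - max_lag
    return lag / fs
```

With a copy of a signal shifted by 5 samples at 16 kHz, the function returns 5/16000 s, and swapping the arguments flips the sign.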
  • Step S208 Determine the target device from the first device and the second device according to the target delay, and control the target device to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object.
  • the server may determine the device closest to the target object among the first device and the second device, and determine the determined device as the target device.
  • the determined target device may be the smart device to be controlled. For example, for a scenario where the operation execution signal is a target wake-up signal, the target device to be woken up among the first device and the second device may be determined.
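
The selection step can be illustrated with a small sketch. It assumes the delay is defined as the arrival time at the first device minus the arrival time at the second device, and that a tie goes to the first device; the function name and the device labels are hypothetical.

```python
def select_target_device(delay_seconds, first_device="first", second_device="second"):
    """Pick the device that the operation execution signal reached first.

    Assumes delay = (arrival time at first device) - (arrival time at second device).
    """
    # A non-positive delay means the first device heard the signal no later than
    # the second device, so it is treated as the closer one (tie rule is an assumption).
    return first_device if delay_seconds <= 0 else second_device
```

For example, a delay of -1 ms selects the first device, and a delay of +2 ms selects the second.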
  • the server may control the target device to perform a device operation indicated by the operation execution signal.
  • the operation execution signal may carry an operation execution instruction
  • the target device may perform the device operation indicated by the operation execution instruction carried in the operation execution signal.
  • for example, if the operation execution instruction is an instruction to increase the temperature, the device closest to the user can be selected from two devices that are allowed to execute that instruction, such as a water heater and a bathroom heater, to perform the operation of increasing the temperature.
  • if the operation execution signal is a target wake-up signal, the device other than the target device among the first device and the second device can be controlled to switch from the wake-up state to the sleep state.
  • the target device can be controlled to remain in the awake state, so that it can collect subsequent voice control signals sent by the target object through its sound collection component and respond to the collected voice control signals by executing the control operations they indicate. This is not limited in this embodiment.
  • although two devices are used as an example to illustrate the scenario of selecting the nearest device for control, this embodiment is not limited to two devices.
  • through the above steps, the first sound signal received by the first device and the second sound signal received by the second device are obtained, wherein the first sound signal and the second sound signal are sound signals corresponding to the operation execution signal sent by the target object; the target cross power spectrum of the first sound signal and the second sound signal is determined; the target delay between the first device and the second device is determined according to the target cross power spectrum, where the target delay is the difference between the times at which the operation execution signal arrives at the first device and at the second device; and the target device is determined from the first device and the second device according to the target delay and is controlled to execute the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
  • this solves the problem that the control method of smart devices in related technologies has poor device control accuracy in scenarios with a low signal-to-noise ratio and high reverberation, due to the presence of noise and reverberation, and improves the accuracy of device control.
  • determining the target cross power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal includes:
  • the server when determining the cross power spectrum of the first sound signal and the second sound signal, may obtain the initial cross power spectrum of the first sound signal and the second sound signal.
  • the initial cross power spectrum may be obtained by computing the spectra of the first sound signal and the second sound signal to obtain the first frequency domain signal and the second frequency domain signal, and then calculating the cross power spectrum of the first frequency domain signal and the second frequency domain signal.
  • for example, the two devices are device A and device B.
  • one signal is selected from the voice signals collected by the microphone array of each of device A and device B.
  • the two resulting signals, $x_1$ and $x_2$, belong to device A and device B respectively, as shown in formula (1):
  • x is the microphone received signal
  • i is the microphone serial number
  • n is the time domain sampling point
  • s is the sound source signal
  • d is the noise.
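
Formula (1) is not reproduced in this text. Based only on the symbol definitions above, one plausible additive signal model is sketched below; this is an assumption, not the patent's verbatim formula, and the original may, for example, include a room response applied to the source term.

```latex
x_i(n) = s_i(n) + d_i(n), \qquad i = 1, 2
```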
  • the short-time Fourier transform (STFT) is performed on the two signals respectively.
  • the frequency domain signals obtained after the transform are shown in formula (2):
  • the cross power spectrum of the two signals can be determined from the two frequency domain signals. For how to determine the cross power spectrum of two signals, refer to related technologies; this is not described in detail in this embodiment.
  • the cross power spectrum can also be smoothed, and the smoothed cross power spectrum can be shown in formula (3):
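
Formulas (2) and (3) are not reproduced in this text. The sketch below shows one conventional way to realise the two steps, under stated assumptions: a Hann-windowed STFT with hypothetical frame and hop sizes, and first-order recursive averaging across frames with a smoothing factor `alpha` whose value here is illustrative only. It is not claimed to match the patent's formulas.

```python
import numpy as np

def stft(x, frame_len=512, hop=256):
    """Short-time Fourier transform with a Hann window; returns shape (frames, bins)."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[l * hop: l * hop + frame_len] * window
                       for l in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def smoothed_cross_power_spectrum(X1, X2, alpha=0.9):
    """Recursive average: P(l,f) = alpha*P(l-1,f) + (1-alpha)*X1(l,f)*conj(X2(l,f))."""
    inst = X1 * np.conj(X2)          # instantaneous cross power spectrum per frame
    P = np.empty_like(inst)
    P[0] = inst[0]                   # initialise with the first frame (assumption)
    for l in range(1, len(inst)):
        P[l] = alpha * P[l - 1] + (1.0 - alpha) * inst[l]
    return P
```

A larger `alpha` gives a steadier estimate at the cost of slower tracking; with `alpha=0` the function returns the unsmoothed per-frame cross power spectrum.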
  • the server may also obtain a target reverberation gain value corresponding to the first device and the second device, where the target reverberation gain value is a gain value used to suppress reverberation noise.
  • the target reverberation gain value may be determined based on the sound signal received by the first device that matches the operation execution signal and the sound signal received by the second device that matches the operation execution signal, for example, the first sound signal and the second sound signal; it may also be determined based on signals other than the first sound signal and the second sound signal.
  • the method of determining the reverberation gain value based on the sound signal can be any method capable of determining the reverberation gain value between sound signals, which is not limited in this embodiment.
  • the target reverberation gain value can be used to update the initial cross power spectrum to obtain the target cross power spectrum.
  • one way to update the initial cross power spectrum is to smooth it using the target reverberation gain value; the update may also be performed in other ways, and the way of updating the initial cross power spectrum is not limited in this embodiment.
  • a gain value can also be used to suppress other noise, such as stationary noise and non-stationary noise, which is not described in detail in this embodiment.
  • because the cross power spectrum is updated with the reverberation gain value, it better characterizes the relationship between the sound source signals, which improves the accuracy of device control.
  • obtaining target reverberation gain values corresponding to the first device and the second device includes:
  • multiple sound signals received by each device can be obtained respectively.
  • the multiple sound signals are all sound signals corresponding to the operation execution signal; they can be sound signals of different channels in the same microphone array of the same device, that is, sound signals collected by different microphones.
  • the method of obtaining multiple sound signals received by each device may be: selecting multiple microphones according to the microphone serial numbers in the microphone array of each device, and determining the sound signals received by the selected microphones as the multiple sound signals. Other methods may also be used to obtain the multiple sound signals corresponding to each device, which is not limited in this embodiment.
  • the server can determine the coherence functions of the multiple sound signals and obtain the target coherence function.
  • the method of determining the coherence functions of multiple sound signals may be: determining the coherence function of any two sound signals among the multiple sound signals to obtain multiple coherence functions.
  • the target coherence function may include multiple coherence functions, or be obtained from multiple coherence functions.
  • for any two sound signals, for example a third sound signal and a fourth sound signal, the coherence function of the third sound signal and the fourth sound signal is determined based on their cross power spectrum, and can be as shown in formula (4):
  • $x_3$ may be the third sound signal
  • $x_4$ may be the fourth sound signal
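
Formula (4) is not reproduced in this text. For reference, the standard definition of the complex coherence between two signals normalises their cross power spectrum by the geometric mean of their auto power spectra, as below. The symbol names $\Gamma$ and $\Phi$ and the frame and frequency indices $l$ and $f$ are notation assumed here, and the patent's formula may differ in detail.

```latex
\Gamma_{x_3 x_4}(l,f) = \frac{\Phi_{x_3 x_4}(l,f)}{\sqrt{\Phi_{x_3 x_3}(l,f)\,\Phi_{x_4 x_4}(l,f)}}
```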
  • the reverberation suppression coefficient of each device can be estimated through the target coherence function, and the estimation method can be as shown in formula (5):
  • $\Gamma_n(f) = \operatorname{sinc}(2\pi f d'/c)$
  • $d'$ is the distance between the microphones that receive the two sound signals (the spacing is known)
  • $c$ is the speed of sound.
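
The sinc term above can be evaluated numerically as sketched below, assuming that sinc denotes the unnormalised form sin(x)/x, which is the usual convention for diffuse-field coherence. Note that `numpy.sinc(t)` computes sin(πt)/(πt), so the argument must be rescaled. The function name and the default speed of sound are illustrative assumptions.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature (assumption)

def diffuse_coherence(freqs_hz, mic_distance_m, c=SPEED_OF_SOUND):
    """Evaluate sinc(2*pi*f*d'/c) using the unnormalised sinc, sin(x)/x."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    # np.sinc(t) = sin(pi*t)/(pi*t); passing t = 2*f*d'/c yields sin(x)/x
    # with x = 2*pi*f*d'/c.
    return np.sinc(2.0 * freqs_hz * mic_distance_m / c)
```

The value is 1 at 0 Hz and first crosses zero at f = c / (2 d'), which for a 5 cm spacing is about 3.43 kHz.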
  • the reverberation gain value of each device may be further determined, that is, the target reverberation gain value.
  • the way to determine the reverberation gain value of each device can be as shown in formula (6):
  • $G(l,f)$ is the reverberation gain value
  • $G_{\min}$ is the minimum reverberation gain value
  • is the preset value, which can be 0.9.
  • the reverberation gain value may be determined based on sound signals respectively selected by different devices.
  • alternatively, the reverberation gain value of the first device and the reverberation gain value of the second device may be the same. In that case, at least one sound signal received by each of the first device and the second device is obtained, giving multiple sound signals; the coherence functions of the multiple sound signals are determined to obtain the target coherence function; the reverberation suppression coefficient between the first device and the second device is estimated according to the target coherence function to obtain the target reverberation suppression coefficient; and the larger of the minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient is determined as the target reverberation gain value.
  • the third sound signal and the first sound signal may be the same sound signal or different sound signals, which is not limited in this embodiment.
  • the reverberation suppression coefficient is estimated based on the coherence function, and the reverberation gain value is determined based on the reverberation suppression coefficient, which can improve the accuracy of the reverberation gain value determination.
  • the obtaining a plurality of sound signals received by each of the first device and the second device includes:
  • the number of acquired sound signals received by each device may be two. That is, the server can obtain two sound signals (two channels) received by each device; the two sound signals are received by different microphones of the same microphone array of that device and correspond to the operation execution signal.
  • the server can determine the coherence function of the two sound signals in a manner similar to the previous embodiment to obtain the target coherence function; estimate the reverberation suppression coefficient of each device based on the target coherence function to obtain the target reverberation suppression coefficient; and determine the larger of the minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient as the reverberation gain value of each device.
  • the process of determining the reverberation gain value of each device is similar to the previous embodiment and will not be described again.
  • the reverberation gain value of each device is determined by separately obtaining two sound signals received by each device, which can improve the timeliness of determining the reverberation gain of the device.
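The flooring rule described above (taking the larger of the minimum reverberation gain value and the estimated gain) can be sketched in Python as follows. This is only an illustrative sketch: the function name, the example floor of 0.1, and the per-frequency-bin list layout are assumptions, and the derivation of the gain from the reverberation suppression coefficient is not reproduced here.

```python
def floor_reverb_gain(estimated_gains, min_gain=0.1):
    """Return max(min_gain, g) for every per-frequency-bin gain g.

    estimated_gains: gains already derived from the target reverberation
        suppression coefficient (that derivation is outside this sketch).
    min_gain: the minimum reverberation gain value; 0.1 is an assumed example.
    """
    return [max(min_gain, g) for g in estimated_gains]


# Bins whose estimated gain falls below the floor are raised to the floor.
gains = floor_reverb_gain([0.02, 0.5, 0.9], min_gain=0.1)
```

The floor keeps the suppression from driving any bin to zero, which would otherwise remove that bin from the later cross power spectrum entirely.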
  • the target reverberation gain value includes a first reverberation gain value corresponding to the first device and a second reverberation gain value corresponding to the second device, for example, based on equation (5)
  • the first reverberation gain value and the second reverberation gain value may be the same or different, which is not limited in this embodiment.
  • the initial cross power spectrum is updated using the target reverberation gain value to obtain the target cross power spectrum, including:
  • the first frequency domain signal is a frequency domain signal corresponding to the first sound signal
  • the second frequency domain signal is a frequency domain signal corresponding to the second sound signal
  • S42 Perform a weighted summation operation on the initial cross power spectrum and the reverberation reference information to obtain the target cross power spectrum.
  • the server may multiply the product of the first reverberation gain value and the first frequency domain signal by the product of the second reverberation gain value and the conjugate of the second frequency domain signal to obtain the reverberation reference information. After obtaining the reverberation reference information, a weighted summation operation can be performed on the initial cross power spectrum and the reverberation reference information to obtain the target cross power spectrum.
  • $G_A(l,f)$ is the reverberation gain value of device A
  • $G_B(l,f)$ is the reverberation gain value of device B
  • $X_1(l,f)$ is the frequency domain signal of $x_1$
  • $X_2^*(l,f)$ is the conjugate of the frequency domain signal of $x_2$
  • the weighting coefficient of the weighted summation is a preset value, which can be any value in [0, 1], for example, 0.9.
  • the ability of the cross-power spectrum to represent the relationship between sound source signals can be improved. This improves the accuracy of equipment control.
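The update of the cross power spectrum can be sketched per frequency bin as below. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, and the choice that the preset weight `alpha` multiplies the initial cross power spectrum while `1 - alpha` multiplies the reverberation reference information is an assumption, since the text only states that a weighted summation is performed.

```python
def update_cross_power_spectrum(initial_cps, x1, x2, gain_a, gain_b, alpha=0.9):
    """Weighted sum of the initial cross power spectrum and the reverberation
    reference information, computed independently for each frequency bin."""
    target = []
    for phi0, a, b, ga, gb in zip(initial_cps, x1, x2, gain_a, gain_b):
        # Reverberation reference information: (G_A * X_1) * (G_B * conj(X_2)).
        reference = (ga * a) * (gb * b.conjugate())
        # Assumption: alpha weights the initial spectrum, 1 - alpha the reference.
        target.append(alpha * phi0 + (1.0 - alpha) * reference)
    return target


# One frame with a single frequency bin, using made-up values.
target_cps = update_cross_power_spectrum(
    initial_cps=[2 + 0j], x1=[2 + 0j], x2=[1 + 1j], gain_a=[0.5], gain_b=[1.0]
)
```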
  • determining the target delay between the first device and the second device according to the target cross power spectrum includes:
  • the first delay range is the interval whose endpoints are the negative of the delay threshold and the delay threshold.
  • the delay threshold is the value obtained by dividing the distance between the first device and the second device by the speed of sound.
  • the time delay between the first device and the second device may be determined based on a GCC-PHAT (Generalized Cross-Correlation with Phase Transform) function corresponding to the first device and the second device.
  • the server may determine the GCC-PHAT function corresponding to the first device and the second device according to the target cross power spectrum, where the GCC-PHAT function is a function with the time delay between the first device and the second device as a variable.
  • the GCC-PHAT function can be determined based on equation (8)
  • the value range of the delay is $\tau \in [-d/c, d/c]$, where $c$ is the speed of sound. This value range is the first delay range, that is, the interval whose endpoints are the negative of the delay threshold and the delay threshold (the delay threshold is $d/c$, where $d$ is the distance between the two devices).
  • the optimal delay can be searched for within the first delay range, that is, the delay $\tau$ within the first delay range that maximizes the function value of the GCC-PHAT function. Finding this delay is a global optimization problem. Interpolation (finding the optimum through a large number of interpolation points) or other methods can be used to find the delay that maximizes the function value of the GCC-PHAT function, which gives the target delay.
  • the delay between devices is determined through the GCC-PHAT function, which can improve the accuracy of delay determination.
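As an illustration of the preceding description, the sketch below evaluates a GCC-PHAT function at a candidate delay and computes the delay threshold $d/c$. It uses the textbook PHAT weighting (each bin of the cross power spectrum divided by its magnitude), which is assumed here because equation (8) is not reproduced in this text. The function names, the speed of sound of 343 m/s, and the synthetic test spectrum are likewise assumptions.

```python
import cmath
import math


def gcc_phat(cross_spectrum, freqs_hz, tau):
    """Evaluate a GCC-PHAT function at delay tau (in seconds)."""
    total = 0.0
    for phi, f in zip(cross_spectrum, freqs_hz):
        magnitude = abs(phi)
        if magnitude < 1e-12:
            continue  # skip empty bins to avoid dividing by zero
        # PHAT weighting keeps only the phase of each bin.
        total += ((phi / magnitude) * cmath.exp(2j * math.pi * f * tau)).real
    return total


def delay_threshold(distance_m, speed_of_sound=343.0):
    """Largest physically possible delay between two devices: d / c."""
    return distance_m / speed_of_sound


# Synthetic cross power spectrum for a signal reaching device 1 4 ms later.
freqs = [25.0 * k for k in range(1, 161)]  # 25 Hz to 4 kHz
true_delay = 0.004
cps = [cmath.exp(-2j * math.pi * f * true_delay) for f in freqs]
limit = delay_threshold(6.0)  # search range is [-limit, limit]
peak_value = gcc_phat(cps, freqs, true_delay)
```

With this sign convention a positive peak location means the signal reached the first device later than the second device.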
  • the delay that maximizes the function value of the generalized cross-correlation phase transformation function is found within the first delay range to obtain the target delay, which includes:
  • S61 Randomly select a delay within the second delay range to obtain a random delay, where the second delay range is an interval with zero and the delay threshold as endpoints;
  • S62 Determine the first reference function value corresponding to the generalized cross-correlation phase transformation function and the random delay, and the second reference function value corresponding to the generalized cross-correlation phase transformation function and the negative of the random delay;
  • the optimal delay can be found within the first delay range through an interpolation method.
  • the optimal $\tau$ can be obtained by performing a large number of interpolation calculations.
  • the negative of the delay threshold can be used as the starting point and a preset time interval as the step size to interpolate sequentially within the first delay range, obtaining a set of inserted delays; the function value of the GCC-PHAT function at each inserted delay is determined; and the delay with the largest function value is selected from the negative of the delay threshold, the set of inserted delays, and the delay threshold to obtain the target delay.
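The exhaustive interpolation just described can be sketched as a simple scan. The function name and the quadratic stand-in for the GCC-PHAT function are hypothetical and serve only to make the example runnable.

```python
def grid_search_delay(func, threshold, step):
    """Return the delay in [-threshold, threshold] with the largest func value,
    scanning from -threshold in increments of step."""
    candidates = []
    i = 0
    while -threshold + i * step < threshold:
        candidates.append(-threshold + i * step)
        i += 1
    candidates.append(threshold)  # always include the upper endpoint
    return max(candidates, key=func)


# Stand-in objective with a single peak at 4 ms (not a real GCC-PHAT curve).
best = grid_search_delay(lambda t: -(t - 0.004) ** 2, threshold=0.0175, step=0.0005)
```

The cost grows with the number of grid points, which is the motivation for the reference-delay approach described next.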
  • the convenience of interpolation processing can be improved by setting the reference delay and selecting the interpolation position based on the reference delay and the function value corresponding to the GCC-PHAT function and the reference delay.
  • the server can randomly select a delay within the second delay range to obtain a random delay.
  • the random delay can be a positive value.
  • the second delay range can be the interval whose endpoints are zero and the delay threshold. For example, assuming that the distance $d$ between two microphones is at most 6 m, a value is randomly selected in $(0, d/c]$ to obtain $\tau_0$ (an example of the random delay).
  • the server may also determine the first reference function value corresponding to the GCC-PHAT function and the random delay, and the second reference function value corresponding to the GCC-PHAT function and the negative of the random delay; determine the delay corresponding to the larger of the first reference function value and the second reference function value as the initial delay, that is, the above-mentioned reference delay; and, based on the initial delay and the negative of the initial delay, perform an interpolation operation within the first delay range to obtain the target delay.
  • the target delay is the delay that maximizes the function value of the GCC-PHAT function among the initial delay and the delay inserted by the interpolation operation.
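Selecting the initial (reference) delay can be sketched as follows. The function name is hypothetical, the objective is a stand-in for the GCC-PHAT function, and resolving a tie in favor of the positive delay is an assumption.

```python
import random


def pick_initial_delay(func, threshold, rng=random):
    """Draw a random delay in (0, threshold] and keep whichever of the delay
    and its negative gives the larger function value."""
    # rng.random() lies in [0, 1), so tau0 lies in (0, threshold].
    tau0 = threshold * (1.0 - rng.random())
    # Assumption: a tie keeps the positive delay.
    return tau0 if func(tau0) >= func(-tau0) else -tau0


# With a peak on the negative side, the negative of the random delay wins.
initial = pick_initial_delay(lambda t: -(t + 0.004) ** 2, threshold=0.0175)
```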
  • an interpolation operation is performed within the first delay range to obtain the target delay, including:
  • the following interpolation steps are performed cyclically until a loop stop condition is met, where the loop stop condition includes at least one of the following: the number of interpolation steps performed reaches a preset number (for example, 10), and the initial delay is within a preset delay range (which can be a delay range determined a priori, that is, a delay range set based on the allowed activity range of the target object). The initial delay after the loop ends is the target delay:
  • Step 1 Determine a first delay, where the first delay is a delay inserted between the initial delay and the delay threshold.
  • the first delay can be inserted between the initial delay and the delay threshold (for example, in $(\tau_1, d/c]$). The inserted delay $\tau_2$ (i.e., the first delay) may be $\tau_2 = \tau_1 + 0.6\,(\tau_1 - \tau_1')$, where $\tau_1$ is the initial delay and $\tau_1'$ is the negative of the initial delay; the coefficient 0.6 can also be other values.
  • Step 2 When the first function value corresponding to the generalized cross-correlation phase transformation function and the first delay is greater than the first reference function value, determine the second delay, where the second delay is a delay inserted between the initial delay and the first delay.
  • Step 3 When the second function value corresponding to the generalized cross-correlation phase transformation function and the second delay is greater than the first function value, use the second delay to update the initial delay to obtain an updated initial delay.
  • a second function value corresponding to the GCC-PHAT function and the second delay may be determined. If the second function value is greater than the first function value, the second delay can be used to update the initial delay to obtain an updated initial delay.
  • Step 4 When the second function value is smaller than the first function value, use the first delay to update the initial delay to obtain the updated initial delay.
  • Step 5 When the first function value is smaller than the first reference function value and larger than the second reference function value, determine the third delay, where the third delay is a delay inserted between the first delay and the delay threshold.
  • a third delay may be inserted between the first delay and the delay threshold.
  • the inserted value lies in $(\tau_2, d/c]$, and the coefficient can also take other values.
  • Step 6 When the third function value corresponding to the generalized cross-correlation phase transformation function and the third delay is greater than the first function value, use the third delay to update the initial delay to obtain an updated initial delay.
  • Step 7 When the third function value is smaller than the first function value, determine the fourth delay, and use the fourth delay to update the initial delay to obtain the updated initial delay, where the fourth delay is a delay inserted between zero and the negative of the initial delay.
  • Step 8 When the first function value is less than the second reference function value, determine the fifth delay, where the fifth delay is a delay inserted between the negative of the delay threshold and the negative of the initial delay.
  • Step 9 When the fourth function value corresponding to the generalized cross-correlation phase transformation function and the fifth delay is greater than the first reference function value, use the fifth delay to update the initial delay to obtain the updated initial delay.
  • Step 10 When the fourth function value is less than the first reference function value, determine the sixth delay, and use the sixth delay to update the initial delay to obtain the updated initial delay, where the sixth delay is a delay inserted between zero and the negative of the initial delay.
  • the target delay may be determined in a manner similar to that in the foregoing embodiment.
  • the first to sixth delays may be determined in a manner similar to that in the foregoing embodiment.
  • an interpolation operation is performed within the first delay range to obtain the target delay, including:
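  • when the first function value is greater than the first reference function value, the second delay is determined, where the second delay is the initial delay plus 0.8 times the difference between the first delay and the initial delay;
  • when the second function value is smaller than the first function value, the first delay is used to update the initial delay to obtain the updated initial delay;
  • when the first function value is smaller than the first reference function value and larger than the second reference function value, the third delay is determined, where the third delay is the initial delay plus 1.2 times the difference between the first delay and the initial delay;
  • when the third function value is smaller than the first function value, the fourth delay is determined and used to update the initial delay to obtain the updated initial delay, where the fourth delay is the negative of the initial delay plus 0.125 times the difference between the initial delay and the negative of the initial delay (for example, $\tau_{32} = \tau_1' + 0.125\,(\tau_1 - \tau_1')$);
  • when the first function value is less than the second reference function value, the fifth delay is determined, where the fifth delay is the initial delay plus 1.2 times the difference between the negative of the initial delay and the initial delay.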
  • the accuracy of target delay determination can be improved, and the accuracy of device control can be improved.
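The interpolation loop of Steps 1 to 10, with the coefficients 0.6, 0.8, 1.2 and 0.125 given above, can be sketched as follows. This is a hedged reading of the steps rather than a definitive implementation. Several points are assumptions: inserted delays are clamped to the first delay range, equal function values fall through to the later branch, only the iteration-count stop condition is modeled, and the function returns the best delay evaluated so far. The function name and the quadratic stand-in objective are hypothetical.

```python
def refine_delay(func, initial, threshold, max_iters=10):
    """Iteratively insert delays around the current initial delay and return
    the best delay seen (assumption: best-so-far tracking)."""

    def clamp(tau):
        # Assumption: keep every inserted delay inside [-threshold, threshold].
        return max(-threshold, min(threshold, tau))

    best, best_val = initial, func(initial)
    tau1 = initial
    for _ in range(max_iters):
        neg = -tau1                          # negative of the initial delay
        ref1, ref2 = func(tau1), func(neg)   # first / second reference values
        first = clamp(tau1 + 0.6 * (tau1 - neg))            # Step 1
        f1 = func(first)
        if f1 > ref1:                                       # Steps 2 to 4
            second = clamp(tau1 + 0.8 * (first - tau1))
            new = second if func(second) > f1 else first
        elif f1 > ref2:                                     # Steps 5 to 7
            third = clamp(tau1 + 1.2 * (first - tau1))
            fourth = clamp(neg + 0.125 * (tau1 - neg))
            new = third if func(third) > f1 else fourth
        else:                                               # Steps 8 to 10
            fifth = clamp(tau1 + 1.2 * (neg - tau1))
            sixth = clamp(neg + 0.125 * (tau1 - neg))
            new = fifth if func(fifth) > ref1 else sixth
        tau1 = new                           # updated initial delay
        value = func(new)
        if value > best_val:
            best, best_val = new, value
    return best


# Stand-in objective with a single peak at 4 ms (not a real GCC-PHAT curve).
objective = lambda t: -(t - 0.004) ** 2
refined = refine_delay(objective, initial=0.002, threshold=0.0175)
```

Each pass costs only a handful of function evaluations, compared with one evaluation per grid point in an exhaustive scan.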
  • determining the target device from the first device and the second device according to the target delay includes:
  • the smart device to be controlled may be selected from the first device and the second device based on the sign of the target delay. If the target delay is positive, the operation execution signal reaches the first device later than the second device, that is, it reaches the second device earlier, and the second device is closer to the target object. Therefore, the second device can be determined as the target device.
  • if the target delay is negative, the operation execution signal reaches the first device earlier than the second device, and the first device is closer to the target object. Therefore, the first device can be determined as the target device.
  • the smart device to be controlled is selected based on the sign of the arrival delay between the two devices, which can improve the rationality of device control.
  • the first device is a first home appliance, the second device is a second home appliance, and the operation execution signal is a wake-up signal.
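The sign rule can be expressed in a few lines. The function name and device labels are hypothetical, and treating a delay of exactly zero like a negative delay is an assumption, since the text does not cover that case.

```python
def select_target_device(target_delay, first_device, second_device):
    """Pick the device that the sound reached first, based on the delay sign."""
    # Positive delay: the signal reached the first device later, so the
    # second device is closer. Zero is treated like negative (an assumption).
    return second_device if target_delay > 0 else first_device


chosen = select_target_device(0.004, "first appliance", "second appliance")
```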
  • This optional example provides a solution for determining nearby wake-up of multiple devices.
  • the signals of the devices are used to estimate the TDOA (Time Difference of Arrival), that is, the discrimination is made by solving the cross-correlation function between the signals of the two devices, and the reverberation factor is taken into account. Accounting for reverberation in the cross-correlation function suppresses the reverberation and improves the robustness of the TDOA estimation method to reverberation.
  • the method of distinguishing two devices can be extended to multiple devices. The two devices are first judged, and the better device is selected, and then the next device is judged, and so on.
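The extension to more than two devices amounts to a pairwise elimination, sketched below. The function name, the device labels, and the arrival times are made up for illustration; `estimate_delay(a, b)` stands for whatever pairwise TDOA estimator is used and is assumed to return a positive value when the signal reaches `a` later than `b`.

```python
def select_nearest_device(devices, estimate_delay):
    """Pairwise elimination: keep the closer device of each comparison."""
    if not devices:
        raise ValueError("at least one device is required")
    winner = devices[0]
    for challenger in devices[1:]:
        # A positive delay means the signal reached the current winner later.
        if estimate_delay(winner, challenger) > 0:
            winner = challenger
    return winner


# Hypothetical arrival times in seconds, used to fake the pairwise estimator.
arrival = {"fridge": 0.010, "tv": 0.004, "air conditioner": 0.007}
nearest = select_nearest_device(list(arrival), lambda a, b: arrival[a] - arrival[b])
```

For N devices this needs N - 1 pairwise delay estimates.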
  • the process of the control method of the smart device in this optional example may include the following steps:
  • Step S302 Select two channels from the microphone arrays of the first home appliance and the second home appliance, obtain two channels of sound signals, and calculate corresponding reverberation gain values.
  • the two sound signals obtained correspond to the same wake-up signal, as collected by the microphone of each channel.
  • Step S304 Select one channel from the microphone array of each of the first home appliance and the second home appliance to obtain two sound signals. After data calibration, calculate their cross power spectrum.
  • the calculated cross power spectrum is the cross power spectrum taking into account the reverberation gain value.
  • Step S306 Calculate the GCC-PHAT function of the first home appliance and the second home appliance based on the cross power spectrum, and determine the optimal delay through interpolation calculation.
  • finding the optimal TDOA is a nonlinear single-variable optimization, which is a statistical optimization problem.
  • the method is not limited to the interpolation algorithm; particle swarm optimization, maximum likelihood estimation, Markov chain Monte Carlo methods, etc. can also be used.
  • Step S308 By judging the sign of the optimal delay, determine which device the sound source is closer to, and wake up the device closest to the sound source.
  • the sound source here is the sound source of the wake-up signal.
  • the cross-correlation function between devices is used to find the optimal time delay (TDOA), from which the device closer to the sound source is obtained. Compared with the energy-based discrimination method, this has better noise immunity. At the same time, the reverberation factor is taken into account when calculating the cross-correlation function, which gives good robustness in highly reverberant environments. In addition, the simplified interpolation calculation method is used to find the globally optimal TDOA, which greatly reduces the amount of calculation and improves the accuracy.
  • the method according to the above embodiments can be implemented by means of software plus the necessary general hardware platform. It can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present disclosure, in essence, or the part that contributes to the existing technology, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium (such as ROM (Read-Only Memory)/RAM (Random Access Memory), magnetic disk, or optical disk) and includes a number of instructions that cause a terminal device (which can be a mobile phone, computer, server, network device, etc.) to execute the methods described in the various embodiments of the present disclosure.
  • FIG. 4 is a structural block diagram of an optional intelligent device control device according to an embodiment of the present disclosure. As shown in Figure 4, the device may include:
  • the acquisition unit 402 is configured to acquire the first sound signal received by the first device and the second sound signal received by the second device, wherein the first sound signal and the second sound signal are sound signals corresponding to the operation execution signal sent by the target object;
  • the first determination unit 404 is connected to the acquisition unit 402 and is configured to determine the target cross power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal;
  • the second determination unit 406 is connected to the first determination unit 404 and is configured to determine the target delay between the first device and the second device according to the target cross power spectrum, wherein the target delay is the time difference between when the operation execution signal reaches the first device and when it reaches the second device;
  • the execution unit 408 is connected to the second determination unit 406 and is configured to determine a target device from the first device and the second device according to the target delay, and control the target device to execute the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
  • the acquisition unit 402 in this embodiment can be configured to perform the above step S202
  • the first determination unit 404 in this embodiment can be configured to perform the above step S204
  • the second determination unit 406 in this embodiment can be configured to perform the above step S206
  • the execution unit 408 in this embodiment may be set to execute the above step S208.
  • the first sound signal received by the first device and the second sound signal received by the second device are obtained, where the first sound signal and the second sound signal are sound signals corresponding to the operation execution signal sent by the target object; the target cross power spectrum of the first sound signal and the second sound signal is determined according to the first sound signal and the second sound signal; the target delay between the first device and the second device is determined according to the target cross power spectrum, where the target delay is the time difference between the arrival of the operation execution signal at the first device and its arrival at the second device; and the target device is determined from the first device and the second device according to the target delay and is controlled to execute the device operation indicated by the operation execution signal, where the target device is the device closest to the target object. This solves the problem in the related art that device control is inaccurate because of noise and reverberation in scenarios with a low signal-to-noise ratio and high reverberation, and improves the accuracy of device control.
  • the first determining unit includes:
  • the first acquisition module is configured to acquire the initial cross power spectrum of the first sound signal and the second sound signal
  • the second acquisition module is configured to acquire the target reverberation gain value corresponding to the first device and the second device, where the target reverberation gain value is a gain value used to suppress reverberation noise;
  • the update module is set to use the target reverberation gain value to update the initial cross power spectrum to obtain the target cross power spectrum.
  • the second acquisition module includes:
  • the acquisition submodule is configured to acquire multiple sound signals received by each of the first device and the second device, where the multiple sound signals are sound signals corresponding to the operation execution signal;
  • the first determination sub-module is configured to determine the coherence functions of multiple sound signals and obtain the target coherence function
  • the estimation submodule is configured to estimate the reverberation suppression coefficient of each device based on the target coherence function to obtain the target reverberation suppression coefficient;
  • the second determination sub-module is configured to determine the maximum value of the minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient as the reverberation gain value of each device.
  • the acquisition sub-module includes:
  • the acquisition subunit is configured to acquire two sound signals received by each device, where the two sound signals are sound signals received by different microphones of the same microphone array of each device and corresponding to the operation execution signal.
  • the target reverberation gain value includes a first reverberation gain value corresponding to the first device and a second reverberation gain value corresponding to the second device;
  • the update module includes:
  • the first execution sub-module is configured to multiply the product of the first reverberation gain value and the first frequency domain signal by the product of the second reverberation gain value and the conjugate of the second frequency domain signal to obtain the reverberation reference information, wherein the first frequency domain signal is a frequency domain signal corresponding to the first sound signal, and the second frequency domain signal is a frequency domain signal corresponding to the second sound signal;
  • the second execution submodule is configured to perform a weighted sum operation on the initial cross power spectrum and the reverberation reference information to obtain the target cross power spectrum.
  • the second determining unit includes:
  • the first determination module is configured to determine the generalized cross-correlation phase transformation function corresponding to the first device and the second device according to the target cross power spectrum, wherein the generalized cross-correlation phase transformation function is a function with the delay between the first device and the second device as a variable;
  • the search module is configured to search for the delay that maximizes the function value of the generalized cross-correlation phase transformation function within the first delay range to obtain the target delay, wherein the first delay range is the interval whose endpoints are the negative of the delay threshold and the delay threshold, and the delay threshold is the value obtained by dividing the distance between the first device and the second device by the speed of sound.
  • the search module includes:
  • the third determination submodule is configured to determine the first reference function value corresponding to the generalized cross-correlation phase transformation function and the random delay, and the second reference function value corresponding to the generalized cross-correlation phase transformation function and the negative of the random delay;
  • the fourth determination sub-module is configured to determine the delay corresponding to the maximum function value among the first reference function value and the second reference function value as the initial delay;
  • the third execution sub-module is configured to perform an interpolation operation within the first delay range based on the initial delay and the negative of the initial delay to obtain the target delay, where the target delay is the delay that maximizes the function value of the generalized cross-correlation phase transformation function among the initial delay and the delays inserted by the interpolation operation.
  • the third execution sub-module includes:
  • the first execution subunit is configured to perform the following interpolation steps cyclically until the loop stop condition is met, where the loop stop condition includes at least one of the following: the number of interpolation steps executed reaches the preset number, and the initial delay is within the preset delay range; the initial delay after the end of the loop is the target delay:
  • the first delay is determined, where the first delay is a delay inserted between the initial delay and the delay threshold;
  • when the first function value corresponding to the generalized cross-correlation phase transformation function and the first delay is greater than the first reference function value, the second delay is determined, where the second delay is a delay inserted between the initial delay and the first delay;
  • when the second function value is smaller than the first function value, the first delay is used to update the initial delay to obtain the updated initial delay;
  • when the first function value is smaller than the first reference function value and larger than the second reference function value, a third delay is determined, where the third delay is a delay inserted between the first delay and the delay threshold;
  • when the first function value is less than the second reference function value, the fifth delay is determined, where the fifth delay is a delay inserted between the negative of the delay threshold and the negative of the initial delay;
  • when the fourth function value is less than the first reference function value, the sixth delay is determined and used to update the initial delay to obtain the updated initial delay, where the sixth delay is a delay inserted between zero and the negative of the initial delay.
  • the third execution sub-module includes:
  • the second execution subunit is configured to execute the following interpolation steps cyclically until the loop stop condition is met, where the loop stop condition includes at least one of the following: the number of interpolation steps executed reaches the preset number, and the initial delay is within the preset delay range; the initial delay after the end of the loop is the target delay:
  • the first delay is determined, where the first delay is the initial delay plus 0.6 times the difference between the initial delay and the negative of the initial delay;
  • when the first function value is greater than the first reference function value, the second delay is determined, where the second delay is the initial delay plus 0.8 times the difference between the first delay and the initial delay;
  • when the second function value is smaller than the first function value, the first delay is used to update the initial delay to obtain the updated initial delay;
  • when the first function value is smaller than the first reference function value and larger than the second reference function value, the third delay is determined, where the third delay is the initial delay plus 1.2 times the difference between the first delay and the initial delay;
  • when the third function value is smaller than the first function value, the fourth delay is determined and used to update the initial delay to obtain the updated initial delay, where the fourth delay is the negative of the initial delay plus 0.125 times the difference between the initial delay and the negative of the initial delay;
  • when the first function value is less than the second reference function value, the fifth delay is determined, where the fifth delay is the initial delay plus 1.2 times the difference between the negative of the initial delay and the initial delay;
  • when the fourth function value is less than the first reference function value, the sixth delay is determined and used to update the initial delay to obtain the updated initial delay, where the sixth delay is the negative of the initial delay plus 0.125 times the difference between the initial delay and the negative of the initial delay.
  • the execution unit includes:
  • the second determination module is configured to determine the second device as the target device when the target delay is positive;
  • the third determination module is configured to determine the first device as the target device when the target delay is negative.
  • the above modules, as part of the device, can run in the hardware environment shown in Figure 1 and can be implemented by software or hardware, where the hardware environment includes a network environment.
  • a storage medium is also provided.
  • the above-mentioned storage medium can be used to store program code for executing any of the above-mentioned control methods of the smart device in the embodiments of the present disclosure.
  • the above storage medium may be located on at least one network device among multiple network devices in the network shown in the above embodiment.
  • the storage medium is configured to store program codes for performing the following steps:
  • S3. Determine the target delay between the first device and the second device according to the target cross power spectrum, where the target delay is the time difference between the operation execution signal arriving at the first device and the second device;
  • S4. According to the target delay, determine the target device from the first device and the second device, and control the target device to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object.
  • the above-mentioned storage medium may include but is not limited to: a USB flash drive, ROM, RAM, a removable hard disk, a magnetic disk, an optical disk, and other media that can store program code.
  • an electronic device for implementing the above control method of an intelligent device.
  • the electronic device may be a server, a terminal, or a combination thereof.
  • Figure 5 is a structural block diagram of an optional electronic device according to an embodiment of the present disclosure. As shown in Figure 5, it includes a processor 502, a communication interface 504, a memory 506 and a communication bus 508. The processor 502, the communication interface 504 and memory 506 complete communication with each other through communication bus 508, where,
  • Memory 506 configured to store computer programs
  • the processor 502 is configured to implement the following steps when executing the computer program stored in the memory 506:
  • S3. Determine the target delay between the first device and the second device according to the target cross power spectrum, where the target delay is the time difference between the operation execution signal arriving at the first device and the second device;
  • S4. According to the target delay, determine the target device from the first device and the second device, and control the target device to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object.
  • the communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, etc.
  • the communication bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in Figure 5, but it does not mean that there is only one bus or one type of bus.
  • the communication interface is used for communication between the above-mentioned electronic device and other equipment.
  • the memory may include RAM or non-volatile memory, such as at least one disk memory.
  • the memory may also be at least one storage device located remotely from the aforementioned processor.
  • the memory 506 may include, but is not limited to, the acquisition unit 402, the first determination unit 404, the second determination unit 406 and the execution unit 408 in the control device of the smart device. In addition, it may also include but is not limited to other modular units in the control device of the above-mentioned smart device, which will not be described again in this example.
  • the above-mentioned processor can be a general-purpose processor, which can include but is not limited to a CPU (Central Processing Unit), an NP (Network Processor), etc.; it can also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the device that implements the above control method for smart devices can be a terminal device, and the terminal device can be a smartphone (such as an Android phone or an iOS phone), a tablet computer, a handheld computer, a mobile Internet device (MID), a PAD, or other terminal equipment.
  • FIG. 5 does not limit the structure of the above-mentioned electronic device.
  • the electronic device may also include more or fewer components (such as network interfaces, display devices, etc.) than shown in FIG. 5 , or have a different configuration than that shown in FIG. 5 .
  • the program can be stored in a computer-readable storage medium, and the storage medium can include: a flash disk, ROM, RAM, a magnetic disk, an optical disk, etc.
  • if the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they can be stored in the above computer-readable storage medium.
  • the technical solution of the present disclosure, in essence, or the part of it that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause one or more computer devices (which can be personal computers, servers, network devices, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the disclosed client can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division.
  • multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution provided in this embodiment.
  • each functional unit in various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Environmental & Geological Engineering (AREA)
  • Quality & Reliability (AREA)
  • Automation & Control Theory (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)
  • Selective Calling Equipment (AREA)

Abstract

A control method for a smart device, and a storage medium and an electronic apparatus, which relate to the field of home automation/smart homes. The method comprises: acquiring a first sound signal received by a first device and a second sound signal received by a second device, wherein the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal which is sent by a target object (S202); determining a target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal (S204); according to the target cross-power spectrum, determining a target delay between the first device and the second device, wherein the target delay is the difference between the time of the operation execution signal reaching the first device and the time of the operation execution signal reaching the second device (S206); and according to the target delay, determining a target device from the first device and the second device, and controlling the target device to execute a device operation indicated by the operation execution signal, wherein the target device is the device which is closest to the target object (S208).

Description

Control method for smart device, and storage medium and electronic apparatus
This disclosure claims priority to the Chinese patent application filed with the China Patent Office on April 29, 2022, with application number 202210469005.3 and the invention title "Control method for smart device, and storage medium and electronic apparatus", the entire content of which is incorporated into this disclosure by reference.
Technical field
The present disclosure relates to the field of home automation/smart homes, and specifically to a control method for a smart device, a storage medium, and an electronic device.
Background
Currently, smart devices can be controlled by voice to perform device operations. For example, a voice wake-up signal wakes up the voice interaction function of a smart device, and a voice control instruction triggers that function. In many cases, the same scene contains multiple devices equipped with voice interaction functions, while the user usually needs only one device to respond to a voice control instruction at a time; for example, only the device near the user needs to execute the instruction.
Take voice wake-up in a smart home environment as an example. Multiple devices equipped with voice interaction functions, such as speakers, TVs, and washing machines, are present in the same scene, and the user needs to wake up only one of them at a time. For this, a nearest wake-up decision algorithm can be used, where nearest wake-up means that the device closest to the person speaking responds first.
Most methods for determining the device closest to the user rely mainly on energy information, and normalized energy can be used as the criterion. Such algorithms generally consider only the relationships within the microphone array on each device. After self-noise is removed from each device's signal (for example, speakers and TVs perform echo cancellation, and washing machines remove their own machine noise), the signal energy is normalized and combined with an evaluation score to make the decision.
This control method works well in scenarios with a high signal-to-noise ratio. In scenarios with a low signal-to-noise ratio and high reverberation, however, the energy received by a microphone includes not only energy from the sound source but also energy from noise and reverberation, such as earlier source signals reflected from the surfaces of objects in the environment, as well as other noise, so the method performs poorly.
It can be seen that the control methods for smart devices in the related art suffer from poor device control accuracy in scenarios with a low signal-to-noise ratio and high reverberation, due to the presence of noise and reverberation.
Summary
Embodiments of the present disclosure provide a control method for a smart device, a storage medium, and an electronic device, to at least solve the problem in the related art that the control of smart devices is inaccurate in scenarios with a low signal-to-noise ratio and high reverberation, due to the presence of noise and reverberation.
According to one aspect of the embodiments of the present disclosure, a control method for a smart device is provided, including: acquiring a first sound signal received by a first device and a second sound signal received by a second device, where the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object; determining a target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal; determining a target delay between the first device and the second device according to the target cross-power spectrum, where the target delay is the difference between the time at which the operation execution signal arrives at the first device and the time at which it arrives at the second device; and determining a target device from the first device and the second device according to the target delay, and controlling the target device to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object.
According to another aspect of the embodiments of the present disclosure, a control device for a smart device is also provided, including: an acquisition unit, configured to acquire a first sound signal received by a first device and a second sound signal received by a second device, where the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object; a first determination unit, configured to determine a target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal; a second determination unit, configured to determine a target delay between the first device and the second device according to the target cross-power spectrum, where the target delay is the difference between the time at which the operation execution signal arrives at the first device and the time at which it arrives at the second device; and an execution unit, configured to determine a target device from the first device and the second device according to the target delay, and to control the target device to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object.
According to yet another aspect of the embodiments of the present disclosure, a computer-readable storage medium is also provided. The computer-readable storage medium stores a computer program, where the computer program is configured to execute the above control method for a smart device when run.
According to yet another aspect of the embodiments of the present disclosure, an electronic device is also provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the above control method for a smart device through the computer program.
In the embodiments of the present disclosure, the smart device to be controlled is determined based on the cross-correlation function between the signals received by two devices. A first sound signal received by a first device and a second sound signal received by a second device are acquired, where both are sound signals corresponding to an operation execution signal issued by a target object. A target cross-power spectrum of the first sound signal and the second sound signal is determined from the two signals. A target delay between the first device and the second device is determined from the target cross-power spectrum, where the target delay is the difference between the time at which the operation execution signal arrives at the first device and the time at which it arrives at the second device. A target device is then determined from the first device and the second device according to the target delay and is controlled to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object. Because the difference between the times at which the two devices receive the sound signal is determined from the cross-power spectrum of the received sound signals, the device closest to the user can be selected as the smart device to be controlled based on that time difference. Determining the closest device from the arrival time difference of the operation execution signal reduces erroneous device control caused by the effect of noise, reverberation, and the like on signal energy, and improves the accuracy of device control. This solves the problem in the related art that the control of smart devices is inaccurate in scenarios with a low signal-to-noise ratio and high reverberation, due to the presence of noise and reverberation.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
To explain the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Figure 1 is a schematic diagram of the hardware environment of an optional control method for a smart device according to an embodiment of the present disclosure;
Figure 2 is a schematic flowchart of an optional control method for a smart device according to an embodiment of the present disclosure;
Figure 3 is a schematic flowchart of another optional control method for a smart device according to an embodiment of the present disclosure;
Figure 4 is a structural block diagram of an optional control device for a smart device according to an embodiment of the present disclosure;
Figure 5 is a structural block diagram of an optional electronic device according to an embodiment of the present disclosure.
Detailed description
To enable those skilled in the art to better understand the solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present disclosure without creative effort shall fall within the scope of protection of the present disclosure.
It should be noted that the terms "first", "second", etc. in the description and claims of the present disclosure and in the above drawings are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable under appropriate circumstances, so that the embodiments of the present disclosure described here can be implemented in orders other than those illustrated or described here. In addition, the terms "including" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
According to one aspect of the embodiments of the present disclosure, a control method for a smart device is provided. The method is widely applicable to whole-house intelligent digital control scenarios such as the Smart Home, smart household device ecosystems, and the Intelligence House ecosystem. Optionally, in this embodiment, the control method can be applied to a hardware environment composed of a terminal device 102 and a server 104, as shown in Figure 1. As shown in Figure 1, the server 104 is connected to the terminal device 102 through a network and can provide services (such as application services) for the terminal or for a client installed on the terminal. A database can be set up on the server or independently of the server to provide data storage services for the server 104, and cloud computing and/or edge computing services can be configured on the server or independently of the server to provide data computing services for the server 104.
The above network may include but is not limited to at least one of the following: a wired network and a wireless network. The wired network may include but is not limited to at least one of the following: a wide area network, a metropolitan area network, and a local area network. The wireless network may include but is not limited to at least one of the following: WIFI (Wireless Fidelity) and Bluetooth. The terminal device 102 may be, but is not limited to, a PC, a mobile phone, a tablet computer, a smart air conditioner, a smart range hood, a smart refrigerator, a smart oven, a smart stove, a smart washing machine, a smart water heater, smart washing equipment, a smart dishwasher, a smart projection device, a smart TV, a smart clothes drying rack, smart curtains, smart audio-visual equipment, a smart socket, a smart audio system, a smart speaker, smart fresh air equipment, smart kitchen and bathroom equipment, smart bathroom equipment, a smart sweeping robot, a smart window cleaning robot, a smart mopping robot, smart air purification equipment, a smart steamer, a smart microwave oven, a smart kitchen water heater, a smart purifier, a smart water dispenser, a smart door lock, etc.
The control method for a smart device in the embodiments of the present disclosure may be executed by the server 104, by the terminal device 102, or jointly by the server 104 and the terminal device 102. When the terminal device 102 executes the control method, it may do so through a client installed on it.
Taking execution by the server 104 as an example, Figure 2 is a schematic flowchart of an optional control method for a smart device according to an embodiment of the present disclosure. As shown in Figure 2, the method may include the following steps:
Step S202: acquire a first sound signal received by a first device and a second sound signal received by a second device, where the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object.
The control method in this embodiment can be applied to scenes that contain multiple smart devices that can be controlled by the same operation execution signal to perform the corresponding device operation. The operation execution signal can be a device wake-up signal or a signal that controls a device to perform another device operation. Taking the device wake-up signal as an example, it can contain the wake-up word of the smart device, and the smart device can wake up its voice interaction function in response to the received wake-up signal. The smart device may be a smart home device, including but not limited to a smart home appliance such as the above smart air conditioner, smart refrigerator, or smart oven. This embodiment does not limit the type of smart device.
In general, the placement of the devices relative to one another is unknown. The positions of some devices, such as speakers, are not even fixed, whereas the positions of refrigerators, air conditioners, and the like generally do not change frequently. The difficulties in this case are: 1) after each device estimates the angle of the sound source, the specific location of the sound source cannot be determined because position information between the devices is missing; 2) even if multiple devices are processed like a distributed array, position information between the devices is still needed for sound source localization.
When the target object (corresponding to the user) needs to use a smart device, it can send an operation execution signal, which instructs the execution of the corresponding device operation and can be a voice control signal that both the first device and the second device are able to respond to. For example, the operation execution signal may be a target wake-up signal whose wake-up word can wake up the first device and the second device at the same time. The first device and the second device can each collect sound through their own sound collection components to obtain the first sound signal and the second sound signal. Here, a sound collection component can be a microphone or a microphone array. Correspondingly, the first sound signal is the sound signal collected by a first microphone in a first microphone array on the first device, and the second sound signal is the sound signal collected by a second microphone in a second microphone array on the second device.
The first device and the second device can each send the sound signal they collected to the server. The server receives the sound signals sent by the first device and the second device, thereby acquiring the first sound signal and the second sound signal.
Optionally, what the first device and the second device send to the server may be the sound signals collected by their microphone arrays. From the sound signals collected by the first microphone array, the server can select the signal collected by the first microphone to obtain the first sound signal (which may be the signal of a first channel); from the sound signals collected by the second microphone array, it can select the signal collected by the second microphone to obtain the second sound signal (which may be the signal of a second channel). The first microphone and the second microphone may be selected randomly or according to the order of the microphones, which is not limited in this embodiment.
Optionally, the correspondence between the sound signals collected by the first device and by the second device can be obtained by matching on the collection time of the sound signals, their signal characteristics, and so on. That is, matching is performed on the collection times and signal characteristics of the sound signals of the first device and the second device to determine which sound signal of the second device matches a sound signal collected by the first device.
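Matching by collection time can be illustrated with the minimal sketch below, which pairs each recording from the first device with the second-device recording whose capture start time is nearest, provided the gap stays within a tolerance. The tuple layout, the recording identifiers, and the 0.5 s tolerance are hypothetical choices made for illustration; the disclosure does not specify them, and matching on signal characteristics is not shown.

```python
def match_recordings(first, second, max_gap_s=0.5):
    """Pair recordings by nearest capture start time.

    first, second: lists of (capture_start_time_s, recording_id) tuples
    (a hypothetical layout). Returns a list of (first_id, second_id) pairs.
    """
    pairs = []
    for t1, id1 in first:
        # Keep only second-device recordings that started close enough in time.
        candidates = [
            (abs(t1 - t2), id2) for t2, id2 in second if abs(t1 - t2) <= max_gap_s
        ]
        if candidates:
            # The smallest time gap wins.
            pairs.append((id1, min(candidates)[1]))
    return pairs


# Hypothetical capture logs from the two devices.
first_log = [(10.00, "a1"), (25.00, "a2")]
second_log = [(10.12, "b1"), (40.00, "b2")]
matched = match_recordings(first_log, second_log)
```

Here only "a1" and "b1" are paired, because no second-device recording started within the tolerance of "a2".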
Step S204: determine a target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal.
At any given time, the user usually wants to control only one smart device. If multiple devices collect the sound signal at the same time, the server can select the smart device to be controlled from among them. One way to do so is to select the nearest device by energy. However, energy-based selection is accurate only if the selected audio signal (sound signal) closely approximates the sound source signal. Under low signal-to-noise ratio and high reverberation conditions, selecting the nearest device among multiple devices in this way is inaccurate.
To solve at least part of the above technical problem, the difference between the times at which two devices receive the sound signal can be determined from the cross-power spectrum of the sound signals they received, so that the device closest to the user can be selected as the smart device to be controlled based on that time difference. Determining the closest device from the arrival time difference of the operation execution signal reduces erroneous device control caused by the effect of noise, reverberation, and the like on signal energy, and thereby improves the accuracy of device control.
For the operation execution signal, after acquiring the first sound signal and the second sound signal, the server can first compute their cross-power spectrum to obtain the target cross-power spectrum. The cross-power spectrum can be computed in one or more ways. For example, the first sound signal and the second sound signal can first be converted to the frequency domain, and the cross-power spectrum can be determined from the converted frequency-domain signals. The cross-power spectrum of the two sound signals can also be determined in other ways, which is not limited in this embodiment.
Step S206: determine a target delay between the first device and the second device according to the target cross power spectrum, where the target delay is the difference between the times at which the operation execution signal arrives at the first device and at the second device.
According to the target cross power spectrum, the server can determine the target delay between the first device and the second device. The target delay here is the time difference between the operation execution signal arriving at the first device and arriving at the second device; it may be the difference between the time at which the operation execution signal arrives at the first device and the time at which it arrives at the second device. There may be one or more ways of determining the target delay from the target cross power spectrum. For example, it may be determined based on a generalized cross-correlation function, or based on another function related to the cross power spectrum; this embodiment does not specifically limit the way in which the target delay is determined from the target cross power spectrum.
Step S208: determine a target device from the first device and the second device according to the target delay, and control the target device to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object.
According to the target delay, the server can determine which of the first device and the second device is closest to the target object and take that device as the target device; the target device may be the smart device to be controlled. For example, in a scenario where the operation execution signal is a target wake-up signal, the target device to be woken up can be determined from the first device and the second device.
After the target device is determined, the server can control the target device to perform the device operation indicated by the operation execution signal. Here, the operation execution signal may carry an operation execution instruction, and the target device may perform the device operation indicated by that instruction. For example, if the operation execution instruction is an instruction to raise the temperature, then of two devices that are allowed to execute it, such as a water heater and a bathroom heater, the one closest to the user can be selected to raise the temperature.
Optionally, in a scenario where the operation execution signal is a target wake-up signal, if both the first device and the second device have been woken up, the device other than the target device can be controlled to go from the awake state to the sleep state. The target device can be kept in the awake state, so that it can collect, through its sound collection component, voice control signals subsequently issued by the target object and, in response to a collected voice control signal, perform the control operation that the signal indicates; this is not limited in this embodiment.
It should be noted that although this embodiment uses two devices to illustrate selecting the nearest device for control, it is not limited to that case. When a scene contains more than two controllable devices, two of them can first be selected as the first device and the second device, and the device closer to the target object (corresponding to the target device above) determined in the manner described. The device so determined is then used as the first device, and one of the remaining devices is selected as the second device to repeat the operation of determining the device closer to the target object, until no devices remain. The device finally determined to be closest to the target object is the target device.
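The multi-device procedure described above amounts to a running pairwise comparison. The following is a minimal sketch of that loop only; `select_nearest_device` and the callback `closer_of` are hypothetical names introduced here, and `closer_of` stands in for the whole pairwise procedure of steps S204 to S208 rather than implementing it. The toy distances are made up purely to exercise the loop.

```python
from typing import Callable, Sequence


def select_nearest_device(
    devices: Sequence[str],
    closer_of: Callable[[str, str], str],
) -> str:
    """Reduce a list of candidate devices to the one nearest the user.

    `closer_of(a, b)` is a hypothetical callback: it returns whichever of
    the two devices the pairwise delay comparison judges to be closer.
    """
    if not devices:
        raise ValueError("at least one device is required")
    current = devices[0]              # plays the role of the "first device"
    for candidate in devices[1:]:     # each remaining device is the "second device"
        current = closer_of(current, candidate)
    return current


# Toy usage: the distances below are invented for illustration only.
toy_distance_m = {"water_heater": 3.2, "bath_heater": 1.1, "speaker": 2.4}
nearest = select_nearest_device(
    list(toy_distance_m),
    lambda a, b: a if toy_distance_m[a] <= toy_distance_m[b] else b,
)
```

With N devices this performs N - 1 pairwise comparisons, which matches the description of repeatedly carrying the current winner forward as the first device.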
Through the above steps, a first sound signal received by the first device and a second sound signal received by the second device are acquired, where the first sound signal and the second sound signal are sound signals corresponding to the operation execution signal issued by the target object; a target cross power spectrum of the first sound signal and the second sound signal is determined according to the two signals; a target delay between the first device and the second device is determined according to the target cross power spectrum, where the target delay is the difference between the times at which the operation execution signal arrives at the first device and at the second device; and a target device is determined from the first device and the second device according to the target delay and is controlled to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object. This solves the problem in the related art that smart-device control is inaccurate in scenarios with a low signal-to-noise ratio and high reverberation because of the noise and reverberation present, and improves the accuracy of device control.
In an exemplary embodiment, determining the target cross power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal includes:
S11: obtain an initial cross power spectrum of the first sound signal and the second sound signal;
S12: obtain a target reverberation gain value corresponding to the first device and the second device, where the target reverberation gain value is a gain value used to suppress reverberation noise;
S13: update the initial cross power spectrum using the target reverberation gain value to obtain the target cross power spectrum.
In this embodiment, when determining the cross power spectrum of the first sound signal and the second sound signal, the server may obtain their initial cross power spectrum. This may be done by obtaining the spectra of the first sound signal and the second sound signal to get a first frequency-domain signal and a second frequency-domain signal, and then computing the cross power spectrum of the first and second frequency-domain signals to obtain the initial cross power spectrum.
For example, let the two devices be device A and device B. One channel is selected from the voice signals collected by the microphone array of device A and one from those collected by the microphone array of device B. The two resulting signals are $x_1$ and $x_2$, belonging to device A and device B respectively, as shown in formula (1):
$x_i(n) = s_i(n) + d_i(n)$         (1)
where $x$ is the signal received by the microphone, $i$ is the microphone index, taking the values 1 and 2, $n$ is the time-domain sample index, $s$ is the sound source signal, and $d$ is the noise.
A short-time Fourier transform (STFT, also called a short-term Fourier transform) is applied to each of the two signals. The frequency-domain signals after the transform are given by formula (2):
$X_i(l,f) = S_i(l,f) + D_i(l,f)$      (2)
where $X$, $S$ and $D$ are the frequency-domain signals obtained by transforming $x$, $s$ and $d$ respectively, $i$ is the microphone index, taking the values 1 and 2, $l$ is the frame index, and $f$ is the frequency band.
After the frequency-domain signals corresponding to the two signals are determined, the cross power spectrum of the two signals,
Figure PCTCN2022095335-appb-000001
can be determined from the two frequency-domain signals. For how to determine the cross power spectrum of two signals, refer to the related art; it is not described in detail in this embodiment. In addition, the cross power spectrum can be smoothed. The smoothed cross power spectrum can be as shown in formula (3):
Figure PCTCN2022095335-appb-000002
where
Figure PCTCN2022095335-appb-000003
is the cross power spectrum of the two signals, σ is the smoothing factor, which may take the value 0.9, and * denotes the complex conjugate.
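Formula (3) survives here only as an image placeholder, so its exact form cannot be confirmed from the text. The sketch below assumes the common first-order recursive form, in which the previous frame's spectrum is weighted by σ and the current frame's product of $X_1$ and the conjugate of $X_2$ by $1-\sigma$; the function name and the choice to initialise with the first frame are likewise assumptions made for illustration.

```python
import numpy as np


def smoothed_cross_power_spectrum(X1, X2, sigma=0.9):
    """Recursively smooth the cross power spectrum over frames.

    X1, X2: complex STFT arrays of shape (num_frames, num_bins).
    Assumed recursion (not confirmed by the source text):
        Phi[l] = sigma * Phi[l-1] + (1 - sigma) * X1[l] * conj(X2[l])
    """
    X1 = np.asarray(X1, dtype=complex)
    X2 = np.asarray(X2, dtype=complex)
    if X1.shape != X2.shape:
        raise ValueError("X1 and X2 must have the same shape")
    inst = X1 * np.conj(X2)           # instantaneous cross power spectrum per frame
    phi = np.empty_like(inst)
    phi[0] = inst[0]                  # assumption: initialise with the first frame
    for l in range(1, inst.shape[0]):
        phi[l] = sigma * phi[l - 1] + (1.0 - sigma) * inst[l]
    return phi
```

A value of σ close to 1, such as the 0.9 mentioned in the text, makes the estimate change slowly from frame to frame, trading responsiveness for lower variance.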
The server may also obtain a target reverberation gain value corresponding to the first device and the second device; here, the target reverberation gain value is a gain value used to suppress reverberation noise. The target reverberation gain value may be determined from the sound signal received by the first device that matches the operation execution signal and the sound signal received by the second device that matches the operation execution signal, for example from the first sound signal and the second sound signal, or it may be determined from signals other than the first sound signal and the second sound signal. Any method capable of determining a reverberation gain value between sound signals may be used, which is not limited in this embodiment.
After the target reverberation gain value is obtained, it can be used to update the initial cross power spectrum to obtain the target cross power spectrum. The update may consist of smoothing the initial cross power spectrum using the target reverberation gain value, or it may be done in another way; this embodiment does not limit the way in which the initial cross power spectrum is updated.
It should be noted that, besides the reverberation gain, there are many ways to improve the signal-to-noise ratio of the signal cross power spectrum. For example, a gain value may suppress other noise, such as stationary noise or non-stationary noise; this is not described in detail in this embodiment.
In this embodiment, the cross power spectrum is updated using the reverberation gain value. With reverberation taken into account, the cross power spectrum better characterizes the relationship between the sound source signals, which improves the accuracy of device control.
In an exemplary embodiment, obtaining the target reverberation gain value corresponding to the first device and the second device includes:
S21: obtain multiple sound signals received by each of the first device and the second device, where the multiple sound signals are all sound signals corresponding to the operation execution signal;
S22: determine the coherence function of the multiple sound signals to obtain a target coherence function;
S23: estimate the reverberation suppression coefficient of each device according to the target coherence function to obtain a target reverberation suppression coefficient;
S24: determine the larger of the minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient as the reverberation gain value of each device.
In this embodiment, for either of the first device and the second device, the multiple sound signals received by that device can be obtained. The multiple sound signals are all sound signals corresponding to the operation execution signal; they may be the sound signals of different channels of the same microphone array of the same device, that is, different sound signals collected by different microphones. The multiple sound signals received by each device may be obtained by selecting multiple microphones according to the microphone indices in the device's microphone array and taking the sound signals received by those microphones as the multiple sound signals; other ways of obtaining the multiple sound signals corresponding to each device may also be used, which is not limited in this embodiment.
For the multiple sound signals, the server can determine their coherence function to obtain the target coherence function. This may be done by determining the coherence function of any two of the multiple sound signals to obtain multiple coherence functions; the target coherence function may include the multiple coherence functions, or one coherence function selected from them. To determine the coherence function of any two sound signals, the cross power spectrum of the two sound signals can be determined first, and the coherence function then determined from that cross power spectrum. The cross power spectrum of the two sound signals is obtained in a manner similar to that in the foregoing embodiment, which is not repeated here.
Optionally, for any two sound signals, for example a third sound signal and a fourth sound signal, the coherence function of the third sound signal and the fourth sound signal can be determined from their cross power spectrum as shown in formula (4):
Figure PCTCN2022095335-appb-000004
where
Figure PCTCN2022095335-appb-000005
is the coherence function of the two sound signals,
Figure PCTCN2022095335-appb-000006
is the cross power spectrum of the two sound signals $x_3$ and $x_4$,
Figure PCTCN2022095335-appb-000007
is the auto power spectrum of the sound signal $x_3$, and
Figure PCTCN2022095335-appb-000008
is the auto power spectrum of the sound signal $x_4$; $x_3$ may be the third sound signal and $x_4$ may be the fourth sound signal.
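Formula (4) is available only as an image placeholder. Given the quantities listed (a cross power spectrum and two auto power spectra), a plausible reading is the usual complex coherence, the cross power spectrum divided by the square root of the product of the two auto power spectra. The sketch below assumes that form; the function name and the `eps` guard against division by zero are additions made here, not part of the source.

```python
import numpy as np


def coherence(phi_34, phi_33, phi_44, eps=1e-12):
    """Complex coherence of two channels from their power spectra.

    phi_34: cross power spectrum of x3 and x4 (complex).
    phi_33, phi_44: auto power spectra of x3 and x4 (real, non-negative).
    Assumed form (not confirmed by the source text):
        Gamma = phi_34 / sqrt(phi_33 * phi_44)
    eps is a hypothetical guard against division by zero.
    """
    denom = np.sqrt(np.real(phi_33) * np.real(phi_44))
    return np.asarray(phi_34) / np.maximum(denom, eps)
```

Under this form, two channels that differ only by a phase shift have coherence of magnitude 1, while reverberant or noisy conditions push the magnitude below 1.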
Because the sound source angle (that is, the angle between the direction of the target wake-up signal and the microphone) is unknown, after the target coherence function is determined, the reverberation suppression coefficient of each device can be estimated from the target coherence function. The reverberation suppression coefficient can be estimated as shown in formula (5):
Figure PCTCN2022095335-appb-000009
where
Figure PCTCN2022095335-appb-000010
is the reverberation suppression coefficient, $\Gamma_n(f) = \operatorname{sinc}(2\pi f d'/c)$ with the noise field modeled as a diffuse (scattered) noise field, $d'$ is the distance between the microphones that receive the two sound signals (the spacing is known), and $c$ is the speed of sound.
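The diffuse-field coherence $\Gamma_n(f)$ is the one term of formula (5) that is given explicitly in the text, so it can be sketched directly. The function name and the default speed of sound of 343 m/s are assumptions made here. One practical detail: NumPy's `np.sinc` is the normalised sinc, $\sin(\pi x)/(\pi x)$, so the factor of π must be left out of its argument.

```python
import numpy as np


def diffuse_field_coherence(freqs_hz, mic_spacing_m, speed_of_sound=343.0):
    """Gamma_n(f) = sin(2*pi*f*d'/c) / (2*pi*f*d'/c), with value 1 at f = 0.

    np.sinc(x) computes sin(pi*x)/(pi*x), so the argument passed is
    2*f*d'/c rather than 2*pi*f*d'/c.
    speed_of_sound defaults to 343 m/s, an assumed room-temperature value.
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    return np.sinc(2.0 * freqs_hz * mic_spacing_m / speed_of_sound)
```

For a 5 cm spacing the first zero of this function falls at $f = c/(2d')$, about 3.4 kHz, which gives a feel for the band in which closely spaced microphones see diffuse noise as correlated.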
Based on the target reverberation suppression coefficient, the reverberation gain value of each device, that is, the target reverberation gain value, can then be determined. The reverberation gain value of each device can be determined as shown in formula (6):
Figure PCTCN2022095335-appb-000011
where $G(l,f)$ is the reverberation gain value, $G_{\min}$ is the minimum reverberation gain value, and ε is a preset value, which may be 0.9.
Optionally, the reverberation gain value may be determined from sound signals selected from different devices. In this case, the reverberation gain value of the first device and that of the second device may be the same. That is, at least one sound signal received by the first device and at least one received by the second device are obtained, giving multiple sound signals; the coherence function of the multiple sound signals is determined to obtain the target coherence function; the reverberation suppression coefficient between the first device and the second device is estimated according to the target coherence function to obtain the target reverberation suppression coefficient; and the larger of the minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient is determined as the target reverberation gain value. The third sound signal and the first sound signal may be the same sound signal or different sound signals, which is not limited in this embodiment.
In this embodiment, the reverberation suppression coefficient is estimated from the coherence function and the reverberation gain value is then determined from the reverberation suppression coefficient, which improves the accuracy with which the reverberation gain value is determined.
In an exemplary embodiment, obtaining the multiple sound signals received by each of the first device and the second device includes:
S31: obtain two sound signals received by each device, where the two sound signals are sound signals corresponding to the operation execution signal that are received by different microphones of the same microphone array of that device.
In this embodiment, to improve the timeliness of device control, the number of sound signals obtained for each device may be two. That is, the server can obtain two sound signals (in other words, two channels of sound signal) received by each device; the two sound signals are received by different microphones of the same microphone array of that device and correspond to the operation execution signal.
After obtaining the two sound signals received by each device, the server can determine the coherence function of the two sound signals in a manner similar to that in the foregoing embodiment to obtain the target coherence function, estimate the reverberation suppression coefficient of each device according to the target coherence function to obtain the target reverberation suppression coefficient, and determine the larger of the minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient as the reverberation gain value of each device. The process of determining the reverberation gain value of each device is similar to that in the foregoing embodiment and is not repeated here.
In this embodiment, the reverberation gain value of each device is determined from two sound signals received by that device, which improves the timeliness with which the device's reverberation gain is determined.
In an exemplary embodiment, the target reverberation gain value includes a first reverberation gain value corresponding to the first device and a second reverberation gain value corresponding to the second device, for example reverberation gain values determined by the calculation shown in formula (5). The first reverberation gain value and the second reverberation gain value may be the same or different, which is not limited in this embodiment.
Correspondingly, updating the initial cross power spectrum using the target reverberation gain value to obtain the target cross power spectrum includes:
S41: multiply the product of the first reverberation gain value and the first frequency-domain signal by the product of the second reverberation gain value and the conjugate of the second frequency-domain signal to obtain reverberation reference information, where the first frequency-domain signal is the frequency-domain signal corresponding to the first sound signal and the second frequency-domain signal is the frequency-domain signal corresponding to the second sound signal;
S42: perform a weighted summation of the initial cross power spectrum and the reverberation reference information to obtain the target cross power spectrum.
Based on the first reverberation gain value and the second reverberation gain value, the server can multiply the product of the first reverberation gain value and the first frequency-domain signal by the product of the second reverberation gain value and the conjugate of the second frequency-domain signal to obtain the reverberation reference information. After the reverberation reference information is obtained, a weighted summation of the initial cross power spectrum and the reverberation reference information can be performed to obtain the target cross power spectrum.
For example, for devices A and B, two channels (with known spacing) are selected from each device and the corresponding reverberation gain values are computed. One channel of A and one channel of B are then selected (that is, $x_1$ and $x_2$); after data calibration, their cross power spectrum is computed, which can be done as shown in formula (7):
Figure PCTCN2022095335-appb-000012
where
Figure PCTCN2022095335-appb-000013
is the cross power spectrum of $x_1$ and $x_2$, $G_A(l,f)$ is the reverberation gain value of device A, $G_B(l,f)$ is the reverberation gain value of device B, $X_1(l,f)$ is the frequency-domain signal of $x_1$, $X_2^*(l,f)$ is the conjugate of the frequency-domain signal of $x_2$, and θ is a preset value, which can be any value in [0,1], for example 0.9.
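Steps S41 and S42 can be sketched for a single frame as follows. The reverberation reference term follows the text directly. Which of the two terms receives the weight θ is not recoverable, because formula (7) is only an image placeholder, so the assignment below (θ on the initial spectrum, $1-\theta$ on the reference) is an assumption, as is the function name.

```python
import numpy as np


def update_cross_power_spectrum(phi_init, X1, X2, G_A, G_B, theta=0.9):
    """Blend the initial cross power spectrum with the reverberation reference.

    Reverberation reference (step S41): (G_A * X1) * (G_B * conj(X2)).
    Assumed weighting for step S42 (not confirmed by the source text):
        theta * phi_init + (1 - theta) * reference
    All array arguments are per-bin values for one frame.
    """
    reference = (G_A * X1) * (G_B * np.conj(X2))
    return theta * phi_init + (1.0 - theta) * reference
```

With both gains equal to 1 the reference reduces to the plain product of $X_1$ and the conjugate of $X_2$; bins in which either gain is small contribute less of the current frame to the updated spectrum.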
In this embodiment, the influence of reverberation on the cross-correlation function is taken into account, and the cross power spectrum is updated based on the reverberation gain values and the corresponding frequency-domain signals. This improves the ability of the cross power spectrum to characterize the relationship between the sound source signals and, in turn, the accuracy of device control.
In an exemplary embodiment, determining the target delay between the first device and the second device according to the target cross power spectrum includes:
S51: determine, according to the target cross power spectrum, a generalized cross-correlation phase transform function corresponding to the first device and the second device, where the generalized cross-correlation phase transform function is a function whose variable is the delay between the first device and the second device;
S52: search a first delay range for the delay that maximizes the value of the generalized cross-correlation phase transform function to obtain the target delay, where the first delay range is the interval whose endpoints are the negative of a delay threshold and the delay threshold, and the delay threshold is the distance between the first device and the second device divided by the speed of sound.
Optionally, the delay between the first device and the second device may be determined based on the GCC-PHAT (Generalized Cross Correlation PHAse Transformation) function corresponding to the first device and the second device. The server can determine, according to the target cross power spectrum, the GCC-PHAT function corresponding to the first device and the second device; here, the GCC-PHAT function is a function whose variable is the delay between the first device and the second device. For example, the GCC-PHAT function
Figure PCTCN2022095335-appb-000014
can be determined based on formula (8):
Figure PCTCN2022095335-appb-000015
where
Figure PCTCN2022095335-appb-000016
is the cross power spectrum between the two devices, τ is the delay between the two devices, and l, f and h have meanings similar to those in the foregoing embodiments, which are not repeated here.
Based on existing experience, if the microphone spacing d is known, the delay lies in the range τ ∈ [-d/c, d/c], where c is the speed at which sound propagates, that is, the speed of sound. For the first device and the second device, the delay range is the first delay range, whose endpoints are the negative of the delay threshold and the delay threshold (the threshold being d/c, with d the distance between the two devices).
When determining the delay between the first device and the second device, the optimal delay can be searched for within the first delay range, that is, the delay within the first delay range that maximizes
Figure PCTCN2022095335-appb-000017
(with τ as the variable). This is a global optimization problem. The delay that maximizes the value of the GCC-PHAT function can be found by interpolation (searching for the optimum by evaluating a large number of interpolated points) or by other means, giving the target delay.
In this embodiment, the delay between devices is determined through the GCC-PHAT function, which improves the accuracy of delay determination.
In an exemplary embodiment, searching the first delay range for the delay that maximizes the value of the generalized cross-correlation phase transform function to obtain the target delay includes:
S61: randomly select a delay within a second delay range to obtain a random delay, where the second delay range is the interval whose endpoints are zero and the delay threshold;
S62: determine a first reference function value of the generalized cross-correlation phase transform function corresponding to the random delay, and a second reference function value of the generalized cross-correlation phase transform function corresponding to the negative of the random delay;
S63: determine the delay corresponding to the larger of the first reference function value and the second reference function value as the initial delay;
S64: based on the initial delay and the negative of the initial delay, perform an interpolation operation within the first delay range to obtain the target delay, where the target delay is the delay, among the initial delay and the delays inserted by the interpolation operation, that maximizes the value of the generalized cross-correlation phase transform function.
Optionally, the optimal delay can be found within the first delay range by interpolation; for example, the optimal τ can be obtained by performing a large number of interpolation calculations. For example, starting from the negative of the delay threshold and using a preset time interval as the step size, delays are inserted successively within the first delay range to obtain a set of inserted delays; the value of the GCC-PHAT function is determined for each inserted delay in the set; and the delay with the largest function value is selected from the negative of the delay threshold, the set of inserted delays, and the delay threshold, to obtain the target delay.
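The dense search described above can be sketched as follows. The function name `grid_search_delay` and the callable `r` (standing in for the GCC-PHAT function of τ) are illustrative assumptions; the quadratic used in the demonstration is a stand-in, not real GCC-PHAT data.

```python
from typing import Callable, List


def grid_search_delay(r: Callable[[float], float], threshold: float, step: float) -> float:
    """Evaluate r on a uniform grid over [-threshold, threshold] and return the best delay."""
    if threshold <= 0 or step <= 0:
        raise ValueError("threshold and step must be positive")
    # Start at -d/c, insert delays every `step`, and always include +d/c.
    candidates: List[float] = [-threshold]
    n = 1
    while -threshold + n * step < threshold:
        candidates.append(-threshold + n * step)
        n += 1
    candidates.append(threshold)
    return max(candidates, key=r)


if __name__ == "__main__":
    # Stand-in objective with its peak at 3 ms.
    best = grid_search_delay(lambda t: -(t - 0.003) ** 2, threshold=0.0175, step=0.0005)
    print(f"best delay: {best * 1e3:.2f} ms")
```

The cost grows with the number of grid points, which is the motivation the text gives for the reduced interpolation scheme that follows.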
In practical interaction scenarios, response time and device computing power are limited, so a small amount of computation is usually required, and finding the optimal delay through a large number of interpolations is poorly suited. In this embodiment, a reference delay can be set, and the interpolation positions can be selected based on the reference delay and the value of the GCC-PHAT function at the reference delay, which makes the interpolation processing more convenient. The server can randomly select a delay within the second delay range to obtain a random delay. Optionally, the random delay can be a positive value; correspondingly, the second delay range can be the interval whose endpoints are zero and the delay threshold. For example, assuming the spacing d between the two microphones is at most 6 m, the delay is drawn at random from (0, d/c] to obtain τ_0 (an example of the random delay).
The server can also determine the first reference function value, which is the value of the GCC-PHAT function at the random delay, and the second reference function value, which is the value of the GCC-PHAT function at the negative of the random delay. The delay corresponding to the larger of the two reference function values is determined as the initial delay, that is, the reference delay mentioned above. Based on the initial delay and the negative of the initial delay, an interpolation operation is performed within the first delay range to obtain the target delay. Here, the target delay is the delay, among the initial delay and the delays inserted by the interpolation operation, that maximizes the value of the GCC-PHAT function.
For example, R(τ_0) and R(-τ_0) are computed and compared; the delay with the larger value of R is taken as τ_1 (an example of the initial delay), and the other is denoted τ_1′ (an example of the negative of the initial delay). Interpolation is then performed within [-d/c, d/c] based on τ_1 and τ_1′ to find the delay that maximizes the value of the GCC-PHAT function.
Through this embodiment, performing interpolation from a randomly selected delay as the initial delay makes the interpolation processing more convenient.
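Steps S61 to S63 can be sketched as below. The name `pick_initial_delay`, the callable `r` (standing in for the GCC-PHAT function), and the tie rule that prefers the positive delay when both values are equal are assumptions; the text does not say how ties are handled.

```python
import random
from typing import Callable, Optional, Tuple


def pick_initial_delay(
    r: Callable[[float], float],
    threshold: float,
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Return (tau_1, tau_1_prime): the initial delay and its negative."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so 1 - rng.random() lies in (0, 1]
    # and tau_0 is drawn from (0, d/c] as in the text.
    tau0 = threshold * (1.0 - rng.random())
    # Keep whichever of tau_0 and -tau_0 gives the larger function value.
    if r(tau0) >= r(-tau0):  # tie goes to the positive delay (assumption)
        return tau0, -tau0
    return -tau0, tau0


if __name__ == "__main__":
    tau1, tau1p = pick_initial_delay(lambda t: -(t - 0.003) ** 2, 0.0175, random.Random(0))
    print(tau1, tau1p)
```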
In an exemplary embodiment, performing the interpolation operation within the first delay range based on the initial delay and the negative of the initial delay, to obtain the target delay, includes:
S71: executing the following interpolation steps in a loop until a loop stop condition is met, where the loop stop condition includes at least one of the following: the number of executions of the interpolation steps reaches a preset number (for example, 10), or the initial delay falls within a preset delay range (which can be a delay range determined a priori, that is, a delay range set according to the range in which the target object is allowed to move); the initial delay at the end of the loop is the target delay:
Step 1: determine a first delay, where the first delay is a delay inserted between the initial delay and the delay threshold.
In the first interpolation of each round, the first delay can be inserted between the initial delay and the delay threshold (for example, within (τ_1, d/c]). For example, the inserted delay τ_2 (that is, the first delay) is τ_2 = τ_1 + α(τ_1 - τ_1′), where α is a value less than 1, for example α = 0.6; α can also take other values.
Step 2: when the first function value, which is the value of the GCC-PHAT function at the first delay, is greater than the first reference function value, determine a second delay, where the second delay is a delay inserted between the initial delay and the first delay.
After the first delay is determined, the first function value of the GCC-PHAT function at the first delay can be determined. If the first function value is greater than the first reference function value, a second delay can be inserted between the initial delay and the first delay. For example, R(τ_2) is computed; if R(τ_2) > R(τ_1), the delay τ_21 (that is, the second delay) is inserted as τ_21 = τ_1 + β(τ_2 - τ_1), where β is a value less than 1, for example β = 0.8, so that the inserted value lies within (τ_1, τ_2); β can also take other values.
Step 3: when the second function value, which is the value of the GCC-PHAT function at the second delay, is greater than the first function value, update the initial delay with the second delay to obtain the updated initial delay.
After the second delay is determined, the second function value of the GCC-PHAT function at the second delay can be determined. If the second function value is greater than the first function value, the initial delay can be updated with the second delay to obtain the updated initial delay.
Step 4: when the second function value is less than the first function value, update the initial delay with the first delay to obtain the updated initial delay.
If the second function value is less than the first function value, the initial delay can be updated with the first delay to obtain the updated initial delay. For example, if R(τ_21) > R(τ_2), then τ_opt = τ_21; otherwise τ_opt = τ_2, where τ_opt is the updated initial delay.
Step 5: when the first function value is less than the first reference function value and greater than the second reference function value, determine a third delay, where the third delay is a delay inserted between the first delay and the delay threshold.
If the first function value is less than the first reference function value and greater than the second reference function value, a third delay can be inserted between the first delay and the delay threshold. For example, if R(τ_1′) < R(τ_2) < R(τ_1), the delay τ_31 (that is, the third delay) is inserted as τ_31 = τ_1 + γ(τ_2 - τ_1), where γ is a value greater than 1, for example γ = 1.2, so that the inserted value lies within (τ_2, d/c]; γ can also take other values.
Step 6: when the third function value, which is the value of the GCC-PHAT function at the third delay, is greater than the first function value, update the initial delay with the third delay to obtain the updated initial delay.
After the third delay is determined, the third function value of the GCC-PHAT function at the third delay can be determined. If the third function value is greater than the first function value, the initial delay can be updated with the third delay to obtain the updated initial delay. For example, R(τ_31) is computed; if R(τ_31) > R(τ_2), then τ_opt = τ_31.
Step 7: when the third function value is less than the first function value, determine a fourth delay and update the initial delay with the fourth delay to obtain the updated initial delay, where the fourth delay is a delay inserted between zero and the negative of the initial delay.
If the third function value is less than the first function value, a fourth delay can be inserted between zero and the negative of the initial delay, and the initial delay is updated with the fourth delay to obtain the updated initial delay. For example, if R(τ_31) < R(τ_2), the delay τ_32 (that is, the fourth delay) is inserted as τ_32 = τ_1′ + μ(τ_1 - τ_1′), where μ is a value less than 1, for example μ = 0.125, so that the inserted value lies within (τ_1′, 0). In this case, τ_opt = τ_32.
Step 8: when the first function value is less than the second reference function value, determine a fifth delay, where the fifth delay is a delay inserted between the negative of the delay threshold and the negative of the initial delay.
If the first function value is less than the second reference function value, a fifth delay can be inserted between the negative of the delay threshold and the negative of the initial delay. For example, if R(τ_2) < R(τ_1′), the delay τ_41 (that is, the fifth delay) is inserted as τ_41 = τ_1 + γ(τ_1′ - τ_1), so that the inserted τ_41 lies within [-d/c, τ_1′).
Step 9: when the fourth function value, which is the value of the GCC-PHAT function at the fifth delay, is greater than the first reference function value, update the initial delay with the fifth delay to obtain the updated initial delay.
After the fifth delay is determined, the fourth function value of the GCC-PHAT function at the fifth delay can be determined. If the fourth function value is greater than the first reference function value, the initial delay can be updated with the fifth delay to obtain the updated initial delay. For example, R(τ_41) is computed; if R(τ_41) > R(τ_1), then τ_opt = τ_41.
Step 10: when the fourth function value is less than the first reference function value, determine a sixth delay and update the initial delay with the sixth delay to obtain the updated initial delay, where the sixth delay is a delay inserted between zero and the negative of the initial delay.
If the fourth function value is less than the first reference function value, a sixth delay can be inserted between zero and the negative of the initial delay, and the initial delay is updated with the sixth delay to obtain the updated initial delay. For example, if R(τ_41) < R(τ_1), the delay τ_42 (that is, the sixth delay) is inserted as τ_42 = τ_1′ + μ(τ_1 - τ_1′). In this case, τ_opt = τ_42.
After the initial delay is updated, the current round ends. If the loop stop condition is not met, the interpolation steps are executed again with the updated initial delay; for example, τ_1 = τ_opt and τ_1′ = -τ_opt can be set and the above operations repeated. If the loop stop condition is met, the updated initial delay is determined as the target delay. Re-executing the interpolation steps with the updated initial delay is similar to steps 1 to 10 above and is not described again here.
Through this embodiment, interpolating within different intervals, and deciding whether to continue interpolating or to update the initial delay based on the relationship between the function value at the inserted delay and the function values already determined, can make the interpolation more reasonable and improve the accuracy of determining the optimal delay.
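Steps 1 to 10 and the loop around them can be sketched as follows. This is one reading of the text, not a definitive implementation: `refine_delay` and the callable `r` (standing in for the GCC-PHAT function) are hypothetical names, candidates are clamped to [-d/c, d/c] as an added safeguard that the text does not spell out, and ties between function values fall through to the later branch because the text only covers strict inequalities.

```python
from typing import Callable, Optional, Tuple


def refine_delay(
    r: Callable[[float], float],
    tau1: float,
    threshold: float,
    max_rounds: int = 10,
    preset_range: Optional[Tuple[float, float]] = None,
    alpha: float = 0.6,
    beta: float = 0.8,
    gamma: float = 1.2,
    mu: float = 0.125,
) -> float:
    """Run the interpolation rounds and return the final initial delay."""

    def clamp(t: float) -> float:
        # Assumption: keep every candidate inside the first delay range.
        return max(-threshold, min(threshold, t))

    for _ in range(max_rounds):
        tau1p = -tau1                       # negative of the initial delay
        r1, r1p = r(tau1), r(tau1p)         # first / second reference values
        tau2 = clamp(tau1 + alpha * (tau1 - tau1p))           # step 1
        r2 = r(tau2)
        if r2 > r1:                                           # step 2
            tau21 = clamp(tau1 + beta * (tau2 - tau1))
            tau_opt = tau21 if r(tau21) > r2 else tau2        # steps 3 and 4
        elif r2 > r1p:                                        # step 5
            tau31 = clamp(tau1 + gamma * (tau2 - tau1))
            if r(tau31) > r2:                                 # step 6
                tau_opt = tau31
            else:                                             # step 7
                tau_opt = clamp(tau1p + mu * (tau1 - tau1p))
        else:                                                 # step 8
            tau41 = clamp(tau1 + gamma * (tau1p - tau1))
            if r(tau41) > r1:                                 # step 9
                tau_opt = tau41
            else:                                             # step 10
                tau_opt = clamp(tau1p + mu * (tau1 - tau1p))
        tau1 = tau_opt                      # updated initial delay
        # Stop early once the delay falls inside the preset (a priori) range.
        if preset_range is not None and preset_range[0] <= tau1 <= preset_range[1]:
            break
    return tau1


if __name__ == "__main__":
    # Stand-in objective with its peak at 4 ms; one round from tau_1 = 2 ms.
    print(refine_delay(lambda t: -(t - 0.004) ** 2, 0.002, 0.0175, max_rounds=1))
```

Each round costs at most five evaluations of `r`, compared with one evaluation per grid point in a dense search. As written in the text, a round always replaces the initial delay, so the result is not guaranteed to improve monotonically from round to round.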
In an exemplary embodiment, the target delay may be determined in a manner similar to that in the foregoing embodiments; for example, the first to sixth delays may be determined in a similar manner. To improve the accuracy of target delay determination, the following parameters can be selected when determining the first to sixth delays: α = 0.6, β = 0.8, γ = 1.2, μ = 0.125. Correspondingly, performing the interpolation operation within the first delay range based on the initial delay and the negative of the initial delay, to obtain the target delay, includes:
S81: executing the following interpolation steps in a loop until a loop stop condition is met, where the loop stop condition includes at least one of the following: the number of executions of the interpolation steps reaches a preset number, or the initial delay falls within a preset delay range; the initial delay at the end of the loop is the target delay:
determining a first delay, where the first delay is the sum of the initial delay and 0.6 times the difference between the initial delay and the negative of the initial delay (for example, τ_2 = τ_1 + 0.6 × (τ_1 - τ_1′));
when the first function value, which is the value of the GCC-PHAT function at the first delay, is greater than the first reference function value, determining a second delay, where the second delay is the sum of the initial delay and 0.8 times the difference between the first delay and the initial delay (for example, τ_21 = τ_1 + 0.8 × (τ_2 - τ_1));
when the second function value, which is the value of the GCC-PHAT function at the second delay, is greater than the first function value, updating the initial delay with the second delay to obtain the updated initial delay;
when the second function value is less than the first function value, updating the initial delay with the first delay to obtain the updated initial delay;
when the first function value is less than the first reference function value and greater than the second reference function value, determining a third delay, where the third delay is the sum of the initial delay and 1.2 times the difference between the first delay and the initial delay (for example, τ_31 = τ_1 + 1.2 × (τ_2 - τ_1));
when the third function value, which is the value of the GCC-PHAT function at the third delay, is greater than the first function value, updating the initial delay with the third delay to obtain the updated initial delay;
when the third function value is less than the first function value, determining a fourth delay and updating the initial delay with the fourth delay to obtain the updated initial delay, where the fourth delay is the sum of the negative of the initial delay and 0.125 times the difference between the initial delay and the negative of the initial delay (for example, τ_32 = τ_1′ + 0.125 × (τ_1 - τ_1′));
when the first function value is less than the second reference function value, determining a fifth delay, where the fifth delay is the sum of the initial delay and 1.2 times the difference between the negative of the initial delay and the initial delay (for example, τ_41 = τ_1 + 1.2 × (τ_1′ - τ_1));
when the fourth function value, which is the value of the GCC-PHAT function at the fifth delay, is greater than the first reference function value, updating the initial delay with the fifth delay to obtain the updated initial delay;
when the fourth function value is less than the first reference function value, determining a sixth delay and updating the initial delay with the sixth delay to obtain the updated initial delay, where the sixth delay is the sum of the negative of the initial delay and 0.125 times the difference between the initial delay and the negative of the initial delay (for example, τ_42 = τ_1′ + 0.125 × (τ_1 - τ_1′)).
Through this embodiment, setting reasonable parameters for delay insertion can improve the accuracy of target delay determination and thus the accuracy of device control.
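To make the parameter choice concrete, the following worked example evaluates the six candidate delays for an assumed initial delay τ_1 = 2 ms (so τ_1′ = -2 ms and τ_1 - τ_1′ = 4 ms). The starting value is chosen only for illustration and does not come from the text.

```latex
\begin{aligned}
\tau_2    &= \tau_1 + 0.6\,(\tau_1 - \tau_1')    = 2 + 0.6 \times 4     = 4.4\ \text{ms} \\
\tau_{21} &= \tau_1 + 0.8\,(\tau_2 - \tau_1)     = 2 + 0.8 \times 2.4   = 3.92\ \text{ms} \\
\tau_{31} &= \tau_1 + 1.2\,(\tau_2 - \tau_1)     = 2 + 1.2 \times 2.4   = 4.88\ \text{ms} \\
\tau_{32} &= \tau_1' + 0.125\,(\tau_1 - \tau_1') = -2 + 0.125 \times 4  = -1.5\ \text{ms} \\
\tau_{41} &= \tau_1 + 1.2\,(\tau_1' - \tau_1)    = 2 + 1.2 \times (-4)  = -2.8\ \text{ms} \\
\tau_{42} &= \tau_1' + 0.125\,(\tau_1 - \tau_1') = -2 + 0.125 \times 4  = -1.5\ \text{ms}
\end{aligned}
```

Because τ_32 and τ_42 are defined by the same expression, the fourth and sixth delays coincide for any given initial delay; they differ only in the branch that reaches them.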
In an exemplary embodiment, determining the target device from the first device and the second device according to the target delay includes:
S71: when the target delay is positive, determining the second device as the target device;
S72: when the target delay is negative, determining the first device as the target device.
In this embodiment, the smart device to be controlled can be selected from the first device and the second device based on the sign of the target delay. If the target delay is positive, the operation execution signal reaches the first device later than it reaches the second device; that is, the signal reaches the second device earlier and the second device is closer to the target object, so the second device can be determined as the target device.
Similarly, if the target delay is negative, the operation execution signal reaches the first device earlier than it reaches the second device; that is, the first device is closer to the target object, so the first device can be determined as the target device.
Through this embodiment, selecting the smart device to be controlled based on the sign of the delay between arrivals at the two devices can make device control more reasonable.
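The sign rule can be sketched in a few lines. The function name is illustrative, and the handling of a delay of exactly zero (falling back to the first device) is an assumption, since the text covers only the positive and negative cases.

```python
from typing import TypeVar

Device = TypeVar("Device")


def select_target_device(target_delay: float, first_device: Device, second_device: Device) -> Device:
    """Pick the device that the signal reached first.

    target_delay is (arrival time at first device) - (arrival time at second device).
    """
    if target_delay > 0:
        return second_device   # signal reached the second device earlier
    if target_delay < 0:
        return first_device    # signal reached the first device earlier
    return first_device        # zero delay: arbitrary tie-break (assumption)


if __name__ == "__main__":
    print(select_target_device(0.0021, "first appliance", "second appliance"))
```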
The control method of the smart device in the embodiments of the present disclosure is explained below with reference to an optional example. In this example, the first device is a first home appliance, the second device is a second home appliance, and the operation execution signal is a wake-up signal.
This optional example provides a scheme for deciding which of multiple devices should be woken up as the nearest one. In nearest-device wake-up, the signals of the two devices are used to estimate the TDOA (Time Difference of Arrival), that is, the decision is made by solving the cross-correlation function between the signals of the two devices. Reverberation is also taken into account: the cross-correlation function can be used to suppress reverberation, which makes the TDOA estimate more robust to reverberation. The two-device decision can also be extended to multiple devices: two devices are compared first, the better one is selected and then compared with the next device, and so on.
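The extension from two devices to several can be sketched as a running pairwise comparison. The names `nearest_device` and `estimate_delay` are hypothetical; `estimate_delay(a, b)` is assumed to return the arrival time at `a` minus the arrival time at `b`, matching the sign convention of the target delay. The arrival times in the demonstration are made-up values.

```python
from typing import Callable, Sequence, TypeVar

Device = TypeVar("Device")


def nearest_device(
    devices: Sequence[Device],
    estimate_delay: Callable[[Device, Device], float],
) -> Device:
    """Compare devices pairwise, always keeping the one the signal reached first."""
    if not devices:
        raise ValueError("at least one device is required")
    best = devices[0]
    for challenger in devices[1:]:
        # Positive delay: the signal reached `best` later, so the challenger is closer.
        if estimate_delay(best, challenger) > 0:
            best = challenger
    return best


if __name__ == "__main__":
    # Synthetic arrival times in seconds, for illustration only.
    arrival = {"fridge": 0.010, "tv": 0.004, "air conditioner": 0.007}
    print(nearest_device(list(arrival), lambda a, b: arrival[a] - arrival[b]))
```

For n devices this needs n - 1 pairwise delay estimates rather than one per pair.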
As shown in Figure 3, the flow of the control method of the smart device in this optional example may include the following steps:
Step S302: two channels are selected from the microphone array of each of the first home appliance and the second home appliance, giving two sound signals per appliance, and the corresponding reverberation gain value is calculated. Here, the two sound signals obtained are the sound signals of the wake-up signal collected by the microphones of the respective channels.
Step S304: one channel is selected from the microphone array of each of the first home appliance and the second home appliance, giving two sound signals; after data calibration, their cross power spectrum is calculated. The calculated cross power spectrum takes the reverberation gain value into account.
Step S306: based on the cross power spectrum, the GCC-PHAT function of the first home appliance and the second home appliance is calculated, and the optimal delay is determined through interpolation.
Finding the optimal TDOA is a nonlinear single-variable optimization, which is a statistical optimization problem. Besides interpolation, particle swarm optimization, maximum likelihood estimation, Markov chain Monte Carlo methods and the like can also be used; the method is not limited to interpolation.
Step S308: based on the sign of the optimal delay, it is determined which device is closer to the sound source, and the device closest to the sound source is woken up. The sound source here is the source of the wake-up signal.
In this optional example, the cross-correlation function between the devices is used to find the optimal delay (TDOA), and the device closer to the sound source is identified from the sign of the TDOA. Compared with energy-based decisions, this is more resistant to noise. Reverberation is taken into account when the cross-correlation function is calculated, so the method also remains robust in highly reverberant environments. In addition, the simplified interpolation scheme used to find the globally optimal TDOA greatly reduces the amount of computation and improves accuracy.
It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions. However, those skilled in the art should understand that the present disclosure is not limited by the described order of actions, because according to the present disclosure certain steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present disclosure.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present disclosure, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM (Read-Only Memory)/RAM (Random Access Memory), a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which can be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present disclosure.
According to another aspect of the embodiments of the present disclosure, a control apparatus for a smart device, used to implement the above control method for a smart device, is also provided. Figure 4 is a structural block diagram of an optional control apparatus for a smart device according to an embodiment of the present disclosure. As shown in Figure 4, the apparatus may include:
an acquisition unit 402, configured to acquire a first sound signal received by the first device and a second sound signal received by the second device, where the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by the target object;
a first determination unit 404, connected to the acquisition unit 402 and configured to determine a target cross power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal;
a second determination unit 406, connected to the first determination unit 404 and configured to determine a target delay between the first device and the second device according to the target cross power spectrum, where the target delay is the difference between the time at which the operation execution signal reaches the first device and the time at which it reaches the second device;
an execution unit 408, connected to the second determination unit 406 and configured to determine a target device from the first device and the second device according to the target delay, and to control the target device to perform the device operation indicated by the operation execution signal, where the target device is the device closest to the target object.
It should be noted that the acquisition unit 402 in this embodiment can be configured to perform step S202 above, the first determination unit 404 can be configured to perform step S204 above, the second determination unit 406 can be configured to perform step S206 above, and the execution unit 408 can be configured to perform step S208 above.
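The chaining of the four units can be pictured as a simple pipeline. The sketch below is only a structural illustration: the class name, field names, and type aliases are assumptions, and each unit is represented by a plain callable rather than by any implementation described in the text.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Signal = Sequence[float]
Spectrum = Sequence[complex]


@dataclass
class ControlApparatus:
    """Hypothetical wiring of units 402, 404, 406 and 408."""

    acquire: Callable[[], Tuple[Signal, Signal]]            # acquisition unit 402
    cross_spectrum: Callable[[Signal, Signal], Spectrum]    # first determination unit 404
    target_delay: Callable[[Spectrum], float]               # second determination unit 406
    execute: Callable[[float], str]                         # execution unit 408

    def run(self) -> str:
        first_signal, second_signal = self.acquire()
        spectrum = self.cross_spectrum(first_signal, second_signal)
        delay = self.target_delay(spectrum)
        # The execution unit picks the target device from the sign of the delay
        # and returns the name of the device it controlled.
        return self.execute(delay)


if __name__ == "__main__":
    apparatus = ControlApparatus(
        acquire=lambda: ([0.0, 1.0], [1.0, 0.0]),           # stub signals
        cross_spectrum=lambda a, b: [1 + 0j],                # stub spectrum
        target_delay=lambda spectrum: 0.002,                 # stub delay in seconds
        execute=lambda delay: "second device" if delay > 0 else "first device",
    )
    print(apparatus.run())
```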
通过上述模块,获取第一设备接收的第一声音信号和第二设备接收到的第二声音信号,其中,第一声音信号和第二声音信号是与目标对象发出的操作执行信号对应的声音信号;根据第一声音信号和第二声音信号,确定第一声音信号和第二声音信号的目标互功率谱;根据目标互功率谱,确定第一设备和第二设备之间的目标时延,其中,目标时延为操作执行信号到达第一设备和到达第二设备的时间差值;根据目标时延,从第一设备和第二设备中确定目标设备,并控制目标设备执行操作执行信号所指示的设备操作,其中,目标设备距离目标对象最近的设 备,解决了相相关技术中的智能设备的控制方式存在在信噪比较低、混响较高的场景下由于存在噪声和混响导致的设备控制的准确性差的问题,提高了设备控制的准确性。Through the above module, the first sound signal received by the first device and the second sound signal received by the second device are obtained, wherein the first sound signal and the second sound signal are sound signals corresponding to the operation execution signal sent by the target object. ; According to the first sound signal and the second sound signal, determine the target cross power spectrum of the first sound signal and the second sound signal; according to the target cross power spectrum, determine the target delay between the first device and the second device, where , the target delay is the time difference between the arrival of the operation execution signal at the first device and the arrival at the second device; according to the target delay, the target device is determined from the first device and the second device, and the target device is controlled to execute the instruction of the operation execution signal Device operation, in which the target device is the device closest to the target object, solves the problem of the control method of smart devices in related technologies due to the presence of noise and reverberation in scenarios with low signal-to-noise ratio and high reverberation. The problem of poor accuracy of equipment control improves the accuracy of equipment control.
In an exemplary embodiment, the first determination unit includes:
a first acquisition module, configured to acquire an initial cross-power spectrum of the first sound signal and the second sound signal;
a second acquisition module, configured to acquire a target reverberation gain value corresponding to the first device and the second device, wherein the target reverberation gain value is a gain value used to suppress reverberation noise;
an update module, configured to update the initial cross-power spectrum using the target reverberation gain value to obtain the target cross-power spectrum.
In an exemplary embodiment, the second acquisition module includes:
an acquisition sub-module, configured to acquire a plurality of sound signals received by each of the first device and the second device, wherein each of the plurality of sound signals is a sound signal corresponding to the operation execution signal;
a first determination sub-module, configured to determine a coherence function of the plurality of sound signals to obtain a target coherence function;
an estimation sub-module, configured to estimate a reverberation suppression coefficient of each device according to the target coherence function to obtain a target reverberation suppression coefficient;
a second determination sub-module, configured to determine, as the reverberation gain value of each device, the larger of a minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient.
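The gain selection described above can be sketched as follows. This is a minimal illustration, not the method of the disclosure: the function names `coherence` and `reverb_gain`, the default floor `g_min`, and in particular the mapping from coherence to a candidate gain (the mean coherence magnitude) are all assumptions, since the text only states that a suppression coefficient is estimated from the coherence function and that the result is floored at a minimum gain.

```python
def coherence(frames_a, frames_b):
    """Per-bin complex coherence of two microphone channels.

    frames_a / frames_b: lists of frequency-domain frames (lists of complex),
    one list per microphone of the same array. Spectra are averaged over frames.
    """
    n_bins = len(frames_a[0])
    result = []
    for k in range(n_bins):
        s_ab = sum(a[k] * b[k].conjugate() for a, b in zip(frames_a, frames_b))
        s_aa = sum(abs(a[k]) ** 2 for a in frames_a)
        s_bb = sum(abs(b[k]) ** 2 for b in frames_b)
        denom = (s_aa * s_bb) ** 0.5
        result.append(s_ab / denom if denom > 0 else 0j)
    return result


def reverb_gain(coh, g_min=0.1):
    """Floor a candidate gain at g_min, as in the last sub-module above.

    ASSUMPTION: the candidate gain is the mean coherence magnitude; the
    source does not specify how the suppression coefficient maps to a gain.
    """
    candidate = sum(abs(c) for c in coh) / len(coh)
    return max(g_min, candidate)


# Identical channels are fully coherent, so the candidate gain is about 1.
frames = [[1 + 0j, 2j], [0.5 + 0.5j, 1 + 0j]]
gain_coherent = reverb_gain(coherence(frames, frames))
# Zero coherence falls back to the floor value.
gain_floor = reverb_gain([0j, 0j], g_min=0.2)
```

The floor keeps the gain from collapsing to zero when the two microphone signals are almost incoherent, which would otherwise discard the frame entirely.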
In an exemplary embodiment, the acquisition sub-module includes:
an acquisition sub-unit, configured to acquire two sound signals received by each device, wherein the two sound signals are sound signals that correspond to the operation execution signal and are received by different microphones of the same microphone array of that device.
In an exemplary embodiment, the target reverberation gain value includes a first reverberation gain value corresponding to the first device and a second reverberation gain value corresponding to the second device; the update module includes:
a first execution sub-module, configured to multiply the product of the first reverberation gain value and a first frequency-domain signal by the product of the second reverberation gain value and the conjugate of a second frequency-domain signal to obtain reverberation reference information, wherein the first frequency-domain signal is the frequency-domain signal corresponding to the first sound signal, and the second frequency-domain signal is the frequency-domain signal corresponding to the second sound signal;
a second execution sub-module, configured to perform a weighted summation of the initial cross-power spectrum and the reverberation reference information to obtain the target cross-power spectrum.
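The two sub-modules above amount to a per-bin update of the cross-power spectrum. The sketch below is illustrative only: the function name `update_cross_spectrum` and the weights `alpha` and `1 - alpha` are assumptions, because the text says "weighted summation" without giving the weights.

```python
def update_cross_spectrum(initial, x1, x2, g1, g2, alpha=0.8):
    """Blend the initial cross-power spectrum with the reverberation reference.

    initial: initial cross-power spectrum, one complex value per bin
    x1, x2:  frequency-domain signals of the first and second device
    g1, g2:  reverberation gain values of the first and second device
    alpha:   ASSUMED weight of the initial spectrum (not given in the source)
    """
    target = []
    for phi0, a, b in zip(initial, x1, x2):
        # Reverberation reference: (G1 * X1) * (G2 * conj(X2))
        reference = (g1 * a) * (g2 * b.conjugate())
        target.append(alpha * phi0 + (1.0 - alpha) * reference)
    return target


# One-bin example: reference = (0.5 * 2) * (0.5 * conj(1j)) = -0.5j
updated = update_cross_spectrum([1 + 0j], [2 + 0j], [1j], g1=0.5, g2=0.5, alpha=0.5)
```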
In an exemplary embodiment, the second determination unit includes:
a first determination module, configured to determine, according to the target cross-power spectrum, a generalized cross-correlation phase transform (GCC-PHAT) function corresponding to the first device and the second device, wherein the generalized cross-correlation phase transform function is a function whose variable is the delay between the first device and the second device;
a search module, configured to search, within a first delay range, for the delay that maximizes the value of the generalized cross-correlation phase transform function, to obtain the target delay, wherein the first delay range is the interval whose endpoints are the negative of a delay threshold and the delay threshold, and the delay threshold is the distance between the first device and the second device divided by the speed of sound.
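As a rough illustration of the search described above, the sketch below evaluates a GCC-PHAT function on a grid of integer sample delays bounded by distance divided by the speed of sound. It is a simplification under stated assumptions: the names `dft`, `gcc_phat`, and `find_delay`, the value 343 m/s, the sampling rate, and the exhaustive grid search are all illustrative (the disclosure itself describes a random start followed by interpolation). The sign convention assumed here is delay = arrival time at the first device minus arrival time at the second device.

```python
import cmath
import math
import random

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not stated in the source


def dft(x):
    """Naive discrete Fourier transform (adequate for short frames)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def gcc_phat(cross, tau, n):
    """GCC-PHAT value at delay tau (in samples) for a length-n cross spectrum."""
    total = 0.0
    for k in range(1, n // 2):  # positive-frequency bins only
        mag = abs(cross[k])
        if mag > 1e-12:  # PHAT weighting: keep the phase, drop the magnitude
            total += (cross[k] / mag * cmath.exp(2j * math.pi * k * tau / n)).real
    return total


def find_delay(cross, n, distance_m, fs):
    """Search integer delays within [-threshold, +threshold] for the maximum."""
    limit = int(distance_m / SPEED_OF_SOUND * fs)  # delay threshold in samples
    return max(range(-limit, limit + 1), key=lambda tau: gcc_phat(cross, tau, n))


# Synthetic check: the second device hears the same frame 3 samples later.
rng = random.Random(0)
x1 = [rng.uniform(-1.0, 1.0) for _ in range(64)]
x2 = x1[-3:] + x1[:-3]  # circular shift, x2[n] = x1[n - 3]
cross = [a * b.conjugate() for a, b in zip(dft(x1), dft(x2))]
best = find_delay(cross, 64, distance_m=3.5, fs=1000)
```

With this convention the synthetic delay comes out negative, meaning the signal reached the first device earlier.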
In an exemplary embodiment, the search module includes:
a selection sub-module, configured to randomly select a delay within a second delay range to obtain a random delay, wherein the second delay range is the interval whose endpoints are zero and the delay threshold;
a third determination sub-module, configured to determine a first reference function value, which is the value of the generalized cross-correlation phase transform function at the random delay, and a second reference function value, which is the value of the generalized cross-correlation phase transform function at the negative of the random delay;
a fourth determination sub-module, configured to determine, as an initial delay, the delay corresponding to the larger of the first reference function value and the second reference function value;
a third execution sub-module, configured to perform an interpolation operation within the first delay range based on the initial delay and the negative of the initial delay, to obtain the target delay, wherein the target delay is, among the initial delay and the delays inserted by the interpolation operation, the delay that maximizes the value of the generalized cross-correlation phase transform function.
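The selection of the initial delay can be sketched as below. The function name `pick_initial_delay`, the callable `gcc` standing in for the generalized cross-correlation phase transform function, and the tie-breaking rule (preferring the positive delay when both values are equal) are assumptions made for illustration.

```python
import random


def pick_initial_delay(gcc, threshold, rng):
    """Draw a random delay in [0, threshold] and keep it or its negative.

    gcc:       callable returning the GCC-PHAT value at a given delay
    threshold: delay threshold (device distance / speed of sound)
    rng:       a random.Random instance, passed in for reproducibility
    """
    tau = rng.uniform(0.0, threshold)
    ref_pos = gcc(tau)   # first reference function value
    ref_neg = gcc(-tau)  # second reference function value
    # ASSUMPTION: ties go to the positive delay; the source does not say.
    initial = tau if ref_pos >= ref_neg else -tau
    return initial, ref_pos, ref_neg


# Toy objective peaking at +0.3, so the positive candidate always wins.
initial, ref_pos, ref_neg = pick_initial_delay(
    lambda t: -(t - 0.3) ** 2, 1.0, random.Random(1))
```

Evaluating both signs of one random draw tells the search which half of the first delay range is more promising before any interpolation is done.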
In an exemplary embodiment, the third execution sub-module includes:
a first execution sub-unit, configured to execute the following interpolation steps in a loop until a loop stop condition is met, wherein the loop stop condition includes at least one of the following: the number of executed interpolation steps reaches a preset number; the initial delay falls within a preset delay range; and wherein the initial delay at the end of the loop is the target delay:
determining a first delay, wherein the first delay is a delay inserted between the initial delay and the delay threshold;
in a case where a first function value, which is the value of the generalized cross-correlation phase transform function at the first delay, is greater than the first reference function value, determining a second delay, wherein the second delay is a delay inserted between the initial delay and the first delay;
in a case where a second function value, which is the value of the generalized cross-correlation phase transform function at the second delay, is greater than the first function value, updating the initial delay with the second delay to obtain an updated initial delay;
in a case where the second function value is less than the first function value, updating the initial delay with the first delay to obtain an updated initial delay;
in a case where the first function value is less than the first reference function value and greater than the second reference function value, determining a third delay, wherein the third delay is a delay inserted between the first delay and the delay threshold;
in a case where a third function value, which is the value of the generalized cross-correlation phase transform function at the third delay, is greater than the first function value, updating the initial delay with the third delay to obtain an updated initial delay;
in a case where the third function value is less than the first function value, determining a fourth delay and updating the initial delay with the fourth delay to obtain an updated initial delay, wherein the fourth delay is a delay inserted between zero and the negative of the initial delay;
in a case where the first function value is less than the second reference function value, determining a fifth delay, wherein the fifth delay is a delay inserted between the negative of the delay threshold and the negative of the initial delay;
in a case where a fourth function value, which is the value of the generalized cross-correlation phase transform function at the fifth delay, is greater than the first reference function value, updating the initial delay with the fifth delay to obtain an updated initial delay;
in a case where the fourth function value is less than the first reference function value, determining a sixth delay and updating the initial delay with the sixth delay to obtain an updated initial delay, wherein the sixth delay is a delay inserted between zero and the negative of the initial delay.
In an exemplary embodiment, the third execution sub-module includes:
a second execution sub-unit, configured to execute the following interpolation steps in a loop until a loop stop condition is met, wherein the loop stop condition includes at least one of the following: the number of executed interpolation steps reaches a preset number; the initial delay falls within a preset delay range; and wherein the initial delay at the end of the loop is the target delay:
determining a first delay, wherein the first delay is the sum of the initial delay and 0.6 times the difference between the initial delay and the negative of the initial delay;
in a case where a first function value, which is the value of the generalized cross-correlation phase transform function at the first delay, is greater than the first reference function value, determining a second delay, wherein the second delay is the sum of the initial delay and 0.8 times the difference between the first delay and the initial delay;
in a case where a second function value, which is the value of the generalized cross-correlation phase transform function at the second delay, is greater than the first function value, updating the initial delay with the second delay to obtain an updated initial delay;
in a case where the second function value is less than the first function value, updating the initial delay with the first delay to obtain an updated initial delay;
in a case where the first function value is less than the first reference function value and greater than the second reference function value, determining a third delay, wherein the third delay is the sum of the initial delay and 1.2 times the difference between the first delay and the initial delay;
in a case where a third function value, which is the value of the generalized cross-correlation phase transform function at the third delay, is greater than the first function value, updating the initial delay with the third delay to obtain an updated initial delay;
in a case where the third function value is less than the first function value, determining a fourth delay and updating the initial delay with the fourth delay to obtain an updated initial delay, wherein the fourth delay is the sum of the negative of the initial delay and 0.125 times the difference between the initial delay and the negative of the initial delay;
in a case where the first function value is less than the second reference function value, determining a fifth delay, wherein the fifth delay is the sum of the initial delay and 1.2 times the difference between the negative of the initial delay and the initial delay;
in a case where a fourth function value, which is the value of the generalized cross-correlation phase transform function at the fifth delay, is greater than the first reference function value, updating the initial delay with the fifth delay to obtain an updated initial delay;
in a case where the fourth function value is less than the first reference function value, determining a sixth delay and updating the initial delay with the sixth delay to obtain an updated initial delay, wherein the sixth delay is the sum of the negative of the initial delay and 0.125 times the difference between the initial delay and the negative of the initial delay.
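A single pass of the interpolation steps with the coefficients 0.6, 0.8, 1.2, and 0.125 might look like the sketch below. It is a literal reading of the steps above under several assumptions: the function name `interpolation_step` and the callable `gcc` are hypothetical, cases the text does not cover (equal function values, or a first function value that matches none of the three conditions) leave the initial delay unchanged, and no clamping to the first delay range is applied. The surrounding loop and its stop condition are not shown because the preset count and preset delay range are not specified.

```python
def interpolation_step(gcc, tau0, ref1, ref2):
    """Return the updated initial delay after one pass of the steps above.

    gcc:  callable giving the GCC-PHAT value at a delay (hypothetical stand-in)
    tau0: current initial delay
    ref1: first reference function value
    ref2: second reference function value
    """
    span = tau0 - (-tau0)  # difference between tau0 and its negative
    tau1 = tau0 + 0.6 * span  # first delay
    f1 = gcc(tau1)

    if f1 > ref1:
        tau2 = tau0 + 0.8 * (tau1 - tau0)  # second delay
        f2 = gcc(tau2)
        if f2 > f1:
            return tau2
        if f2 < f1:
            return tau1
        return tau0  # tie: not covered by the text (assumption)

    if ref2 < f1 < ref1:
        tau3 = tau0 + 1.2 * (tau1 - tau0)  # third delay
        f3 = gcc(tau3)
        if f3 > f1:
            return tau3
        if f3 < f1:
            return -tau0 + 0.125 * span  # fourth delay
        return tau0  # tie: not covered by the text (assumption)

    if f1 < ref2:
        tau5 = tau0 + 1.2 * (-tau0 - tau0)  # fifth delay
        f4 = gcc(tau5)  # called the "fourth function value" in the text
        if f4 > ref1:
            return tau5
        if f4 < ref1:
            return -tau0 + 0.125 * span  # sixth delay
        return tau0  # tie: not covered by the text (assumption)

    return tau0  # no condition matched (assumption)


# Toy objective peaking at 1.0: the step moves 0.5 toward the peak.
peak_at_one = lambda t: -(t - 1.0) ** 2
step_a = interpolation_step(peak_at_one, 0.5, peak_at_one(0.5), peak_at_one(-0.5))

# Flat, low objective: falls through to the sixth delay.
step_b = interpolation_step(lambda t: -2.0, 0.5, 0.0, -1.0)
```

In the first example the first delay is 1.1 and the second delay is 0.98, which scores higher and becomes the new initial delay; in the second example both the first and the fifth delay score below the references, so the sixth delay, -0.375, is taken.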
In an exemplary embodiment, the execution unit includes:
a second determination module, configured to determine the second device as the target device in a case where the target delay is positive;
a third determination module, configured to determine the first device as the target device in a case where the target delay is negative.
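The decision rule above reduces to a sign test, sketched below. The function name `pick_target_device` and the string labels are illustrative, and returning `None` for a delay of exactly zero is an assumption, since the text covers only the positive and negative cases.

```python
def pick_target_device(target_delay):
    """Map the sign of the target delay to the device nearer the speaker.

    target_delay is taken as (arrival time at the first device) minus
    (arrival time at the second device), matching the rule above.
    """
    if target_delay > 0:
        return "second"  # signal reached the second device first
    if target_delay < 0:
        return "first"   # signal reached the first device first
    return None  # zero delay is not covered by the text (assumption)


choice = pick_target_device(-0.003)
```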
It should be noted here that the above modules implement the same examples and application scenarios as the corresponding steps, but are not limited to the content disclosed in the above embodiments. It should also be noted that the above modules, as part of the apparatus, may run in the hardware environment shown in Figure 1 and may be implemented in software or in hardware, wherein the hardware environment includes a network environment.
According to yet another aspect of the embodiments of the present disclosure, a storage medium is further provided. Optionally, in this embodiment, the storage medium may be used to store program code for executing any one of the above control methods for a smart device in the embodiments of the present disclosure.
Optionally, in this embodiment, the storage medium may be located on at least one of a plurality of network devices in the network shown in the above embodiments.
Optionally, in this embodiment, the storage medium is configured to store program code for performing the following steps:
S1, acquiring a first sound signal received by a first device and a second sound signal received by a second device, wherein the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object;
S2, determining a target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal;
S3, determining a target delay between the first device and the second device according to the target cross-power spectrum, wherein the target delay is the difference between the time at which the operation execution signal reaches the first device and the time at which it reaches the second device;
S4, determining a target device from the first device and the second device according to the target delay, and controlling the target device to perform the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments, which are not repeated here.
Optionally, in this embodiment, the storage medium may include, but is not limited to, a USB flash drive, a ROM, a RAM, a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store program code.
According to yet another aspect of the embodiments of the present disclosure, an electronic device for implementing the above control method for a smart device is further provided. The electronic device may be a server, a terminal, or a combination thereof.
Figure 5 is a structural block diagram of an optional electronic device according to an embodiment of the present disclosure. As shown in Figure 5, the electronic device includes a processor 502, a communication interface 504, a memory 506, and a communication bus 508, wherein the processor 502, the communication interface 504, and the memory 506 communicate with one another through the communication bus 508, and wherein:
the memory 506 is configured to store a computer program;
the processor 502 is configured to implement the following steps when executing the computer program stored in the memory 506:
S1, acquiring a first sound signal received by a first device and a second sound signal received by a second device, wherein the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object;
S2, determining a target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal;
S3, determining a target delay between the first device and the second device according to the target cross-power spectrum, wherein the target delay is the difference between the time at which the operation execution signal reaches the first device and the time at which it reaches the second device;
S4, determining a target device from the first device and the second device according to the target delay, and controlling the target device to perform the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
Optionally, the communication bus may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in Figure 5, but this does not mean that there is only one bus or one type of bus. The communication interface is used for communication between the above electronic device and other devices.
The memory may include RAM and may also include non-volatile memory, for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
As an example, the memory 506 may include, but is not limited to, the acquisition unit 402, the first determination unit 404, the second determination unit 406, and the execution unit 408 of the above control apparatus for a smart device. It may further include, but is not limited to, other module units of the above control apparatus, which are not described again in this example.
The above processor may be a general-purpose processor, including but not limited to a CPU (Central Processing Unit) or an NP (Network Processor); it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
Optionally, for specific examples in this embodiment, reference may be made to the examples described in the above embodiments, which are not repeated here.
Those of ordinary skill in the art will understand that the structure shown in Figure 5 is only illustrative. The device that implements the above control method for a smart device may be a terminal device, such as a smartphone (for example, an Android phone or an iOS phone), a tablet computer, a handheld computer, a Mobile Internet Device (MID), or a PAD. Figure 5 does not limit the structure of the above electronic device. For example, the electronic device may include more or fewer components than shown in Figure 5 (such as a network interface or a display device), or may have a configuration different from that shown in Figure 5.
Those of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be completed by a program instructing hardware related to the terminal device. The program may be stored in a computer-readable storage medium, and the storage medium may include a flash drive, a ROM, a RAM, a magnetic disk, an optical disc, or the like.
The serial numbers of the above embodiments of the present disclosure are for description only and do not indicate the relative merits of the embodiments.
If the integrated units in the above embodiments are implemented in the form of software functional units and sold or used as independent products, they may be stored in the above computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause one or more computer devices (which may be personal computers, servers, network devices, or the like) to execute all or some of the steps of the methods described in the embodiments of the present disclosure.
In the above embodiments of the present disclosure, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, reference may be made to the relevant descriptions of other embodiments.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed client may be implemented in other ways. The apparatus embodiments described above are only illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, units, or modules, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution provided in this embodiment.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
The above are only preferred embodiments of the present disclosure. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present disclosure, and these improvements and refinements shall also fall within the scope of protection of the present disclosure.

Claims (12)

  1. A control method for a smart device, comprising:
    acquiring a first sound signal received by a first device and a second sound signal received by a second device, wherein the first sound signal and the second sound signal are sound signals corresponding to an operation execution signal issued by a target object;
    determining a target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal;
    determining a target delay between the first device and the second device according to the target cross-power spectrum, wherein the target delay is the difference between the time at which the operation execution signal reaches the first device and the time at which it reaches the second device;
    determining a target device from the first device and the second device according to the target delay, and controlling the target device to perform the device operation indicated by the operation execution signal, wherein the target device is the device closest to the target object.
  2. The method according to claim 1, wherein determining the target cross-power spectrum of the first sound signal and the second sound signal according to the first sound signal and the second sound signal comprises:
    acquiring an initial cross-power spectrum of the first sound signal and the second sound signal;
    acquiring a target reverberation gain value corresponding to the first device and the second device, wherein the target reverberation gain value is a gain value used to suppress reverberation noise;
    updating the initial cross-power spectrum using the target reverberation gain value to obtain the target cross-power spectrum.
  3. The method according to claim 2, wherein obtaining the target reverberation gain value corresponding to the first device and the second device comprises:
    obtaining a plurality of sound signals received by each of the first device and the second device, wherein each of the plurality of sound signals is a sound signal corresponding to the operation execution signal;
    determining a coherence function of the plurality of sound signals to obtain a target coherence function;
    estimating a reverberation suppression coefficient of each device according to the target coherence function to obtain a target reverberation suppression coefficient;
    determining, as the reverberation gain value of each device, the larger of a minimum reverberation gain value and the reverberation gain value corresponding to the target reverberation suppression coefficient.
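The flooring step at the end of claim 3 can be illustrated with a minimal Python sketch. The claim does not say how the coherence function maps to a suppression coefficient or how the coefficient maps to a gain, so the mappings below (coefficient = 1 - coherence, gain = 1 - coefficient) are purely hypothetical placeholders, as are the function names and the default `min_gain`. Only the magnitude-squared coherence formula and the final `max` reflect standard practice and the claim text, respectively.

```python
def magnitude_squared_coherence(frames_a, frames_b):
    """Standard magnitude-squared coherence for one frequency bin,
    estimated over several frames from two microphones."""
    cross = sum(a * b.conjugate() for a, b in zip(frames_a, frames_b))
    power_a = sum(abs(a) ** 2 for a in frames_a)
    power_b = sum(abs(b) ** 2 for b in frames_b)
    if power_a == 0 or power_b == 0:
        return 0.0
    return abs(cross) ** 2 / (power_a * power_b)


def reverb_gain(frames_a, frames_b, min_gain=0.1):
    """Return the floored reverberation gain for one bin."""
    coherence = magnitude_squared_coherence(frames_a, frames_b)
    # Hypothetical mapping (not from the claims): low coherence is
    # treated as strong reverberation and therefore strong suppression.
    suppression_coeff = 1.0 - coherence
    candidate_gain = 1.0 - suppression_coeff
    # Claim 3: take the larger of the minimum gain and the candidate gain.
    return max(min_gain, candidate_gain)
```

The floor keeps the gain from collapsing to zero in bins where the two microphones appear fully incoherent, which is presumably why the claim specifies a minimum value.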
  4. The method according to claim 3, wherein obtaining the plurality of sound signals received by each of the first device and the second device comprises:
    obtaining two sound signals received by each device, wherein the two sound signals are sound signals that correspond to the operation execution signal and are received by different microphones of the same microphone array of that device.
  5. The method according to claim 2, wherein the target reverberation gain value comprises a first reverberation gain value corresponding to the first device and a second reverberation gain value corresponding to the second device; and updating the initial cross-power spectrum using the target reverberation gain value to obtain the target cross-power spectrum comprises:
    multiplying the product of the first reverberation gain value and a first frequency-domain signal by the product of the second reverberation gain value and the conjugate of a second frequency-domain signal to obtain reverberation reference information, wherein the first frequency-domain signal is the frequency-domain signal corresponding to the first sound signal, and the second frequency-domain signal is the frequency-domain signal corresponding to the second sound signal;
    performing a weighted summation of the initial cross-power spectrum and the reverberation reference information to obtain the target cross-power spectrum.
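As a rough illustration of claim 5, the Python sketch below forms the reverberation reference information per frequency bin and blends it with the initial cross-power spectrum. The claim does not state the weights of the weighted summation, so the single smoothing factor `alpha` (with weights `alpha` and `1 - alpha`) is an assumption, and all names are hypothetical.

```python
def update_cross_spectrum(initial, x1, x2, gain1, gain2, alpha=0.8):
    """Per-bin update of the cross-power spectrum.

    initial      -- initial cross-power spectrum (complex per bin)
    x1, x2       -- frequency-domain signals of the two devices
    gain1, gain2 -- reverberation gain values of the two devices
    alpha        -- assumed weight on the initial spectrum
    """
    target = []
    for p0, a, b, g1, g2 in zip(initial, x1, x2, gain1, gain2):
        # Reverberation reference: (G1 * X1) * (G2 * conj(X2)).
        reference = (g1 * a) * (g2 * b.conjugate())
        # Weighted summation; the weights are an assumption.
        target.append(alpha * p0 + (1.0 - alpha) * reference)
    return target
```

With `alpha = 1` the update leaves the initial spectrum unchanged, and with `alpha = 0` it uses only the gain-weighted reference.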
  6. The method according to claim 1, wherein determining the target delay between the first device and the second device according to the target cross-power spectrum comprises:
    determining, according to the target cross-power spectrum, a generalized cross-correlation phase transform (GCC-PHAT) function corresponding to the first device and the second device, wherein the GCC-PHAT function is a function whose variable is the delay between the first device and the second device;
    searching a first delay range for the delay that maximizes the value of the GCC-PHAT function to obtain the target delay, wherein the first delay range is the interval whose endpoints are the negative of a delay threshold and the delay threshold, and the delay threshold is the value obtained by dividing the distance between the first device and the second device by the speed of sound.
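The search in claim 6 can be sketched in Python as follows. This is a simplified illustration, not the patented procedure: it scans a uniform grid over the first delay range rather than using the interpolation search of the later claims, it assumes the cross-power spectrum is formed as X1 times the conjugate of X2, and the speed-of-sound constant, the grid size `steps`, and all function names are assumptions.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def gcc_phat(cross_spectrum, omegas, tau):
    """GCC-PHAT value at delay tau (seconds).

    cross_spectrum -- complex cross-power spectrum per bin
    omegas         -- angular frequency (rad/s) of each bin
    """
    total = 0.0
    for g, w in zip(cross_spectrum, omegas):
        mag = abs(g)
        if mag > 1e-12:
            # Phase transform: keep only the phase of each bin.
            total += ((g / mag) * cmath.exp(1j * w * tau)).real
    return total


def find_target_delay(cross_spectrum, omegas, device_distance_m, steps=2001):
    """Grid search over [-threshold, +threshold] for the maximizing delay."""
    threshold = device_distance_m / SPEED_OF_SOUND_M_S
    best_tau, best_value = 0.0, -math.inf
    for i in range(steps):
        tau = -threshold + 2.0 * threshold * i / (steps - 1)
        value = gcc_phat(cross_spectrum, omegas, tau)
        if value > best_value:
            best_tau, best_value = tau, value
    return best_tau
```

Bounding the search by distance divided by the speed of sound rules out delays that are physically impossible for the given device spacing, which also helps reject spurious correlation peaks.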
  7. The method according to claim 6, wherein searching the first delay range for the delay that maximizes the value of the GCC-PHAT function to obtain the target delay comprises:
    randomly selecting a delay within a second delay range to obtain a random delay, wherein the second delay range is the interval whose endpoints are zero and the delay threshold;
    determining a first reference function value, which is the value of the GCC-PHAT function at the random delay, and a second reference function value, which is the value of the GCC-PHAT function at the negative of the random delay;
    determining, as an initial delay, the delay corresponding to the larger of the first reference function value and the second reference function value;
    performing an interpolation operation within the first delay range based on the initial delay and the negative of the initial delay to obtain the target delay, wherein the target delay is the delay that maximizes the value of the GCC-PHAT function among the initial delay and the delays inserted by the interpolation operation.
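The random initialization in claim 7 might look like the Python sketch below. The GCC-PHAT function is passed in as a callable `gcc`, which is a hypothetical interface, and the tie-breaking rule when both reference values are equal is an assumption the claim does not address.

```python
import random


def pick_initial_delay(gcc, delay_threshold, rng=random):
    """Choose the initial delay from a random delay and its negative.

    gcc             -- callable returning the GCC-PHAT value at a delay
    delay_threshold -- device distance divided by the speed of sound
    rng             -- source of randomness (module or random.Random)
    """
    # Random delay in the second delay range [0, threshold].
    tau = rng.uniform(0.0, delay_threshold)
    first_reference = gcc(tau)     # value at the random delay
    second_reference = gcc(-tau)   # value at its negative
    # Keep whichever delay gives the larger value (ties go to +tau).
    return tau if first_reference >= second_reference else -tau
```

Evaluating the function at a delay and at its negative gives an early indication of which side of zero the peak lies on, that is, which device the sound reached first.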
  8. The method according to claim 7, wherein performing the interpolation operation within the first delay range based on the initial delay and the negative of the initial delay to obtain the target delay comprises:
    cyclically performing the following interpolation steps until a loop stop condition is met, wherein the loop stop condition comprises at least one of the following: the number of times the interpolation steps have been performed reaches a preset number, or the initial delay falls within a preset delay range; and wherein the initial delay at the end of the loop is the target delay:
    determining a first delay, wherein the first delay is a delay inserted between the initial delay and the delay threshold;
    in a case where a first function value, which is the value of the GCC-PHAT function at the first delay, is greater than the first reference function value, determining a second delay, wherein the second delay is a delay inserted between the initial delay and the first delay;
    in a case where a second function value, which is the value of the GCC-PHAT function at the second delay, is greater than the first function value, updating the initial delay with the second delay to obtain the updated initial delay;
    in a case where the second function value is less than the first function value, updating the initial delay with the first delay to obtain the updated initial delay;
    in a case where the first function value is less than the first reference function value and greater than the second reference function value, determining a third delay, wherein the third delay is a delay inserted between the first delay and the delay threshold;
    in a case where a third function value, which is the value of the GCC-PHAT function at the third delay, is greater than the first function value, updating the initial delay with the third delay to obtain the updated initial delay;
    in a case where the third function value is less than the first function value, determining a fourth delay and updating the initial delay with the fourth delay to obtain the updated initial delay, wherein the fourth delay is a delay inserted between zero and the negative of the initial delay;
    in a case where the first function value is less than the second reference function value, determining a fifth delay, wherein the fifth delay is a delay inserted between the negative of the delay threshold and the negative of the initial delay;
    in a case where a fourth function value, which is the value of the GCC-PHAT function at the fifth delay, is greater than the first reference function value, updating the initial delay with the fifth delay to obtain the updated initial delay;
    in a case where the fourth function value is less than the first reference function value, determining a sixth delay and updating the initial delay with the sixth delay to obtain the updated initial delay, wherein the sixth delay is a delay inserted between zero and the negative of the initial delay.
  9. The method according to claim 7, wherein performing the interpolation operation within the first delay range based on the initial delay and the negative of the initial delay to obtain the target delay comprises:
    cyclically performing the following interpolation steps until a loop stop condition is met, wherein the loop stop condition comprises at least one of the following: the number of times the interpolation steps have been performed reaches a preset number, or the initial delay falls within a preset delay range; and wherein the initial delay at the end of the loop is the target delay:
    determining a first delay, wherein the first delay is the sum of the initial delay and the value obtained by multiplying by 0.6 the difference between the initial delay and the negative of the initial delay;
    in a case where a first function value, which is the value of the GCC-PHAT function at the first delay, is greater than the first reference function value, determining a second delay, wherein the second delay is the sum of the initial delay and the value obtained by multiplying by 0.8 the difference between the first delay and the initial delay;
    in a case where a second function value, which is the value of the GCC-PHAT function at the second delay, is greater than the first function value, updating the initial delay with the second delay to obtain the updated initial delay;
    in a case where the second function value is less than the first function value, updating the initial delay with the first delay to obtain the updated initial delay;
    in a case where the first function value is less than the first reference function value and greater than the second reference function value, determining a third delay, wherein the third delay is the sum of the initial delay and the value obtained by multiplying by 1.2 the difference between the first delay and the initial delay;
    in a case where a third function value, which is the value of the GCC-PHAT function at the third delay, is greater than the first function value, updating the initial delay with the third delay to obtain the updated initial delay;
    in a case where the third function value is less than the first function value, determining a fourth delay and updating the initial delay with the fourth delay to obtain the updated initial delay, wherein the fourth delay is the sum of the negative of the initial delay and the value obtained by multiplying by 0.125 the difference between the initial delay and the negative of the initial delay;
    in a case where the first function value is less than the second reference function value, determining a fifth delay, wherein the fifth delay is the sum of the initial delay and the value obtained by multiplying by 1.2 the difference between the negative of the initial delay and the initial delay;
    in a case where a fourth function value, which is the value of the GCC-PHAT function at the fifth delay, is greater than the first reference function value, updating the initial delay with the fifth delay to obtain the updated initial delay;
    in a case where the fourth function value is less than the first reference function value, determining a sixth delay and updating the initial delay with the sixth delay to obtain the updated initial delay, wherein the sixth delay is the sum of the negative of the initial delay and the value obtained by multiplying by 0.125 the difference between the initial delay and the negative of the initial delay.
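To make the arithmetic in claim 9 easier to follow, the Python sketch below computes the six candidate delays from a given initial delay, reading each formula literally from the claim wording. The function name and the dictionary keys are hypothetical, and the sketch covers only the candidate formulas, not the branching or the loop stop condition.

```python
def claim9_candidates(initial_delay):
    """Return the six candidate delays of claim 9 for one iteration."""
    t0 = initial_delay
    neg = -t0  # the negative of the initial delay
    first = (t0 - neg) * 0.6 + t0
    second = (first - t0) * 0.8 + t0
    third = (first - t0) * 1.2 + t0
    fourth = (t0 - neg) * 0.125 + neg
    fifth = (neg - t0) * 1.2 + t0
    sixth = (t0 - neg) * 0.125 + neg  # same formula as the fourth delay
    return {
        "first": first,
        "second": second,
        "third": third,
        "fourth": fourth,
        "fifth": fifth,
        "sixth": sixth,
    }
```

For an initial delay of 1.0 (in arbitrary units), these formulas give approximately 2.2, 1.96, 2.44, -0.75, -1.4 and -0.75 for the first through sixth delays.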
  10. The method according to any one of claims 1 to 9, wherein determining the target device from among the first device and the second device according to the target delay comprises:
    in a case where the target delay is positive, determining the second device as the target device;
    in a case where the target delay is negative, determining the first device as the target device.
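The selection rule in claim 10 reduces to a sign test, sketched below in Python. Because the target delay is the arrival time at the first device minus the arrival time at the second, a positive value means the sound reached the second device earlier. The claim does not address a delay of exactly zero, so returning `None` in that case is an assumption, as are the function and parameter names.

```python
def select_target_device(target_delay, first_device, second_device):
    """Pick the device that the sound reached first."""
    if target_delay > 0:
        # Signal arrived at the first device later: second is closer.
        return second_device
    if target_delay < 0:
        # Signal arrived at the first device earlier: first is closer.
        return first_device
    # Assumption: zero delay is left undecided, as the claim is silent.
    return None
```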
  11. A computer-readable storage medium comprising a stored program, wherein the program, when run, performs the method according to any one of claims 1 to 10.
  12. An electronic apparatus comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to perform, by means of the computer program, the method according to any one of claims 1 to 10.
PCT/CN2022/095335 2022-04-29 2022-05-26 Control method for smart device, and storage medium and electronic apparatus WO2023206686A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210469005.3A CN117014246A (en) 2022-04-29 2022-04-29 Control method of intelligent equipment, storage medium and electronic device
CN202210469005.3 2022-04-29

Publications (1)

Publication Number Publication Date
WO2023206686A1 (en) 2023-11-02

Family

ID=88517100

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/095335 WO2023206686A1 (en) 2022-04-29 2022-05-26 Control method for smart device, and storage medium and electronic apparatus

Country Status (2)

Country Link
CN (1) CN117014246A (en)
WO (1) WO2023206686A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040037436A1 (en) * 2002-08-26 2004-02-26 Yong Rui System and process for locating a speaker using 360 degree sound source localization
CN107271963A (en) * 2017-06-22 2017-10-20 广东美的制冷设备有限公司 The method and apparatus and air conditioner of auditory localization
CN108962263A (en) * 2018-06-04 2018-12-07 百度在线网络技术(北京)有限公司 A kind of smart machine control method and system
CN112735459A (en) * 2019-10-28 2021-04-30 清华大学 Voice signal enhancement method, server and system based on distributed microphones
CN113889138A (en) * 2021-06-07 2022-01-04 成都启英泰伦科技有限公司 Target voice extraction method based on double-microphone array

Also Published As

Publication number Publication date
CN117014246A (en) 2023-11-07

Similar Documents

Publication Publication Date Title
CN110211580B (en) Multi-intelligent-device response method, device, system and storage medium
US20240096348A1 (en) Linear Filtering for Noise-Suppressed Speech Detection
EP3398298B1 (en) Voice-controlled light switches
WO2023273747A1 (en) Wake-up method and apparatus for smart device, storage medium, and electronic device
US20200105295A1 (en) Linear Filtering for Noise-Suppressed Speech Detection Via Multiple Network Microphone Devices
WO2023246224A1 (en) Method and apparatus for determining orientation of sound source, storage medium, and electronic apparatus
CN110261816B (en) Method and device for estimating direction of arrival of voice
US10679617B2 (en) Voice enhancement in audio signals through modified generalized eigenvalue beamformer
CN110249637B (en) Audio capture apparatus and method using beamforming
CN111445919B (en) Speech enhancement method, system, electronic device, and medium incorporating AI model
CN109087663A (en) signal processor
CN110554357A (en) Sound source positioning method and device
WO2017160294A1 (en) Spectral estimation of room acoustic parameters
CN105210389A (en) Method and apparatus for determining a position of a microphone
US11749294B2 (en) Directional speech separation
US10871543B2 (en) Direction of arrival estimation of acoustic-signals from acoustic source using sub-array selection
WO2023206686A1 (en) Control method for smart device, and storage medium and electronic apparatus
CN110782884A (en) Far-field pickup noise processing method, device, equipment and storage medium
CN110610706A (en) Sound signal acquisition method and device, electrical equipment control method and electrical equipment
CN113393853B (en) Method and apparatus for processing mixed sound signal, storage medium, and electronic apparatus
Müller et al. Head-orientation-based device selection: Are you talking to me?
WO2023246223A1 (en) Speech enhancement method and apparatus for distributed wake-up, and storage medium
CN115171703B (en) Distributed voice awakening method and device, storage medium and electronic device
CN112653979A (en) Adaptive dereverberation method and device
CN117789744B (en) Voice noise reduction method and device based on model fusion and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22939519

Country of ref document: EP

Kind code of ref document: A1