CN112309412A

CN112309412A - Method and device for processing signal to be processed and signal processing system

Info

Publication number: CN112309412A
Application number: CN202010120632.7A
Authority: CN
Inventors: Not disclosed
Original assignee: Beijing ByteDance Network Technology Co Ltd
Current assignee: Beijing ByteDance Network Technology Co Ltd
Priority date: 2020-02-26
Filing date: 2020-02-26
Publication date: 2021-02-02

Abstract

Embodiments of the present disclosure provide a method, an apparatus, and a signal processing system for processing a signal to be processed. One embodiment of the method includes: acquiring a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold; collecting an interference signal of the interference audio through a target microphone; and eliminating the interference signal from the signal to be processed to obtain a target signal of the target audio. In this implementation, the target microphone collects the interference signal so that the target signal can be obtained, which reduces the difficulty of voice interaction and improves its accuracy and speed.

Description

Method and device for processing signal to be processed and signal processing system

Technical Field

The embodiment of the disclosure relates to the technical field of computers, in particular to a method and a device for processing a signal to be processed and a signal processing system.

Background

At present, intelligent hardware is increasingly popular, and many such devices have a voice interaction function.

For example, when a smart speaker is in use, the prior art usually adopts an echo cancellation algorithm to cancel the noise generated by the loudspeaker and the echo of music, so that the microphone picks up more of the user's voice commands without interference from the loudspeaker. However, to improve the sound effect, a smart speaker often adds a guide tube, a passive radiating membrane, a low-frequency sound unit, and the like to the traditional speaker design, and these additions introduce further noise.

In addition, the intelligent voice wake-up technology is widely used in products such as intelligent projection, but noise generated by a projection fan is mixed with loudspeaker sound and is often difficult to eliminate.

Disclosure of Invention

The present disclosure proposes a method, an apparatus and a signal processing system for processing a signal to be processed.

In a first aspect, an embodiment of the present disclosure provides a method for processing a signal to be processed, the method including: acquiring a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold; collecting an interference signal of the interference audio through a target microphone; and eliminating the interference signal from the signal to be processed to obtain a target signal of the target audio.

In some embodiments, the target audio comprises user speech audio, and the method further comprises: performing, based on the user speech audio, an operation corresponding to the user speech audio.

In some embodiments, performing the operation corresponding to the user speech audio includes at least one of: in response to the user speech audio containing audio of a preset wake-up word, performing a preset wake-up operation; and generating a reply audio for the user speech audio.

In some embodiments, eliminating the interference signal from the signal to be processed comprises: eliminating the interference signal from the signal to be processed by using an echo cancellation algorithm.

In some embodiments, the signal waves of the interfering signal are low frequency harmonics.

In a second aspect, an embodiment of the present disclosure provides an apparatus for processing a signal to be processed, the apparatus including: an acquisition unit configured to acquire a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold; a collection unit configured to collect an interference signal of the interference audio through a target microphone; and an elimination unit configured to eliminate the interference signal from the signal to be processed so as to obtain a target signal of the target audio.

In some embodiments, the target audio comprises user speech audio, and the apparatus further comprises: an execution unit configured to execute an operation corresponding to the user speech audio based on the user speech audio.

In some embodiments, the execution unit comprises at least one of: an execution subunit configured to execute a preset wake-up operation in response to the user speech audio containing audio of a preset wake-up word; and a generating subunit configured to generate a reply audio for the user speech audio.

In some embodiments, the elimination unit comprises: a cancellation subunit configured to employ an echo cancellation algorithm to cancel the interference signal from the signal to be processed.

In a third aspect, embodiments of the present disclosure provide a signal processing system comprising an interfering audio generating module, a microphone, and a signal processing module, wherein: during operation of the signal processing system, the interfering audio generating module generates interference audio of a target audio; the microphone is used for collecting an interference signal of the interference audio; and the signal processing module is used for acquiring a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the position of the interfering audio generating module is smaller than a first preset distance threshold, and for eliminating the interference signal from the signal to be processed to obtain a target signal of the target audio.

In some embodiments, the distance between the interfering audio generating module and the microphone is less than a second preset distance threshold.

In some embodiments, the signal processing system includes an enclosed space structure, and the interfering audio generating module and the microphone are disposed within the enclosed space structure.

In some embodiments, the signal processing system is a projector and the interfering audio generation module is a fan for dissipating heat from the projector.

In some embodiments, the signal processing system is an acoustic enclosure and the interfering audio generating module comprises at least one of a passive radiating membrane, a guide tube, and a low frequency sound emitting unit.

In a fourth aspect, an embodiment of the present disclosure provides an electronic device, including: one or more processors; a storage device, on which one or more programs are stored, which, when executed by the one or more processors, cause the one or more processors to implement the method of any of the embodiments of the method for processing a signal to be processed as described above.

In a fifth aspect, embodiments of the present disclosure provide a computer-readable medium, on which a computer program is stored, which when executed by a processor implements the method of any of the embodiments of the method for processing a signal to be processed as described above.

In the method provided by the embodiments of the present disclosure, a signal to be processed is acquired, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold; an interference signal of the interference audio is collected through a target microphone; and the interference signal is then eliminated from the signal to be processed to obtain a target signal of the target audio. Because the interference signal is collected through the target microphone and the target signal is finally obtained, the difficulty of voice interaction is reduced and its accuracy and speed are improved.

The signal processing system provided by the embodiments of the present disclosure includes an interfering audio generating module, a microphone, and a signal processing module. During operation of the signal processing system, the interfering audio generating module generates interference audio of a target audio, and the microphone collects an interference signal of the interference audio. The signal processing module acquires a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the position of the interfering audio generating module is smaller than a first preset distance threshold, and eliminates the interference signal from the signal to be processed to obtain the target signal of the target audio. Because the interference signal is collected through the microphone and the target signal is finally obtained, the difficulty of voice interaction is reduced and its accuracy and speed are improved.

Drawings

Other features, objects and advantages of the disclosure will become more apparent upon reading of the following detailed description of non-limiting embodiments thereof, made with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which one embodiment of the present disclosure may be applied;

FIG. 2 is a flow diagram of one embodiment of a method for processing a signal to be processed according to the present disclosure;

FIGS. 3A-3C are schematic diagrams of an application scenario of a method for processing a signal to be processed according to the present disclosure;

FIG. 4 is a flow diagram of yet another embodiment of a method for processing a signal to be processed according to the present disclosure;

FIG. 5 is a schematic block diagram of one embodiment of an apparatus for processing a signal to be processed according to the present disclosure;

FIG. 6 is a schematic block diagram of a first embodiment of a signal processing system according to the present disclosure;

FIG. 7 is a schematic block diagram of a second embodiment of a signal processing system according to the present disclosure;

FIG. 8 is a schematic block diagram of a third embodiment of a signal processing system according to the present disclosure;

FIG. 9 is a schematic block diagram of a fourth embodiment of a signal processing system according to the present disclosure;

FIG. 10 is a schematic block diagram of a computer system suitable for use with an electronic device to implement embodiments of the present disclosure.

Detailed Description

The present disclosure is described in further detail below with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the relevant invention and not restrictive of the invention. It should be noted that, for convenience of description, only the portions related to the related invention are shown in the drawings.

It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

Fig. 1 shows an exemplary system architecture 100 of an embodiment of a method for processing a signal to be processed, an apparatus for processing a signal to be processed or a signal processing system to which embodiments of the present disclosure may be applied.

As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.

A user may use the terminal devices 101, 102, 103 to interact with a server 105 over a network 104 to receive or transmit data (e.g., signals to be processed), etc. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as voice interaction type software, video playing software, news information type application, image processing type application, web browser application, shopping type application, search type application, instant messaging tool, mailbox client, social platform software, and the like.

The terminal devices 101, 102, and 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with audio interaction functions, including but not limited to smart speakers, smart projectors, smart refrigerators, smart phones, and so on. When the terminal devices 101, 102, 103 are software, they can be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services) or as a single piece of software or software module. This is not particularly limited herein.

The server 105 may be a server that provides various services, such as a background server that processes signals to be processed acquired by the terminal devices 101, 102, 103. The background server can separate a target signal of the target audio from the signal to be processed. Optionally, the server 105 may also feed back the processing result (e.g., the target signal) to the terminal device. As an example, the server 105 may be a cloud server.

The server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster formed by multiple servers, or may be implemented as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (e.g., software or software modules used to provide distributed services), or as a single piece of software or software module. This is not particularly limited herein.

It should be further noted that the method for processing the signal to be processed provided by the embodiment of the present disclosure may be executed by a server, may also be executed by a terminal device, and may also be executed by the server and the terminal device in cooperation with each other. Accordingly, the various parts (e.g., the various units, sub-units, modules, and sub-modules) included in the apparatus for processing a signal to be processed may be all disposed in the server, may be all disposed in the terminal device, and may also be disposed in the server and the terminal device, respectively.

It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. The system architecture may only include the electronic device on which the method for processing a signal to be processed is run, when the electronic device on which the method for processing a signal to be processed is run does not need to perform data transmission with other electronic devices.

With continued reference to fig. 2, a flow 200 of one embodiment of a method for processing a signal to be processed according to the present disclosure is shown. The method for processing the signal to be processed comprises the following steps:

step 201, acquiring a signal to be processed.

In the present embodiment, an execution subject (e.g., the terminal device shown in fig. 1) of the method for processing a signal to be processed may acquire the signal to be processed. The distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold.

Here, the signal to be processed may be an electrical signal corresponding to an audio signal to be processed.

In practice, in the process of acquiring an audio signal, it is difficult to avoid acquiring a signal to be processed mixed with an interference signal (e.g., a noise signal). For example, in a case where the execution main body is an intelligent refrigerator having a voice interaction function, since a compressor of the intelligent refrigerator generates sound during operation, the signal to be processed collected by the intelligent refrigerator is often mixed with the interference signal.

The first preset distance threshold may be a distance within which the acquisition device located at the acquisition position of the signal to be processed can pick up the interference audio. Because the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than the first preset distance threshold, the signal acquired in the process of acquiring the audio signal is mixed with the interference signal.
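The distance condition above can be illustrated with a minimal sketch. The threshold value, the coordinate convention (metres in a device-local frame), and the function name are assumptions made for illustration; the disclosure does not specify them.

```python
import math

# Hypothetical value: the disclosure does not give a number for the
# first preset distance threshold.
FIRST_PRESET_DISTANCE_THRESHOLD_M = 0.5


def within_first_threshold(acquisition_pos, generation_pos,
                           threshold_m=FIRST_PRESET_DISTANCE_THRESHOLD_M):
    """Return True if the acquisition position is closer to the interference
    source than the threshold, so the acquired signal is expected to be
    mixed with the interference signal."""
    return math.dist(acquisition_pos, generation_pos) < threshold_m


# Example positions (assumed): a pickup point and a fan inside one housing.
pickup_position = (0.00, 0.00, 0.10)
fan_position = (0.05, 0.00, 0.02)
print(within_first_threshold(pickup_position, fan_position))
```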

Step 202, collecting an interference signal of the interference audio through a target microphone.

In this embodiment, the execution body may collect an interference signal of the interference audio through a target microphone. The target microphone may be integrated with the execution body, or may be provided independently of the execution body.

Here, the target microphone may be used to convert a sound signal into an electrical signal. The frequency of the signal wave of the interference signal may be arbitrary. For example, it may be a low frequency (e.g., 200 Hz or less), an intermediate frequency (e.g., 200 Hz to 6000 Hz), or a high frequency (e.g., 6000 Hz or more).
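As a small illustration of the example frequency ranges above, the following sketch maps a frequency to a band name. The function name is hypothetical, and because the stated ranges share the boundary values 200 Hz and 6000 Hz, assigning 200 Hz to the low band and 6000 Hz to the high band is an assumption.

```python
def classify_band(frequency_hz: float) -> str:
    """Map a frequency to the example bands given in the text.

    Boundary handling is an assumption: 200 Hz counts as low and
    6000 Hz counts as high.
    """
    if frequency_hz < 0:
        raise ValueError("frequency must be non-negative")
    if frequency_hz <= 200:
        return "low"
    if frequency_hz < 6000:
        return "intermediate"
    return "high"


for f in (50, 200, 1000, 6000, 12000):
    print(f, "Hz ->", classify_band(f))
```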

In some optional implementations of this embodiment, when the target microphone is integrated with the execution body, the execution body may include a closed space structure, and the interference audio generation module and the target microphone may be disposed in the closed space structure. The interference audio generation module generates the interference audio during operation. As an example, in the case where the execution body is an intelligent refrigerator, the interference audio generation module may be a compressor provided in the intelligent refrigerator; in the case where the execution body is an intelligent projector, the interference audio generation module may be a fan disposed in the intelligent projector; in the case where the execution body is a smart speaker, the interference audio generation module may be a passive radiating membrane, a guide tube, or a low-frequency sound unit disposed in the smart speaker.

It can be understood that, in the above optional implementation manner, the microphone is arranged in the closed space structure, so that the interference signal obtained by the microphone is stronger and more accurate, thereby facilitating subsequent obtaining of the target signal and facilitating improvement of accuracy and speed of voice interaction.

In some optional implementations of this embodiment, the signal wave of the interference signal is a low frequency harmonic.

In this embodiment, the interference signal may be an electrical signal corresponding to the interference audio.

Here, it should be noted that the execution order of the step 201 and the step 202 is not limited in this embodiment. For example, the executing agent may execute step 201 before step 202, or may execute step 202 before step 201.

Step 203, eliminating the interference signal from the signal to be processed to obtain the target signal of the target audio.

In this embodiment, the executing entity may eliminate the interference signal collected in step 202 from the signal to be processed acquired in step 201, so as to obtain the target signal of the target audio. The target audio may be an audio signal, and the target signal may be an electrical signal corresponding to the target audio.

As an example, when the signal to be processed is mixed with a voice signal and an interference signal (for example, an echo signal of the voice signal), the target signal may be a voice signal obtained after the interference signal is removed from the signal to be processed mixed with the voice signal and the interference signal.

In some optional implementations of this embodiment, the execution body may employ an acoustic echo cancellation (AEC) algorithm to cancel the interference signal from the signal to be processed.

The echo cancellation algorithm relies on the correlation between the loudspeaker signal and the multipath echo it produces: it establishes a speech model of the far-end signal, uses the model to estimate the echo, and continuously updates the filter coefficients so that the estimate approaches the real echo. The echo estimate is then subtracted from the microphone input signal to cancel the echo.
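The estimate-and-subtract loop described above can be sketched with a normalized least-mean-squares (NLMS) adaptive filter. NLMS is a common choice for echo cancellers but is swapped in here as an assumption; the disclosure does not name a specific update rule. The function name, tap count, step size, and the synthetic signals are all illustrative. The signal collected by the target microphone plays the role of the reference (far-end) signal.

```python
import math
import random


def nlms_cancel(to_process, reference, taps=8, mu=0.5, eps=1e-8):
    """Return the residual after removing the part of `to_process` that an
    adaptive FIR filter can predict from `reference` (NLMS update)."""
    weights = [0.0] * taps
    history = [0.0] * taps            # latest reference samples, newest first
    residual = []
    for d, x in zip(to_process, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate          # input minus the interference estimate
        power = sum(h * h for h in history) + eps
        # Move the coefficients so the estimate tracks the real interference.
        weights = [w + mu * error * h / power
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual


# Synthetic demo; every value below is an illustrative assumption.
rng = random.Random(0)
n = 4000
reference = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # target-microphone signal
interference = [0.6 * reference[i] + (0.3 * reference[i - 1] if i else 0.0)
                for i in range(n)]                        # as heard at the pickup point
target = [0.1 * math.sin(2 * math.pi * 440 * i / 16000) for i in range(n)]
to_process = [t + v for t, v in zip(target, interference)]

cleaned = nlms_cancel(to_process, reference)


def tail_energy(signal, count=1000):
    return sum(v * v for v in signal[-count:])


print("interference energy before:", tail_energy(interference))
print("residual interference after:",
      tail_energy([c - t for c, t in zip(cleaned, target)]))
```

In this sketch the residual returned by the filter is the estimate of the target signal. A larger `mu` adapts faster but leaves more residual error while the target audio is active.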

It will be appreciated that the alternative implementations described above may employ echo cancellation algorithms to improve the cancellation of the interfering signal from the signal to be processed.

Optionally, the executing body may further input the signal to be processed and the interference signal to a pre-trained signal extraction model to obtain a target signal of the target audio. The signal extraction model may be a convolutional neural network obtained by training based on a predetermined training sample set by using a machine learning algorithm. The training samples in the training sample set may include a signal to be processed, an interference signal, and a target signal of a target audio.
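The composition of a training sample described above can be sketched as a simple record. The class and function names are hypothetical, and the additive mixing used to synthesize the signal to be processed is an assumption; the disclosure only states which three signals a sample may include.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TrainingSample:
    to_process: List[float]    # model input: signal to be processed
    interference: List[float]  # model input: interference signal
    target: List[float]        # label: target signal of the target audio


def make_sample(target: Sequence[float],
                interference: Sequence[float]) -> TrainingSample:
    """Build one sample by adding target and interference sample by sample
    (assumed mixing model, used here only to synthesize data)."""
    if len(target) != len(interference):
        raise ValueError("target and interference must have the same length")
    mixed = [t + i for t, i in zip(target, interference)]
    return TrainingSample(to_process=mixed,
                          interference=list(interference),
                          target=list(target))


sample = make_sample([0.5, -0.25, 0.0], [0.25, 0.25, -0.5])
print(sample)
```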

With continuing reference to fig. 3A-3C, fig. 3A-3C are schematic diagrams of an application scenario of the method for processing a signal to be processed according to the present embodiment. In fig. 3A, smart speaker 301 first obtains a signal to be processed 302, wherein the distance between the acquisition position of the signal to be processed 302 and the generation position of the interference audio is smaller than a first preset distance threshold. Then, the smart speaker 301 collects an interference signal 303 of the interference audio through a target microphone (e.g., a microphone provided in the smart speaker). Finally, the smart speaker 301 eliminates the interference signal 303 from the signal to be processed 302 to obtain a target signal 304 of the target audio. As an example, refer to fig. 3B and 3C: smart speaker 301 removes the interference signal from the signal to be processed shown in fig. 3B, thereby obtaining the target signal of the target audio shown in fig. 3C. The clean signal in the target signal is indicated by reference numeral 304 in fig. 3B and 3C.

In the prior art, the usual approach is to reduce the generation of the interference signal, or to simulate the sound of the interference audio corresponding to the interference signal, so as to obtain a target signal containing as little interference as possible. For example, in a smart speaker, to reduce the low-frequency distortion and interference caused by passive radiating membranes, guide tubes, and the like, the bass effect is generally reduced during tuning, thereby reducing the harmonic distortion caused by bass. As another example, in a smart projector, to remove the buzzing noise of the fan, the sound of the fan is simulated at the algorithm end, some sound components are intercepted directly at the bottom layer, the noise is then eliminated, and the clean voice command is extracted.

According to the method provided by the above embodiment of the disclosure, the signal to be processed is acquired, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than the first preset distance threshold, the interference signal of the interference audio is acquired through the target microphone, and then the interference signal is eliminated from the signal to be processed to obtain the target signal of the target audio, so that the interference signal can be acquired through the target microphone, and the target signal is finally obtained, thereby reducing the difficulty of voice interaction and improving the accuracy and speed of voice interaction.

In some optional implementations of this embodiment, the target audio may include user speech audio, and the execution body may further execute, based on the user speech audio, an operation corresponding to the user speech audio.

As an example, the execution body may execute an operation instructed by the user voice audio, or may execute an operation of establishing an association relationship with the user voice audio.

For example, the execution body may input the user speech audio to a pre-trained operation model to obtain an operation instruction, and then execute the corresponding operation as indicated by the obtained operation instruction. The operation model represents the correspondence between user speech audio and operation instructions. The operation model may be a two-dimensional table or database in which the text indicated by user speech audio and the operation instructions are stored in association, or may be a convolutional neural network obtained by training on a training sample set with a machine learning algorithm. The training samples in the training sample set may include the text indicated by user speech audio and the corresponding operation instructions.
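The table form of the operation model can be sketched as a plain lookup from recognized text to an operation instruction. The table entries, instruction names, and the assumption that the user speech audio has already been transcribed to text are all hypothetical.

```python
from typing import Optional

# Hypothetical association table: text indicated by the user speech audio
# mapped to an operation instruction.
OPERATION_TABLE = {
    "turn up the volume": "VOLUME_UP",
    "pause the music": "PLAYBACK_PAUSE",
}


def lookup_operation(recognized_text: str) -> Optional[str]:
    """Return the operation instruction stored for the text, or None."""
    key = " ".join(recognized_text.lower().split())  # normalize case and spacing
    return OPERATION_TABLE.get(key)


print(lookup_operation("Pause the music"))
print(lookup_operation("open the window"))
```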

It can be understood that the above alternative implementation manner may perform an operation corresponding to the user voice audio, and since the interfering audio in the user voice audio is less, the recognition speed of the user voice audio may be increased, and thus the speed and accuracy of operation execution may be increased.

In some application scenarios of the implementation manner, the execution main body may execute the preset wake-up operation when the voice audio of the user includes an audio of a preset wake-up word.

It can be understood that, because the user voice audio has less interfering audio, in the application scenario, the speed and accuracy of waking up the execution subject can be improved.

In some application scenarios of the foregoing implementation, the execution subject may also generate a reply audio of the user voice audio.

It can be understood that, because the interfering audio in the user speech audio is less, in the application scenario described above, the speed and accuracy of generating the reply audio of the user speech audio can be improved.
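The two application scenarios above (wake-up and reply generation) can be sketched as a small dispatcher. This is only an illustration under assumptions: the wake-up word, the function names, the placeholder reply generator, and the premise that the cleaned user speech audio has already been converted to text are not taken from the disclosure.

```python
from typing import Callable, Optional, Tuple

WAKE_WORD = "hello speaker"  # hypothetical preset wake-up word


def handle_user_speech(
    recognized_text: str,
    make_reply: Callable[[str], str] = lambda text: f"You said: {text}",
) -> Tuple[str, Optional[str]]:
    """Return ("wake_up", None) when the text contains the wake-up word,
    otherwise ("reply", <reply text>) from the placeholder generator."""
    normalized = " ".join(recognized_text.lower().split())
    if WAKE_WORD in normalized:
        return ("wake_up", None)
    return ("reply", make_reply(normalized))


print(handle_user_speech("Hello speaker, are you there"))
print(handle_user_speech("What time is it"))
```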

With further reference to fig. 4, a flow 400 of yet another embodiment of a method for processing a signal to be processed is shown. The flow 400 of the method for processing a signal to be processed comprises the steps of:

step 401, acquiring a signal to be processed.

In this embodiment, step 401 is substantially the same as step 201 in the corresponding embodiment of fig. 2, and is not described here again.

Step 402, collecting an interference signal of the interference audio through a target microphone.

In this embodiment, step 402 is substantially the same as step 202 in the corresponding embodiment of fig. 2, and is not described herein again.

In step 403, an echo cancellation algorithm is used to cancel the interference signal from the signal to be processed.

In this embodiment, the executing entity may adopt an echo cancellation algorithm to cancel the interference signal from the signal to be processed.

It should be noted that, besides the above-mentioned contents, the embodiment of the present application may further include the same or similar features and effects as the embodiment corresponding to fig. 2, and details are not repeated herein.

As can be seen from fig. 4, the process 400 of the method for processing a signal to be processed in this embodiment may employ an echo cancellation algorithm to improve the effect of eliminating an interference signal from the signal to be processed.

With further reference to fig. 5, as an implementation of the method shown in the above figures, the present disclosure provides an embodiment of an apparatus for processing a signal to be processed, which corresponds to the embodiment of the method shown in fig. 2, and which may include the same or corresponding features as the embodiment of the method shown in fig. 2, in addition to the features described below, and which produces the same or corresponding effects as the embodiment of the method shown in fig. 2. The device can be applied to various electronic equipment.

As shown in fig. 5, the apparatus 500 for processing a signal to be processed of the present embodiment includes: an acquisition unit 501, a collection unit 502, and an elimination unit 503. The acquisition unit 501 is configured to acquire a signal to be processed, where the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold; the collection unit 502 is configured to collect an interference signal of the interference audio through a target microphone; the elimination unit 503 is configured to eliminate the interference signal from the signal to be processed to obtain a target signal of the target audio.

In this embodiment, the acquisition unit 501 of the apparatus 500 for processing a signal to be processed may acquire the signal to be processed. The distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold.

In this embodiment, the collection unit 502 may collect an interference signal of the interference audio through a target microphone.

In this embodiment, the elimination unit 503 may eliminate the interference signal collected by the collection unit 502 from the signal to be processed acquired by the acquisition unit 501, so as to obtain the target signal of the target audio. The target audio may be an audio signal, and the target signal may be an electrical signal corresponding to the target audio.

In some optional implementations of this embodiment, the target audio includes user speech audio, and the apparatus 500 further comprises: an execution unit (not shown in the figure) configured to execute an operation corresponding to the user speech audio based on the user speech audio.

In some optional implementations of this embodiment, the execution unit includes at least one of: an execution subunit (not shown in the figure) configured to execute a preset wake-up operation in response to the user voice audio containing the audio of the preset wake-up word; a generating subunit (not shown in the figure) configured to generate a reply audio of the user voice audio.

In some optional implementations of this embodiment, the eliminating unit 503 includes: a cancellation subunit (not shown in the figure) configured to cancel the interference signal from the signal to be processed using an echo cancellation algorithm.

In the apparatus provided by the above embodiment of the present disclosure, the acquisition unit 501 acquires the signal to be processed, where the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold; the collection unit 502 collects the interference signal of the interference audio through the target microphone; and the elimination unit 503 eliminates the interference signal from the signal to be processed to obtain the target signal of the target audio. Because the interference signal is collected through the target microphone and then removed to obtain the target signal, the difficulty of voice interaction is reduced and the accuracy and speed of voice interaction are improved.
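
The cooperation of units 501, 502 and 503 can be sketched in software as below. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the class `SignalProcessingApparatus` and the default elimination step `scalar_gain_cancel` are hypothetical, and the default step assumes that the interference leaks into the signal to be processed with a single gain and no delay.

```python
from typing import Callable, List, Sequence

Signal = List[float]


def scalar_gain_cancel(to_process: Sequence[float], interference: Sequence[float]) -> Signal:
    """Placeholder elimination step: subtract the least-squares scaled interference.

    Simplifying assumption of this sketch: one leakage gain, no delay.
    """
    energy = sum(x * x for x in interference)
    if energy == 0.0:
        return list(to_process)  # silent reference, nothing to remove
    gain = sum(d * x for d, x in zip(to_process, interference)) / energy
    return [d - gain * x for d, x in zip(to_process, interference)]


class SignalProcessingApparatus:
    """Hypothetical software counterpart of apparatus 500."""

    def __init__(
        self,
        acquire: Callable[[], Signal],  # role of unit 501: signal to be processed
        collect: Callable[[], Signal],  # role of unit 502: target microphone signal
        eliminate: Callable[[Sequence[float], Sequence[float]], Signal] = scalar_gain_cancel,
    ) -> None:
        self.acquire = acquire
        self.collect = collect
        self.eliminate = eliminate  # role of unit 503

    def process(self) -> Signal:
        to_process = self.acquire()
        interference = self.collect()
        return self.eliminate(to_process, interference)
```

Passing the elimination step as a callable keeps the sketch neutral about which cancellation algorithm is used; any function taking the two signals and returning the target signal could be substituted.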

Next, refer to fig. 6, which is a schematic structural diagram of a first embodiment of a signal processing system according to the present disclosure.

As shown in fig. 6, the signal processing system includes an interference audio generating module 601, a microphone 602, and a signal processing module 603. During the operation of the signal processing system, the interference audio generating module 601 may generate an interference audio of the target audio. The microphone 602 may be used to collect an interference signal of the interference audio. The signal processing module 603 may be configured to obtain a signal to be processed, and eliminate the interference signal from the signal to be processed to obtain a target signal of the target audio. The distance between the acquisition position 604 of the signal to be processed and the position of the interference audio generating module 601 is smaller than a first preset distance threshold.

In practice, in the process of acquiring an audio signal, it is difficult to avoid acquiring a signal to be processed mixed with an interference signal (e.g., a noise signal). For example, when the signal processing system is an intelligent refrigerator with a voice interaction function, the compressor of the intelligent refrigerator generates sound during operation, and therefore, the signal to be processed collected by the intelligent refrigerator is often mixed with the interference signal.

In this embodiment, the interference signal may be an electrical signal corresponding to the interference audio. The target audio may be an audio signal, and the target signal may be an electrical signal corresponding to the target audio.

In some alternative implementations of the present disclosure, the signal processing system may employ an echo cancellation algorithm to cancel the interference signal from the signal to be processed.
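
The disclosure does not name a specific echo cancellation algorithm. As one possible illustration, the sketch below uses a normalized least-mean-squares (NLMS) adaptive filter, a technique commonly used for echo cancellation, with the microphone's interference signal as the reference input. The function name `nlms_cancel` and the parameters `taps`, `mu` and `eps` are assumptions of this sketch and do not come from the disclosure.

```python
from typing import List, Sequence


def nlms_cancel(
    to_process: Sequence[float],
    interference: Sequence[float],
    taps: int = 8,      # assumed filter length
    mu: float = 0.5,    # assumed step size, must lie in (0, 2)
    eps: float = 1e-8,  # small constant that avoids division by zero
) -> List[float]:
    """Return the residual after adaptively removing the interference.

    The filter learns how the reference (interference signal) appears in the
    signal to be processed and subtracts that estimate sample by sample.
    """
    if taps < 1:
        raise ValueError("taps must be at least 1")
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    output: List[float] = []
    for d, x in zip(to_process, interference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # residual, i.e. the estimated target sample
        norm = eps + sum(h * h for h in history)
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        output.append(error)
    return output
```

The returned residual plays the role of the target signal. In this sketch the filter adapts continuously; practical echo cancellers often pause adaptation while the user is speaking, which is omitted here.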

Optionally, the signal processing system may further input the signal to be processed and the interference signal to a pre-trained signal extraction model to obtain a target signal of the target audio. The signal extraction model may be a convolutional neural network obtained by training based on a predetermined training sample set by using a machine learning algorithm. The training samples in the training sample set may include a signal to be processed, an interference signal, and a target signal of a target audio.
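
The structure of one training sample described above can be sketched as follows. The disclosure does not state how training samples are produced; the additive mixing with a single `leak_gain` used here is an assumption made purely for illustration, and the names `TrainingSample` and `make_sample` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class TrainingSample:
    to_process: List[float]    # model input: signal to be processed
    interference: List[float]  # model input: interference signal from the microphone
    target: List[float]        # expected output: target signal of the target audio


def make_sample(
    target: Sequence[float],
    interference: Sequence[float],
    leak_gain: float = 1.0,  # assumed strength of the interference in the mixture
) -> TrainingSample:
    """Build one sample by additively mixing a known target with interference."""
    if len(target) != len(interference):
        raise ValueError("target and interference must have the same length")
    mixed = [s + leak_gain * x for s, x in zip(target, interference)]
    return TrainingSample(mixed, list(interference), list(target))
```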

In some optional implementations of the present embodiment, a distance between the interference audio generating module 601 and the microphone 602 may be smaller than a second preset distance threshold.

It will be appreciated that in the above alternative implementation, in the case that the distance between the interfering audio generating module and the microphone is small (i.e. the distance is smaller than the second preset distance threshold), a purer target signal can be obtained compared with the prior art.
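
The two distance conditions can be checked with a short geometric sketch. The function name `placement_ok`, the use of three-dimensional coordinates in metres, and the threshold values in the usage example are all hypothetical; the disclosure only requires that the distances be smaller than preset thresholds.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]  # assumed coordinates in metres


def placement_ok(
    acquisition_pos: Point,
    interference_pos: Point,
    mic_pos: Point,
    first_threshold: float,
    second_threshold: float,
) -> bool:
    """Check both distance conditions against their preset thresholds."""
    # Acquisition position of the signal to be processed vs. interference source.
    near_source = math.dist(acquisition_pos, interference_pos) < first_threshold
    # Microphone vs. interference source (the optional second condition).
    mic_near_source = math.dist(mic_pos, interference_pos) < second_threshold
    return near_source and mic_near_source
```

For example, `placement_ok((0, 0, 0), (0.03, 0.04, 0), (0.03, 0.04, 0.01), 0.1, 0.02)` evaluates both conditions for an acquisition position about 5 cm from the interference source and a microphone about 1 cm from it.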

In some optional implementations of this embodiment, the signal processing system further includes a closed space structure, and the interference audio generating module and the microphone are disposed in the closed space structure.

As an example, please refer to fig. 7. Fig. 7 is a schematic block diagram of a second embodiment of a signal processing system according to the present disclosure.

In fig. 7, the signal processing system includes an interference audio generating module 701, a microphone 702, and a signal processing module 703. During the operation of the signal processing system, the interference audio generating module 701 may generate an interference audio of the target audio. The microphone 702 may be used to collect an interference signal of the interference audio. The signal processing module 703 may be configured to obtain a signal to be processed, and eliminate the interference signal from the signal to be processed to obtain a target signal of the target audio. The distance between the acquisition position 704 of the signal to be processed and the position of the interference audio generating module 701 is smaller than a first preset distance threshold. The signal processing system further comprises a closed space structure 705, and the interference audio generating module 701 and the microphone 702 are arranged in the closed space structure 705.

It can be understood that, in the above optional implementation manner, the interference audio generating module and the microphone are arranged in the closed space structure, so that the interference signal obtained by the microphone is stronger and more accurate, thereby facilitating subsequent obtaining of the target signal and further improving the accuracy and speed of voice interaction.

In some optional implementations of this embodiment, the signal processing system is a projector, and the interference audio generating module is a fan configured to dissipate heat for the projector.

As an example, please refer to fig. 8. Fig. 8 is a schematic structural diagram of a third embodiment of a signal processing system according to the present disclosure.

In fig. 8, the projector includes a fan 801 for dissipating heat for the projector, a microphone 802, and a signal processing module 803. During operation of the projector, the fan 801 may generate an interference audio of the target audio. The microphone 802 may be used to collect an interference signal of the interference audio (i.e., a noise signal generated by the fan 801). The signal processing module 803 may be configured to obtain a signal to be processed, and eliminate the interference signal from the signal to be processed to obtain a target signal of the target audio. The distance between the acquisition position 804 of the signal to be processed and the position of the fan 801 is smaller than a first preset distance threshold.

Optionally, the projector further includes a closed space structure 805, and the fan 801 and the microphone 802 are disposed in the closed space structure 805.

It can be understood that, in the above optional implementation, the microphone collects the interference signal generated by the fan that dissipates heat for the projector, and the target signal is finally obtained. This preserves the heat dissipation effect of the projector while reducing the difficulty of voice interaction during use of the projector and improving the accuracy and speed of voice interaction.

In some optional implementations of this embodiment, the signal processing system is a sound box, and the interfering audio generating module includes at least one of a passive radiating membrane, a guide tube, and a low frequency sound emitting unit.

As an example, please refer to fig. 9. Fig. 9 is a schematic structural diagram of a fourth embodiment of a signal processing system according to the present disclosure.

In fig. 9, the sound box comprises a passive radiating membrane 901, a microphone 902 and a signal processing module 903. During operation of the sound box, the passive radiating membrane 901 may generate an interference audio of the target audio (e.g., an echo of the sound). The microphone 902 may be used to collect an interference signal of the interference audio. The signal processing module 903 may be configured to obtain a signal to be processed, and eliminate the interference signal from the signal to be processed to obtain a target signal of the target audio. The distance between the acquisition position 904 of the signal to be processed and the position of the passive radiating membrane 901 is smaller than a first preset distance threshold.

Optionally, the sound box further includes a closed space structure 905, and at least one of the passive radiating membrane, the guide tube and the low frequency sounding unit (for example, the passive radiating membrane 901) and the microphone 902 are disposed in the closed space structure 905.

It can be understood that, in the above optional implementation, the microphone collects the interference signal generated by at least one of the passive radiating membrane, the guide tube and the low frequency sound emitting unit in the sound box, and the target signal is finally obtained. This preserves the sound quality of the sound box while reducing the difficulty of voice interaction during use of the sound box and improving the accuracy and speed of voice interaction.

The signal processing system provided by the above embodiments of the present disclosure includes an interference audio generating module, a microphone, and a signal processing module, wherein: during the operation of the signal processing system, the interference audio generating module generates an interference audio of a target audio; the microphone is used for collecting an interference signal of the interference audio; and the signal processing module is used for acquiring a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the position of the interference audio generating module is smaller than a first preset distance threshold, and for eliminating the interference signal from the signal to be processed to obtain the target signal of the target audio. Therefore, the interference signal can be collected through the microphone, and the target signal is finally obtained, so that the difficulty of voice interaction is reduced, and the accuracy and the speed of the voice interaction are improved.

Referring now to FIG. 10, shown is a block diagram of a computer system 1000 suitable for use with the electronic device implementing embodiments of the present disclosure. The electronic device shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.

As shown in fig. 10, the computer system 1000 includes a Central Processing Unit (CPU) 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. In the RAM 1003, various programs and data necessary for the operation of the system 1000 are also stored. The CPU 1001, ROM 1002, and RAM 1003 are connected to each other via a bus 1004. An input/output (I/O) interface 1005 is also connected to bus 1004.

The following components are connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the Internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1010 as necessary, so that a computer program read from it can be installed into the storage section 1008 as needed.

In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 1009 and/or installed from the removable medium 1011. The above-described functions defined in the method of the present disclosure are performed when the computer program is executed by the Central Processing Unit (CPU) 1001.

It should be noted that the computer readable medium in the present disclosure may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Python, Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in the embodiments of the present disclosure may be implemented by software or hardware. The described units may also be provided in a processor, and may be described as: a processor includes an acquisition unit, a collection unit, and an elimination unit. The names of these units do not in some cases constitute a limitation on the unit itself, and for example, the acquisition unit may also be described as a "unit that acquires a signal to be processed".

As another aspect, the present disclosure also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold; collect an interference signal of the interference audio through a target microphone; and eliminate the interference signal from the signal to be processed to obtain a target signal of the target audio.

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is possible without departing from the inventive concept as defined above. For example, a technical solution formed by replacing the above features with (but not limited to) features having similar functions disclosed in the present disclosure is also encompassed.

Claims

1. A method for processing a signal to be processed, comprising:

acquiring a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold;

collecting an interference signal of the interference audio through a target microphone;

and eliminating the interference signal from the signal to be processed to obtain a target signal of a target audio.

2. The method of claim 1, wherein the target audio comprises user speech audio; and

the method further comprises the following steps:

executing an operation corresponding to the user speech audio based on the user speech audio.

3. The method of claim 2, wherein the performing, based on the user speech audio, an operation corresponding to the user speech audio comprises at least one of:

executing a preset wake-up operation in response to the user speech audio containing audio of a preset wake-up word;

generating a reply audio to the user speech audio.

4. The method according to one of claims 1 to 3, wherein said cancelling the interfering signal from the signal to be processed comprises:

eliminating the interference signal from the signal to be processed by using an echo cancellation algorithm.

5. The method according to one of claims 1 to 3, wherein the signal waves of the interference signal are low frequency harmonics.

6. An apparatus for processing a signal to be processed, comprising:

an acquisition unit configured to acquire a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the generation position of the interference audio is smaller than a first preset distance threshold;

a collection unit configured to collect an interference signal of the interference audio through a target microphone;

a cancellation unit configured to cancel the interference signal from the signal to be processed to obtain a target signal of a target audio.

7. A signal processing system comprising an interfering audio generation module, a microphone, and a signal processing module, wherein:

in the operation process of the signal processing system, the interference audio generating module generates interference audio of target audio;

the microphone is used for collecting an interference signal of the interference audio;

the signal processing module is used for acquiring a signal to be processed, wherein the distance between the acquisition position of the signal to be processed and the position of the interference audio generation module is smaller than a first preset distance threshold; and for eliminating the interference signal from the signal to be processed to obtain a target signal of the target audio.

8. The signal processing system of claim 7, wherein the signal wave of the interference signal is a low frequency harmonic.

9. The signal processing system of claim 7, wherein a distance between the interfering audio generating module and the microphone is less than a second preset distance threshold.

10. The signal processing system according to one of claims 7 to 9, wherein the signal processing system comprises a closed space structure, the interfering audio generating module and the microphone being arranged within the closed space structure.

11. The signal processing system according to one of claims 7 to 9, wherein the signal processing system is a projector and the interfering audio generating module is a fan for dissipating heat from the projector.

12. The signal processing system according to one of claims 7 to 9, wherein the signal processing system is a sound box and the interfering audio generating module comprises at least one of a passive radiating membrane, a guide tube and a low frequency sound emitting unit.

13. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon,

which, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.

14. A computer-readable medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the method of any one of claims 1-5.