CN112185353A

CN112185353A - Audio signal processing method and device, terminal and storage medium

Info

Publication number: CN112185353A
Application number: CN202010941863.4A
Authority: CN
Inventors: 孙云飞
Original assignee: Beijing Xiaomi Pinecone Electronic Co Ltd
Current assignee: Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date: 2020-09-09
Filing date: 2020-09-09
Publication date: 2021-01-05

Abstract

The disclosure relates to an audio signal processing method and device, a terminal and a storage medium. The method is applied to the terminal and includes: collecting an audio signal through a sound collection module; determining, according to sound parameters, whether the sound source position of the collected audio signal is within a preset range; and recognizing the audio signal when the sound source position is within the preset range. Sound whose source position lies within the preset range can thus be recognized, which realizes recognition of sound in a specific range and improves the user experience of the terminal.

Description

Audio signal processing method and device, terminal and storage medium

Technical Field

The present disclosure relates to the field of terminal technologies, and in particular, to a method and an apparatus for processing an audio signal, a terminal, and a storage medium.

Background

In the related art, when a terminal performs speech recognition and several people in different position ranges speak simultaneously, the terminal recognizes everything that is said, so different speakers cannot be distinguished. For example, in a conference where participants are gathered together, if two people in different position ranges speak at the same time, speech recognition recognizes what both of them say at once, which degrades the user experience.

Disclosure of Invention

The disclosure provides a processing method, a processing device, a terminal and a storage medium of an audio signal.

According to a first aspect of the embodiments of the present disclosure, there is provided an audio signal processing method applied to a terminal, including:

collecting audio signals through a sound collection module;

determining whether the sound source position of the collected audio signal is within a preset range or not according to the sound parameters;

and when the sound source position is within the preset range, identifying the audio signal.

Optionally, the determining, according to the sound parameter, whether the sound source position of the acquired audio signal is within a preset range includes:

determining sound parameter difference according to sound parameters of the same sound source collected by any two of the N sound collection modules, wherein N is a positive integer greater than or equal to 2;

if the sound parameter difference is within a preset difference range determined by any two sound acquisition modules, determining that the sound source position of the audio signal is within the preset range;

and if the sound parameter difference is not within the preset difference range determined by any two sound acquisition modules, determining that the sound source position of the audio signal is not within the preset range.

Optionally, the preset difference range determined by any two of the sound collection modules includes: a difference range formed by the sound parameter differences produced when the same sound emitted by sound sources at different positions within the preset range reaches the two sound collection modules.

Optionally, when N is a positive integer greater than 2,

determining whether the collected sound source position information of the audio signal is within the preset range according to the sound parameter difference, including:

determining whether the sound source position of the acquired audio signal is within the preset range according to the sound parameter differences of M groups of sound collection modules for the same sound of the same sound source; wherein M is a positive integer greater than or equal to 2 and less than or equal to C(N, 2), that is, N(N-1)/2, each group of sound collection modules includes two sound collection modules, and any two different groups differ in at least one sound collection module.

Optionally, the determining, according to the sound parameter difference of the M groups of sound collection modules for the same sound of the same sound source, whether the sound source position of the audio signal is within a preset range includes:

determining, for each of m sound parameter differences obtained by the M groups of sound collection modules for the same sound of the same sound source, whether that difference is within the preset difference range of the corresponding group, so as to obtain m results, wherein m is a positive integer less than or equal to M and greater than or equal to 1;

and determining whether the sound source position of the audio signal is within the preset range according to the m results.

Optionally, the determining, according to the sound parameter, whether a sound source position of the audio signal is within a preset range includes:

determining arrival time difference according to a first arrival time when the same sound of the same sound source arrives at the first sound acquisition module and a second arrival time when the same sound of the same sound source arrives at the second sound acquisition module;

judging whether the arrival time difference is within a preset time difference range corresponding to the first sound acquisition module and the second sound acquisition module;

if so, determining that the sound source position of the audio signal is within the preset range;

and if not, determining that the sound source position of the audio signal is not located in the preset range.

Optionally, the method further comprises:

determining two boundary points of the preset range according to the preset range;

determining first time information when the same sound emitted by a first sound source positioned at a first boundary point reaches a first sound acquisition module; and second time information reaching the second sound collection module; determining a first time difference according to the first time information and the second time information;

determining third time information of the same sound emitted by a second sound source at a second boundary point reaching the first sound acquisition module and fourth time information of the same sound reaching the second sound acquisition module; determining a second time difference according to the third time information and the fourth time information;

and determining the range of the preset time difference according to the first time difference and the second time difference.

According to a second aspect of the embodiments of the present disclosure, there is provided an audio signal processing apparatus, applied to a terminal, including:

the acquisition module is configured to acquire an audio signal through the sound acquisition module;

the first determination module is configured to determine whether the sound source position of the acquired audio signal is within a preset range according to the sound parameter;

an identification module configured to identify the audio signal when the sound source position is within the preset range.

Optionally, the first determining module is further configured to:

Optionally, the preset difference range determined by any two of the sound collection modules includes: a difference range formed by the sound parameter differences produced when the same sound emitted by sound sources at different positions within the preset range reaches the two sound collection modules.

Optionally, when N is a positive integer greater than 2,

the first determination module further configured to:

Optionally, the first determining module is further specifically configured to:

determining, for each of m sound parameter differences obtained by the M groups of sound collection modules for the same sound of the same sound source, whether that difference falls within the preset difference range of the corresponding group, so as to obtain m results, wherein m is a positive integer less than or equal to M and greater than or equal to 1;

Optionally, the first determining module is further specifically configured to:

Optionally, the apparatus further comprises: a second determination module, wherein the second determination module comprises:

a first determining submodule configured to determine two boundary points of the preset range according to the preset range;

the second determining submodule is configured to determine first time information of the same sound emitted by a first sound source located at a first boundary point reaching the first sound collecting module; and second time information reaching the second sound collection module; determining a first time difference according to the first time information and the second time information;

a third determining submodule configured to determine third time information when the same sound emitted by a second sound source located at a second boundary point reaches the first sound collection module, and fourth time information when the same sound reaches the second sound collection module; determining a second time difference according to the third time information and the fourth time information;

a fourth determining submodule configured to determine the preset time difference range according to the first time difference and the second time difference.

According to a third aspect of the embodiments of the present disclosure, there is provided a terminal, including:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to:

collecting audio signals through a sound collection module;

According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program for execution by a processor to perform the method steps of any of the above.

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

the audio signal processing method provided by the embodiment of the disclosure is applied to a terminal and acquires an audio signal through a sound acquisition module; determining whether the sound source position of the collected audio signal is within a preset range or not according to the sound parameters; and when the sound source position is within the preset range, identifying the audio signal. Therefore, the embodiment of the disclosure can identify the sound of the sound source position in the preset range, reduce the identification of the sound outside the preset range, realize the identification of the sound in the specific range, and improve the user experience of the terminal.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

FIG. 1 is a flow chart illustrating a method of processing an audio signal according to an exemplary embodiment;

FIG. 2 is a schematic diagram illustrating a scenario of a method of processing an audio signal according to an exemplary embodiment;

FIG. 3 is a block diagram illustrating an apparatus for processing an audio signal according to an exemplary embodiment;

FIG. 4 is a block diagram illustrating a terminal according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.

Fig. 1 is a flowchart illustrating a method for processing an audio signal according to an exemplary embodiment, where the method is applied to a terminal, as shown in fig. 1, and includes the following steps:

step 101: collecting audio signals through a sound collection module;

step 102: determining whether the sound source position of the collected audio signal is within a preset range or not according to the sound parameters;

step 103: and when the sound source position is within the preset range, identifying the audio signal.

Here, the terminal may include a mobile terminal or a fixed terminal. The mobile terminal can comprise a mobile phone, a tablet computer, an electronic reader, a wearable device and the like; the fixed terminal can comprise a desktop computer, an all-in-one machine, intelligent household equipment and the like; the smart home devices may include smart televisions, smart air conditioners, smart refrigerators, and the like.

Here, the sound parameter includes at least one of a time when the sound is collected, a loudness of the collected sound, and an intensity of the collected sound.

In the embodiment of the present disclosure, the terminal may recognize only audio signals corresponding to sounds emitted from sound source positions within the preset range.

In some scenarios, assume that the terminal is a mobile terminal or a fixed terminal in a conference center, and that the preset range is the range within a preset distance in front of the terminal, for example, the range within which the terminal can be operated. In this scenario, the terminal may recognize only sounds from within that range, so that interference from sounds outside the preset range is reduced and user experience is improved.

In other scenarios, assume that the terminal is a smart television and the preset range is the area where the living room sofa is located. When the smart television receives a sound, it can determine, according to the sound parameter of the sound, whether the sound source position of the audio signal is within the preset range, and if so, it recognizes the audio signal corresponding to the sound. Sounds from outside the sofa area are therefore not recognized, which reduces misoperation of the smart television.

In some embodiments, the determining whether the sound source position of the acquired audio signal is within a preset range according to the sound parameter may include: determining sound parameter difference according to sound parameters of the same sound source collected by any two of the N sound collection modules, wherein N is a positive integer greater than or equal to 2; if the sound parameter difference is within a preset difference range determined by any two sound acquisition modules, determining that the sound source position of the audio signal is within the preset range; and if the sound parameter difference is not within the preset difference range determined by any two sound acquisition modules, determining that the sound source position of the audio signal is not within the preset range.

It can be understood that sound collection modules at different positions collect different sound parameters when receiving the same sound from the same sound source. For example, because they are at different distances from the sound source, the modules may receive the same sound at different times, with different loudness, or with different intensity.

In this embodiment, the sound parameter difference for the same sound source is determined from any two of the N sound collection modules, and whether the sound source position of the audio signal is within the preset range is determined according to whether that difference is within the preset difference range determined by the two modules. This is simple to implement and does not require installing a sensor such as an infrared detector to detect the position of the sounding body, which lowers the hardware requirements and manufacturing cost of the terminal and improves its universality.
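The pairwise check and the gating of recognition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names, the closed-interval representation of the preset difference range, and the `recognize` callable are all hypothetical.

```python
def source_in_preset_range(param_a, param_b, diff_range):
    """Return True if the difference between the sound parameters measured by
    two sound collection modules falls inside the preset difference range.

    param_a, param_b: the same kind of sound parameter (for example, arrival
        time) measured by the two modules for the same sound of one source.
    diff_range: (low, high) bounds of the preset difference range. Treating
        the range as a closed interval is an assumption of this sketch.
    """
    low, high = diff_range
    diff = param_a - param_b
    return low <= diff <= high


def process(param_a, param_b, diff_range, recognize, audio):
    """Recognize `audio` only when the source is judged to be in range.

    `recognize` is a hypothetical callable standing in for the terminal's
    speech recognizer.
    """
    if source_in_preset_range(param_a, param_b, diff_range):
        return recognize(audio)
    return None  # sound from outside the preset range is not recognized
```

Note that the sketch never computes the position of the sound source; it only compares one measured difference against stored bounds.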

It should be added that the preset difference range determined by any two sound collection modules can be stored in the terminal in advance; alternatively, the terminal may send the position information of the preset range to a server and receive the preset difference range fed back by the server based on that position information. The manner of obtaining the preset difference range is not limited here.

In this embodiment, the preset difference range is used as the criterion for determining whether the sound source position is within the preset range. It is not necessary to first calculate the specific position of the sound source and then judge from that position whether the sound source is within the preset range, so the calculation of the terminal is simplified and the recognition efficiency is improved.

In some embodiments, the preset difference range determined by any two of the sound collection modules includes: a difference range formed by the sound parameter differences produced when the same sound emitted by sound sources at different positions within the preset range reaches the two sound collection modules.

It should be understood that when the same sound emitted by a sound source at any position within the preset range reaches any two sound collection modules, the sound parameters collected by the two modules differ, and these sound parameter differences together constitute the preset difference range determined by those two sound collection modules.

It should be noted that, for the same preset range, different pairs of sound collection modules determine different preset difference ranges.

In practical application, the terminal can determine the preset difference range determined by the two corresponding sound collection modules according to the preset range.

In some embodiments, the terminal may send the position information of the preset range relative to the terminal to a server, and obtain from the server the preset difference range determined by the two corresponding sound collection modules.

In other embodiments, the terminal stores a corresponding relationship between the position information of the preset range and the preset difference range, and determines the preset difference range determined by the two corresponding sound collection modules according to the corresponding relationship.

In the embodiment, the terminal does not need to calculate the preset difference range corresponding to the preset range, so that the processing speed of the terminal is increased, and the recognition efficiency is improved.

In other embodiments, in order to improve the accuracy of the preset difference range, the preset difference range may be determined in real time based on the preset range and the sound parameter difference of the same sound source collected by any two of the sound collection modules within the preset range.

Specifically, the preset difference range may be determined from first difference information and second difference information, each of which includes a sound parameter difference between any two of the sound collection modules. Here, the first difference information may be formed by the same sound emitted from a sound source located at one boundary point of the preset range, and the second difference information may be formed by the same sound emitted from a sound source located at another boundary point of the preset range.

In this embodiment, the preset difference range determined by any two sound collection modules is calculated in real time according to the current environment, so the obtained range is up to date. The preset range can therefore be represented more accurately by the preset difference range, which improves the accuracy of judging, based on the preset difference range, whether the sound source position is within the preset range.

In some embodiments, when N is a positive integer greater than 2, in order to improve the accuracy of the determination, the determining whether the sound source position of the acquired audio signal is located within the preset range according to the sound parameter difference may include:

Here, determining whether the sound source position of the audio signal is within the preset range according to the sound parameter differences of at least two groups of sound collection modules for the same sound of the same sound source, rather than the difference of only one group, improves the accuracy of the judgment.
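Because each group consists of two sound collection modules and any two groups differ in at least one module, the number of groups available from N modules cannot exceed the number of distinct pairs, N(N-1)/2. The following sketch enumerates those pairs; the function name and the module identifiers are hypothetical.

```python
from itertools import combinations
from math import comb


def module_pairs(module_ids):
    """List every distinct group of two sound collection modules."""
    return list(combinations(module_ids, 2))


pairs = module_pairs(["mic1", "mic2", "mic3"])
# Three modules give comb(3, 2) distinct groups to choose the M groups from.
assert len(pairs) == comb(3, 2)
```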

Specifically, the determining, according to the sound parameter difference of the same sound source by the M groups of sound collection modules, whether the sound source position of the collected audio signal is within the preset range includes:

determining, for each of m sound parameter differences obtained by the M groups of sound collection modules for the same sound of the same sound source, whether that difference is within the preset difference range of the corresponding group, so as to obtain m results, wherein m is a positive integer less than or equal to M and greater than or equal to 1; and determining whether the sound source position of the audio signal is within the preset range according to the m results.

Here, the determining whether the sound source position of the audio signal is within the preset range according to the m results may include:

if the proportion of the m results indicating that the sound source position of the audio signal is within the preset range is greater than or equal to a preset proportion, determining that the sound source position of the audio signal is within the preset range; and if that proportion is smaller than the preset proportion, determining that the sound source position of the audio signal is not within the preset range.
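The proportion rule can be sketched as follows. The function name and the representation of each result as a boolean are assumptions made for illustration.

```python
def in_range_by_proportion(results, preset_proportion):
    """Combine m per-group results using a preset proportion.

    results: list of booleans, one per group of sound collection modules;
        True means that group's sound parameter difference fell inside its
        preset difference range.
    preset_proportion: minimum share of True results (between 0 and 1)
        needed to decide that the sound source is within the preset range.
    """
    if not results:
        raise ValueError("at least one result is required")
    share = sum(results) / len(results)
    return share >= preset_proportion
```

For example, with results `[True, True, False]` and a preset proportion of 0.6, two of three groups agree, so the sound source is judged to be within the preset range.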

Here, the m results may be verified again based on a preset ratio, so as to improve the accuracy of determining whether the sound source position is within the preset range.

In still other embodiments, determining whether a sound source position of the audio signal is within the preset range according to the m results may further include:

and determining whether the sound source position of the audio signal is within the preset range or not according to the m results and the weights corresponding to the results.

The weight corresponding to a result can be understood as the weight of the sound collection module group that produced the result. It can be understood that, among multiple groups of sound collection modules, factors such as the distance between the two modules of a group cause different groups to measure the sound parameter difference of the same sound of the same sound source with different accuracy. Therefore, a group with higher accuracy is given a higher weight, and the weight of its result is correspondingly higher.
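The disclosure does not state how the weights and results are combined. One possible reading, shown here purely as an assumption, is to compare the weighted share of "in range" results against a threshold; the function name and the default threshold of 0.5 are hypothetical.

```python
def in_range_by_weight(results, weights, threshold=0.5):
    """Combine per-group results using per-group weights.

    results: list of booleans, one per group of sound collection modules.
    weights: non-negative weight of each group; more accurate groups are
        given larger weights.
    threshold: assumed minimum weighted share of True results.
    """
    if not results or len(results) != len(weights):
        raise ValueError("results and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    in_range_weight = sum(w for r, w in zip(results, weights) if r)
    return in_range_weight / total >= threshold
```

With weights `[3, 1, 1]`, a single "in range" result from the most accurate group outweighs two "out of range" results from the other groups.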

In this embodiment, it is determined whether the sound source position of the audio signal is within the preset range according to the m results, and the determination accuracy is higher compared with a case that it is determined whether the sound source position of the audio signal is within the preset range by using one result determined by any two sound collection modules.

In a specific embodiment, the sound parameter is, for example, a time of the collected sound, and the sound collection module includes a first sound collection module and a second sound collection module.

The determining whether the sound source position of the audio signal is within a preset range according to the sound parameter includes:

In this embodiment, whether the sound source position of the audio signal is within the preset range may be determined based on the arrival time difference alone, by judging whether the arrival time difference is within the preset time difference range. Because a sound collection module measures time with a lower error rate than it measures loudness or intensity, a determination based on the arrival time difference is more accurate.

Here, the preset time difference range may be pre-stored in the terminal. In other embodiments, the terminal may send the position information of the preset range to a server and receive the preset time difference range fed back by the server based on that position information.

In order to ensure the accuracy of the determination of the preset time difference range, the method further comprises:

determining first time information when the same sound emitted by a first sound source positioned at a first boundary point reaches the first sound acquisition module and second time information when the same sound reaches a second sound acquisition module; determining a first time difference according to the first time information and the second time information;

It can be understood that the distance from the first boundary point to the first sound collection module differs from its distance to the second sound collection module. Therefore, the first time information, at which the same sound emitted by the first sound source at the first boundary point reaches the first sound collection module, and the second time information, at which that sound reaches the second sound collection module, form a first time difference. Similarly, the distances from the second boundary point to the first and second sound collection modules differ, so the third time information, at which the same sound emitted by the second sound source at the second boundary point reaches the first sound collection module, and the fourth time information, at which that sound reaches the second sound collection module, form a second time difference.

In this embodiment, the preset time difference range determined by the first sound collection module and the second sound collection module is calculated in real time according to the current environment, so the obtained range is up to date. The preset range can therefore be represented more accurately by the preset time difference range, which improves the accuracy of judging, based on the preset time difference range, whether the sound source position is within the preset range.

Further, the present disclosure provides a specific embodiment to further understand the audio signal processing method provided by the embodiment of the present disclosure.

Referring to fig. 2, fig. 2 is a schematic view illustrating a scene of a method for processing an audio signal according to an exemplary embodiment. In this embodiment, the terminal is, for example, a mobile phone 20 that includes two microphones. It can be understood that the two microphones correspond to any two of the sound collection modules described above. Here, the two microphones include a first microphone 21 and a second microphone 22. Assume that the mobile phone 20 is lying flat on a table.

In a first specific embodiment, the preset range is, for example, a preset distance from the side of the mobile phone 20, and the sound parameter is, for example, a time of the collected sound.

In this embodiment, first, two boundary points of the preset range are determined according to the position information of the preset range relative to the terminal. Here, two of the boundary points include: a first boundary point 201 and a second boundary point 202.

Next, when the first boundary point 201 is used as the sound source position, the time for the sound emitted from the first sound source located at the first boundary point 201 to travel to the first microphone 21 is t1, and the time for that sound to travel to the second microphone 22 is t2. The first time difference is then t12, where t12 = t1 - t2. Here, the first time difference may be understood as the first time difference described in the above embodiment.

Similarly, when the second boundary point 202 is used as the sound source position, the time for the sound emitted from the second sound source located at the second boundary point 202 to travel to the first microphone 21 is t3, and the time for that sound to travel to the second microphone 22 is t4. The second time difference is then t34, where t34 = t3 - t4. Here, the second time difference may be understood as the second time difference described in the above embodiment.

It should be noted that the time when the first sound source propagates from the first boundary point 201 to the first microphone 21 and the time when the first sound source propagates to the second microphone 22 may be calculated according to the distances from the first boundary point 201 to the first microphone 21 and the second microphone 22, respectively.

Similarly, the time for the sound of the second sound source to propagate from the second boundary point 202 to the first microphone 21 and the time for it to propagate to the second microphone 22 can be calculated according to the distances from the second boundary point 202 to the first microphone 21 and the second microphone 22, respectively.

According to the first time difference t12 and the second time difference t34, the preset time difference range that the first microphone 21 and the second microphone 22 determine for the preset distance in this embodiment can be determined.
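The calculation of t12 and t34 from distances can be sketched as follows. This is an illustration under stated assumptions: positions are 2D coordinates in meters, the speed of sound is taken as a constant 343 m/s (roughly its value in air at 20 °C), and the preset time difference range is taken to span from the smaller to the larger of the two boundary time differences. The function names and coordinates are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed constant for this sketch


def travel_time(source, mic):
    """Time for sound to travel from `source` to `mic` (both (x, y) in meters)."""
    return math.dist(source, mic) / SPEED_OF_SOUND


def time_difference(source, mic1, mic2):
    """Arrival time at mic1 minus arrival time at mic2 (e.g. t12 = t1 - t2)."""
    return travel_time(source, mic1) - travel_time(source, mic2)


def preset_time_difference_range(boundary1, boundary2, mic1, mic2):
    """Derive the preset time difference range from two boundary points."""
    t12 = time_difference(boundary1, mic1, mic2)
    t34 = time_difference(boundary2, mic1, mic2)
    # Assumption: the range runs between the two boundary time differences.
    return (min(t12, t34), max(t12, t34))


# Hypothetical layout: two microphones and two boundary points.
mic1, mic2 = (0.0, 0.0), (0.0, 3.0)
low, high = preset_time_difference_range((4.0, 0.0), (4.0, 3.0), mic1, mic2)

# A source midway between the boundary points is equidistant from both
# microphones, so its time difference lies inside the derived range.
assert low <= time_difference((4.0, 1.5), mic1, mic2) <= high
```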

Finally, when the mobile phone collects the sound emitted by a certain sound source, if the time difference between the arrival of the sound at the first microphone 21 and its arrival at the second microphone 22 is within the preset time difference range, it is determined that the sound source position of the sound is within the preset range determined by the first boundary point 201 and the second boundary point 202, that is, the sound source position of the sound is at the preset distance from the side of the mobile phone. At this time, the terminal recognizes the audio signal corresponding to the sound.

On the contrary, if the time difference between the arrival of the sound at the first microphone 21 and its arrival at the second microphone 22 is not within the preset time difference range, it is determined that the sound source position of the sound is not within the preset range determined by the first boundary point 201 and the second boundary point 202, that is, the sound source position of the sound is not at the preset distance from the side of the mobile phone. At this time, the terminal does not recognize the audio signal corresponding to the sound.

In a second embodiment, the preset range is a preset distance from the upper side of the mobile phone 20, and the sound parameter is a time of the collected sound.

In this embodiment, first, two boundary points of the preset range are determined according to the position information of the preset range relative to the terminal. Here, the two boundary points include: a third boundary point 203 and a fourth boundary point 204.

Secondly, when the third boundary point 203 is taken as the sound source position, the time for the sound emitted by the third sound source located at the third boundary point 203 to propagate to the first microphone 21 is t5, and the time for the sound emitted by the third sound source located at the third boundary point 203 to propagate to the second microphone 22 is t6, then it can be determined that the third time difference for the sound with the third boundary point 203 as the sound source position is t56, wherein t56 is equal to t5 minus t6. In fact, the third time difference here can also be understood as the first time difference in the above embodiment.

Similarly, when the fourth boundary point 204 is taken as the sound source position, the time for the sound emitted from the fourth sound source located at the fourth boundary point 204 to travel to the first microphone 21 is t7, and the time for the sound emitted from the fourth sound source located at the fourth boundary point 204 to travel to the second microphone 22 is t8, then it can be determined that the fourth time difference for the sound with the fourth boundary point 204 as the sound source position is t78, wherein t78 is equal to t7 minus t8. In fact, the fourth time difference here can also be understood as the second time difference described in the above embodiments.

It should be noted that the time taken for the sound of the third sound source to propagate from the third boundary point 203 to the first microphone 21, and the time taken for it to propagate to the second microphone 22, may be calculated according to the distances from the third boundary point 203 to the first microphone 21 and to the second microphone 22, respectively.

Similarly, the time taken for the sound of the fourth sound source to propagate from the fourth boundary point 204 to the first microphone 21, and the time taken for it to propagate to the second microphone 22, can be calculated according to the distances from the fourth boundary point 204 to the first microphone 21 and to the second microphone 22, respectively.

From the third time difference t56 and the fourth time difference t78, the preset time difference range of the present embodiment, which is determined by the first microphone 21 and the second microphone 22 and corresponds to the predetermined distance, can be determined.

Finally, when the mobile phone collects the sound emitted by a sound source, if the time difference between the arrival of the sound at the first microphone 21 and its arrival at the second microphone 22 is within the preset time difference range, it is determined that the sound source position of the sound is within the preset range determined by the third boundary point 203 and the fourth boundary point 204, that is, the sound source position of the sound is at the predetermined distance above the mobile phone. At this time, the system recognizes an audio signal corresponding to the sound.

On the contrary, if the time difference of the sound reaching the first microphone 21 and the second microphone 22 is not within the above-determined preset time difference range, it is determined that the sound source position of the sound is not within the preset range determined by the third boundary point 203 and the fourth boundary point 204, that is, the sound source position of the sound is not a predetermined distance above the mobile phone. At this time, the system does not recognize the audio signal corresponding to the sound.
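The embodiments above compare a measured arrival time difference against the preset range but do not state how that difference is obtained from the two microphone signals. One common approach, shown here purely as an assumption and not as the method of the disclosure, is to search for the lag that maximizes the cross-correlation of the two sample buffers. The function name, the sample rate and the test signals are hypothetical.

```python
SAMPLE_RATE_HZ = 48000  # assumed sampling rate of both microphones


def estimate_delay_samples(sig1, sig2, max_lag):
    """Return the lag (in samples) by which sig1 is delayed relative to sig2.

    A brute-force cross-correlation search over lags in [-max_lag, max_lag].
    A positive result means the sound reached microphone 1 later than
    microphone 2; a negative result means it reached microphone 1 earlier.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(sig1)):
            j = i - lag  # sample of sig2 that aligns with sig1[i] at this lag
            if 0 <= j < len(sig2):
                score += sig1[i] * sig2[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical signals: the same pulse, arriving 2 samples later at microphone 1.
sig_mic2 = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
sig_mic1 = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0]

lag = estimate_delay_samples(sig_mic1, sig_mic2, max_lag=4)
time_difference_s = lag / SAMPLE_RATE_HZ  # arrival at mic 1 minus arrival at mic 2
print(lag, time_difference_s)
```

The resulting time difference has the same sign convention as t56 and t78 (time at the first microphone minus time at the second microphone), so it could be compared directly against the preset time difference range.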

According to this embodiment, the terminal recognizes only sound originating within a designated position range, rather than recognizing sound from all directions. Speech recognition can therefore be performed in a targeted manner, which improves the user experience.

Fig. 3 is a block diagram illustrating an apparatus for processing an audio signal according to an exemplary embodiment. Referring to fig. 3, the apparatus includes an acquisition module 31, a first determination module 32, and an identification module 33; wherein,

the acquisition module 31 is configured to acquire an audio signal through a sound acquisition module;

the first determining module 32 is configured to determine whether a sound source position of the acquired audio signal is within a preset range according to a sound parameter;

an identifying module 33 configured to identify the audio signal when the sound source position is within the preset range.
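The cooperation of the three modules can be illustrated with the following sketch. It is only one possible arrangement under stated assumptions: the class names, the `AudioFrame` container, the use of an arrival time difference as the sound parameter, and the stub recognizer are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class AudioFrame:
    """Hypothetical container: samples plus arrival times at two microphones."""
    samples: List[float]
    t_mic1: float
    t_mic2: float


class FirstDeterminationModule:
    """Decides whether the sound source position is within the preset range."""

    def __init__(self, preset_range: Tuple[float, float]):
        self.low, self.high = preset_range

    def in_preset_range(self, frame: AudioFrame) -> bool:
        return self.low <= frame.t_mic1 - frame.t_mic2 <= self.high


class IdentificationModule:
    """Wraps a recognizer callable (a stand-in for a speech recognizer)."""

    def __init__(self, recognizer: Callable[[List[float]], str]):
        self.recognizer = recognizer

    def identify(self, frame: AudioFrame) -> str:
        return self.recognizer(frame.samples)


class AudioProcessingApparatus:
    """Acquire a frame, check the source position, then identify or skip."""

    def __init__(self, acquire: Callable[[], AudioFrame],
                 determiner: FirstDeterminationModule,
                 identifier: IdentificationModule):
        self.acquire = acquire
        self.determiner = determiner
        self.identifier = identifier

    def process(self) -> Optional[str]:
        frame = self.acquire()
        if self.determiner.in_preset_range(frame):
            return self.identifier.identify(frame)
        return None  # source outside the preset range: not recognized


# Usage with stub acquisition and recognition functions.
apparatus = AudioProcessingApparatus(
    acquire=lambda: AudioFrame([0.1, 0.2], t_mic1=0.0100, t_mic2=0.0101),
    determiner=FirstDeterminationModule((-0.0002, 0.0002)),
    identifier=IdentificationModule(lambda samples: "recognized"),
)
print(apparatus.process())
```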

In an optional embodiment, the first determining module 32 is further configured to:

determining a sound parameter difference according to sound parameters of the same sound source collected by any two sound collection modules among N sound collection modules, wherein N is a positive integer greater than or equal to 2;

In an optional embodiment, the preset difference range determined by any two sound collection modules includes: a difference range formed by the sound parameter differences produced when the same sound, emitted by sound sources located at different positions within the preset range, reaches the two sound collection modules.

In an alternative embodiment, when N is a positive integer greater than 2,

the first determination module 32, further configured to:

In some embodiments, the first determining module 32 is further specifically configured to:

In an optional embodiment, the first determining module 32 is further specifically configured to:

if not, determining that the sound source position of the audio signal is not located in the preset range

In some embodiments, the apparatus further comprises: a second determination module, wherein the second determination module comprises:

With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.

Fig. 4 is a block diagram illustrating a terminal 400 according to an example embodiment. For example, the terminal 400 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.

Referring to fig. 4, the terminal 400 may include one or more of the following components: a processing component 402, a memory 404, a power component 406, a multimedia component 408, an audio component 410, an input/output (I/O) interface 412, a sensor component 414, and a communication component 416.

The processing component 402 generally controls overall operation of the terminal 400, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 402 may include one or more processors 420 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 402 can include one or more modules that facilitate interaction between the processing component 402 and other components. For example, the processing component 402 can include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.

The memory 404 is configured to store various types of data to support operations at the terminal 400. Examples of such data include instructions for any application or method operating on the terminal 400, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 404 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.

The power component 406 provides power to the various components of the terminal 400. The power component 406 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the terminal 400.

The multimedia component 408 includes a screen providing an output interface between the terminal 400 and the user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 408 includes a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the terminal 400 is in an operation mode, such as a photographing mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a Microphone (MIC) configured to receive external audio signals when the terminal 400 is in an operating mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 404 or transmitted via the communication component 416. In some embodiments, audio component 410 also includes a speaker for outputting audio signals.

The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor component 414 includes one or more sensors for providing various aspects of status assessment for the terminal 400. For example, the sensor component 414 can detect an open/closed state of the terminal 400 and the relative positioning of components, such as a display and keypad of the terminal 400. The sensor component 414 can also detect a change in position of the terminal 400 or a component of the terminal 400, the presence or absence of user contact with the terminal 400, orientation or acceleration/deceleration of the terminal 400, and a change in temperature of the terminal 400. The sensor component 414 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 414 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 414 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 416 is configured to facilitate communications between the terminal 400 and other terminals in a wired or wireless manner. The terminal 400 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 416 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the terminal 400 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.

In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 404 comprising instructions, executable by the processor 420 of the terminal 400 to perform the above-described method is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

A non-transitory computer-readable storage medium, wherein instructions of the storage medium, when executed by a processor of a terminal, enable the terminal to perform the audio signal processing method according to the above embodiments.

Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.

It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims

1. A method for processing an audio signal, applied to a terminal, comprising:

collecting audio signals through a sound collection module;

2. The method according to claim 1, wherein the determining whether the sound source position of the collected audio signal is within a preset range according to the sound parameter comprises:

3. The method according to claim 2, wherein the preset difference range determined by any two sound collection modules comprises: a difference range formed by the sound parameter differences produced when the same sound, emitted by sound sources located at different positions within the preset range, reaches the two sound collection modules.

4. The method of claim 2, wherein when N is a positive integer greater than 2,

the determining whether the sound source position of the collected audio signal is within the preset range according to the sound parameter difference includes:

5. The method according to claim 4, wherein the determining whether the sound source position of the acquired audio signal is within the preset range according to the sound parameter difference of the same sound source of the M groups of sound acquisition modules comprises:

6. The method of claim 1, wherein the determining whether a sound source position of the audio signal is within a preset range according to the sound parameter comprises:

7. The method of claim 6, further comprising:

determining first time information when the same sound emitted by a first sound source positioned at the first boundary point reaches the first sound acquisition module and second time information when the same sound reaches the second sound acquisition module; determining a first time difference according to the first time information and the second time information;

determining third time information of the same sound emitted by a second sound source positioned at the second boundary point and reaching the first sound acquisition module and fourth time information of the same sound reaching the second sound acquisition module; determining a second time difference according to the third time information and the fourth time information;

8. An audio signal processing apparatus, applied to a terminal, comprising:

9. The apparatus of claim 8, wherein the first determining module is further configured to:

10. The apparatus according to claim 9, wherein the preset difference range determined by any two sound collection modules comprises: a difference range formed by the sound parameter differences produced when the same sound, emitted by sound sources located at different positions within the preset range, reaches the two sound collection modules.

11. The apparatus of claim 9, wherein when N is a positive integer greater than 2,

the first determination module further configured to:

12. The apparatus of claim 11, the first determination module further specifically configured to:

13. The apparatus of claim 8, wherein the first determining module is further specifically configured to:

14. The apparatus of claim 13, further comprising: a second determination module, wherein the second determination module comprises:

15. A terminal, comprising:

a processor;

a memory for storing the processor-executable instructions;

wherein the processor is configured to:

collecting audio signals through a sound collection module;

16. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method steps of any one of claims 1 to 7.