CN113571086A - Sound signal processing method and device, electronic equipment and readable storage medium

Info

Publication number: CN113571086A
Application number: CN202010351751.3A
Authority: CN
Inventors: 熊飞飞; 冯津伟; 黄伟隆; 杜秉聰
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2020-04-28
Filing date: 2020-04-28
Publication date: 2021-10-29
Anticipated expiration: 2040-04-28
Also published as: CN113571086B

Abstract

Embodiments of the disclosure provide a sound signal processing method, a sound signal processing apparatus, an electronic device, and a readable storage medium. The sound signal processing method comprises: acquiring an input sound signal; acquiring a processing parameter determined according to sound source information of the input sound signal; and processing the input sound signal according to the processing parameter to obtain an output sound signal. Because the processing parameter is determined from the sound source information, intelligent automatic gain control can be applied to the input sound signals of different users. Compared with adjusting the gain only according to the amplitude of the input sound signal, the gain coefficient can be brought to a suitable value more quickly, which makes the scheme suitable for application scenarios in which the input sound signal switches rapidly between different users.

Description

Sound signal processing method and device, electronic equipment and readable storage medium

Technical Field

The present disclosure relates to the field of computer application technologies, and in particular, to a sound signal processing method and apparatus, an electronic device, and a readable storage medium.

Background

With the development of internet technology, more and more users hold teleconferences or video conferences through electronic devices. During a conference, automatic gain control in the electronic device can adjust the amplitude of the input sound signal so that the amplitude of the resulting output sound signal stays within a preset range. However, when multiple users take part in a teleconference or video conference and the input sound signal acquired by the electronic device at one end switches from one user to another, the input sound signals of the different users have different amplitudes, and the automatic gain control cannot adjust quickly enough. As a result, the automatic gain control does not reach a steady state in time, and the quality of the teleconference or video conference suffers.

Disclosure of Invention

In order to solve the problems in the related art, embodiments of the present disclosure provide a sound signal processing method and apparatus, an electronic device, and a readable storage medium.

In a first aspect, a sound signal processing method is provided in the disclosed embodiments.

Specifically, the sound signal processing method includes:

acquiring an input sound signal;

acquiring processing parameters determined according to sound source information of the input sound signal;

and processing the input sound signal according to the processing parameter to obtain an output sound signal.
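The three steps above can be sketched as follows. This is a minimal illustration only, assuming that the processing parameter is a single linear gain factor and that samples are floats in the range [-1, 1]; the function names `lookup_gain` and `process_frame` and the values in the table are hypothetical and do not come from the disclosure.

```python
from typing import List


def lookup_gain(source_info: str) -> float:
    """Step 2: return a gain determined from the sound source information.

    The mapping below is a made-up placeholder, not taken from the disclosure.
    """
    preset = {"speaker-a": 2.0, "speaker-b": 0.5}
    return preset.get(source_info, 1.0)  # unity gain for unknown sources


def process_frame(frame: List[float], source_info: str) -> List[float]:
    """Steps 1-3: take an input frame, fetch its gain, and scale the samples."""
    gain = lookup_gain(source_info)
    # Clamp to [-1, 1] so an amplified sample stays inside the assumed range.
    return [max(-1.0, min(1.0, gain * s)) for s in frame]
```

Clamping is one simple way to keep the output inside the assumed sample range; the disclosure itself does not prescribe how out-of-range values are handled.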

With reference to the first aspect, in a first implementation manner of the first aspect, the acquiring a processing parameter determined according to sound source information of the input sound signal includes:

and when the sound source of the input sound signal is determined to be different from the sound source of the input sound signal processed last time according to the sound source information, acquiring the processing parameter determined according to the sound source information of the input sound signal.

With reference to the first aspect, in a second implementation manner of the first aspect, the acquiring a processing parameter determined according to sound source information of the input sound signal includes:

and when the sound source of the input sound signal is determined to be the same as the sound source of the input sound signal processed most recently according to the sound source information, adjusting the processing parameters according to a preset rule so that the amplitude of the output sound signal meets a preset condition.
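The two branches described in the first and second implementation manners can be sketched together. The example below is an assumption-laden illustration: the class name `GainSelector`, the target level, the smoothing step, and the choice of "move the gain a fraction of the way toward the value that hits the target" as the preset rule are all hypothetical.

```python
from typing import Callable, Optional


class GainSelector:
    """Chooses a gain per frame, depending on whether the source changed."""

    def __init__(self, gain_for_source: Callable[[str], float],
                 target_level: float = 0.5, step: float = 0.1) -> None:
        self.gain_for_source = gain_for_source  # hypothetical parameter provider
        self.target_level = target_level        # assumed target peak amplitude
        self.step = step                        # assumed smoothing factor
        self.last_source: Optional[str] = None
        self.gain = 1.0

    def update(self, source: str, frame_peak: float) -> float:
        if source != self.last_source:
            # Different source from the most recently processed frame:
            # jump straight to the parameter determined for this source.
            self.gain = self.gain_for_source(source)
            self.last_source = source
        elif frame_peak > 0.0:
            # Same source: nudge the gain toward the value that would put
            # this frame's peak at the target level (the assumed preset rule).
            desired = self.target_level / frame_peak
            self.gain += self.step * (desired - self.gain)
        return self.gain
```

A caller would invoke `update` once per frame with the detected source label and the frame's peak amplitude, then multiply the frame by the returned gain.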

With reference to the first aspect, in a third implementation manner of the first aspect, the sound source information includes at least one of: voiceprint information of the input sound signal, orientation information of a sound source of the input sound signal, identification information of a microphone generating the input sound signal, a geographical location of the input sound signal, and environment information of the input sound signal.

With reference to the first aspect, in a fourth implementation manner of the first aspect, the present disclosure further includes:

and determining the processing parameters according to the preset corresponding relation between the sound source information and the processing parameters.
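One possible shape for such a preset correspondence is a lookup table keyed by the kinds of sound source information listed in the third implementation manner. The sketch below is illustrative only: the dataclass `SoundSourceInfo`, the table `PRESET_GAINS`, its values, and the priority order in which fields are consulted are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SoundSourceInfo:
    """Fields mirror some of the kinds of sound source information listed above."""
    voiceprint_id: Optional[str] = None
    direction_deg: Optional[int] = None
    microphone_id: Optional[str] = None


# Hypothetical preset correspondence: (field name, value) -> gain factor.
PRESET_GAINS: Dict[Tuple[str, object], float] = {
    ("voiceprint_id", "alice"): 1.5,
    ("microphone_id", "mic-2"): 0.8,
    ("direction_deg", 90): 2.0,
}


def gain_from_preset(info: SoundSourceInfo, default: float = 1.0) -> float:
    """Return the preset gain for the first field that has a table entry."""
    # Assumed priority: voiceprint first, then microphone, then direction.
    for name in ("voiceprint_id", "microphone_id", "direction_deg"):
        value = getattr(info, name)
        if value is not None and (name, value) in PRESET_GAINS:
            return PRESET_GAINS[(name, value)]
    return default
```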

With reference to the first aspect, in a fifth implementation manner of the first aspect, the present disclosure further includes:

determining previous processing parameters for processing previous input sound signals from a sound source corresponding to the sound source information according to the sound source information;

determining the processing parameter from the previous processing parameter.
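Reusing the previous processing parameter of a sound source amounts to keeping a small per-source memory. The following sketch assumes the parameter is a gain factor and that each source is identified by a string label; the class name `PreviousGainStore` and its methods are hypothetical.

```python
from typing import Dict


class PreviousGainStore:
    """Remembers the last gain used for each sound source."""

    def __init__(self, default_gain: float = 1.0) -> None:
        self.default_gain = default_gain
        self._last_gain: Dict[str, float] = {}

    def initial_gain(self, source: str) -> float:
        # Reuse the previous parameter for this source, if one was stored.
        return self._last_gain.get(source, self.default_gain)

    def remember(self, source: str, gain: float) -> None:
        # Called after processing so the next turn of this source starts here.
        self._last_gain[source] = gain
```

In this sketch the stored value is used directly as the new starting gain; the disclosure only requires that the new parameter be determined from the previous one, so other derivations are equally possible.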

With reference to the first aspect, in a sixth implementation manner of the first aspect, the method is performed by a first electronic device comprising or connected to a microphone that generates the input sound signal; or

the method is performed by a server in communication with the first electronic device; or

the method is performed by a second electronic device in communication with the first electronic device.

With reference to the sixth implementation manner of the first aspect, the present disclosure provides, in a seventh implementation manner of the first aspect, when the method is executed by a first electronic device including or connected to a microphone that generates the input sound signal, the acquiring the processing parameter includes:

the processing parameters are obtained locally at the first electronic device.

With reference to the sixth implementation manner of the first aspect, in an eighth implementation manner of the first aspect, when the method is executed by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the obtaining the processing parameter includes:

sending the sound source information to a first external device;

receiving, from the first external device, the processing parameter determined by the first external device from the sound source information.
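The exchange between the first electronic device and the first external device can be sketched as a request and a reply. The example below is a hedged illustration: the JSON message layout, the field names `sound_source` and `gain`, the table `SERVER_GAINS`, and the use of a plain function call in place of a network transport are all assumptions.

```python
import json
from typing import Callable, Dict

# Hypothetical table held by the external device; values are placeholders.
SERVER_GAINS: Dict[str, float] = {"mic-1": 1.25, "mic-2": 0.75}


def server_handle(request: str) -> str:
    """First external device: determine the parameter from the source info."""
    source = json.loads(request)["sound_source"]
    return json.dumps({"gain": SERVER_GAINS.get(source, 1.0)})


def fetch_gain(source: str, send: Callable[[str], str]) -> float:
    """First electronic device: send source info, receive the parameter."""
    reply = send(json.dumps({"sound_source": source}))
    return float(json.loads(reply)["gain"])
```

Passing `server_handle` as the `send` argument simulates the round trip in-process; a real deployment would substitute whatever transport connects the two devices.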

With reference to the eighth implementation manner of the first aspect, in a ninth implementation manner of the first aspect, the first external device is a server that communicates with the first electronic device.

With reference to the sixth implementation manner of the first aspect, in a tenth implementation manner of the first aspect, when the method is executed by a server in communication with the first electronic device, the obtaining the processing parameter includes:

obtaining the processing parameters locally at the server, or receiving, from the first electronic device, the processing parameters determined by the first electronic device according to the sound source information.

With reference to the sixth implementation manner of the first aspect, in an eleventh implementation manner of the first aspect, the obtaining the processing parameter when the method is executed by a second electronic device in communication with the first electronic device includes:

obtaining the processing parameters locally at the second electronic device, or receiving, from a second external device, the processing parameters determined by the second external device according to the sound source information.

With reference to the eleventh implementation manner of the first aspect, in a twelfth implementation manner of the first aspect, the second external device is the first electronic device or the server.

With reference to the sixth implementation manner of the first aspect, in a thirteenth implementation manner of the first aspect, the processing parameter is determined according to a correspondence between the sound source information and the processing parameter; or

the processing parameters are determined from previous processing parameters for processing a previous input sound signal from a sound source corresponding to the sound source information.

With reference to the first aspect, in a fourteenth implementation manner of the first aspect, the processing parameter includes a gain factor.

In a second aspect, a sound signal processing method is provided in an embodiment of the present disclosure.

Specifically, the sound signal processing method includes:

acquiring a first sound signal and client information corresponding to the first sound signal;

acquiring processing parameters according to the client information;

and processing the first sound signal according to the processing parameter to obtain a second sound signal.

With reference to the second aspect, in a first implementation manner of the second aspect, the obtaining a processing parameter according to the client information includes:

and when the client information is different from the client information corresponding to the first sound signal processed last time, acquiring a processing parameter determined according to the client information.

With reference to the second aspect, in a second implementation manner of the second aspect, the acquiring a processing parameter according to the client information includes:

and when the client information is the same as the client information corresponding to the first sound signal processed most recently, adjusting the processing parameter according to a preset rule so that the amplitude of the second sound signal meets a preset condition.

With reference to the second aspect, in a third implementation manner of the second aspect, the present disclosure further includes:

and determining the processing parameters according to the preset corresponding relation between the client information and the processing parameters.

With reference to the second aspect, in a fourth implementation manner of the second aspect, the present disclosure further includes:

determining previous processing parameters for processing a previous first sound signal from a client corresponding to the client information according to the client information;

determining the processing parameter from the previous processing parameter.

With reference to the second aspect, in a fifth implementation manner of the second aspect, the method is performed by a server in communication with the client.

With reference to the second aspect, in a sixth implementation manner of the second aspect, the processing parameter includes a gain factor; and/or

the client information includes at least one of: identification information of the client, a geographic location of the client, and environment information of the client.

In a third aspect, a sound signal processing method is provided in the disclosed embodiments.

Specifically, the sound signal processing method includes:

acquiring an input sound signal;

acquiring a user ID corresponding to the input sound signal;

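acquiring a processing parameter according to the user ID;

and processing the input sound signal according to the processing parameter to obtain an output sound signal.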

With reference to the third aspect, in a first implementation manner of the third aspect, the obtaining a user ID corresponding to the input sound signal includes:

determining a user ID corresponding to the input sound signal according to characteristic information of the input sound signal, wherein the characteristic information comprises at least one of: voiceprint information of the input sound signal, microphone identification information corresponding to the input sound signal, client information corresponding to the input sound signal, and semantic information corresponding to the input sound signal.
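A simple way to picture this step is to try each available kind of characteristic information in turn until one maps to a known user. The sketch below is hypothetical throughout: the registries, the identifier strings, and the order in which characteristics are consulted are assumptions, and semantic information is left out for brevity.

```python
from typing import Dict, Optional

# Hypothetical registries mapping each kind of characteristic to a user ID.
VOICEPRINT_TO_USER: Dict[str, str] = {"vp-17": "user-1"}
MICROPHONE_TO_USER: Dict[str, str] = {"mic-2": "user-2"}
CLIENT_TO_USER: Dict[str, str] = {"client-9": "user-3"}


def resolve_user_id(voiceprint: Optional[str] = None,
                    microphone: Optional[str] = None,
                    client: Optional[str] = None) -> Optional[str]:
    """Return the first user ID that any available characteristic maps to."""
    candidates = (
        (voiceprint, VOICEPRINT_TO_USER),
        (microphone, MICROPHONE_TO_USER),
        (client, CLIENT_TO_USER),
    )
    for key, table in candidates:
        if key is not None and key in table:
            return table[key]
    return None  # no characteristic matched a known user
```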

With reference to the third aspect, in a second implementation manner of the third aspect, the acquiring, according to the user ID, a processing parameter includes:

when the user ID is different from the user ID corresponding to the input sound signal processed last time, acquiring a processing parameter determined according to the user ID; and/or

acquiring a processing parameter according to the user ID and at least one of the following items: the geographical position of the user and the environment information of the user.

With reference to the third aspect, in a third implementation manner of the third aspect, the acquiring, according to the user ID, a processing parameter includes:

and when the user ID is the same as the user ID corresponding to the input sound signal processed last time, adjusting the processing parameter according to a preset rule so that the amplitude of the output sound signal meets a preset condition.

With reference to the third aspect, in a fourth implementation manner of the third aspect, the present disclosure further includes:

and determining the processing parameters according to the preset corresponding relation between the user ID and the processing parameters.

With reference to the third aspect, in a fifth implementation manner of the third aspect, the present disclosure further includes:

determining previous processing parameters for processing a previous input sound signal from a user corresponding to the user ID according to the user ID;

determining the processing parameter from the previous processing parameter.

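With reference to the third aspect, in a sixth implementation manner of the third aspect, the method is performed by a first electronic device comprising or connected to a microphone that generates the input sound signal; or

the method is performed by a server in communication with the first electronic device; or

the method is performed by a second electronic device in communication with the first electronic device.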

With reference to the sixth implementation manner of the third aspect, in a seventh implementation manner of the third aspect, when the method is performed by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the obtaining the processing parameter includes:

the processing parameters are obtained locally at the first electronic device.

With reference to the sixth implementation manner of the third aspect, in an eighth implementation manner of the third aspect, when the method is performed by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the obtaining the processing parameter includes:

transmitting the user ID to a first external device;

receiving, from the first external device, the processing parameter determined by the first external device according to the user ID.

With reference to the eighth implementation manner of the third aspect, in a ninth implementation manner of the third aspect, the first external device is a server that communicates with the first electronic device.

With reference to the sixth implementation manner of the third aspect, in a tenth implementation manner of the third aspect, when the method is performed by a server in communication with the first electronic device, the obtaining the processing parameter includes:

obtaining the processing parameters locally at the server, or receiving, from the first electronic device, the processing parameters determined by the first electronic device according to the user ID.

With reference to the sixth implementation manner of the third aspect, in an eleventh implementation manner of the third aspect, the obtaining the processing parameter when the method is executed by a second electronic device in communication with the first electronic device includes:

obtaining the processing parameters locally at the second electronic device, or receiving, from a second external device, the processing parameters determined by the second external device according to the user ID.

With reference to the eleventh implementation manner of the third aspect, in a twelfth implementation manner of the third aspect, the second external device is the first electronic device or the server.

With reference to the sixth implementation manner of the third aspect, in a thirteenth implementation manner of the third aspect, the processing parameter is determined according to a correspondence between the user ID and the processing parameter; or

the processing parameters are determined from previous processing parameters for processing a previous input sound signal from a user to which the user ID corresponds.

With reference to the third aspect, in a fourteenth implementation manner of the third aspect, the processing parameter includes a gain factor.

In a fourth aspect, an embodiment of the present disclosure provides a sound signal processing apparatus.

Specifically, the sound signal processing apparatus includes:

a first acquisition module configured to acquire an input sound signal;

a second acquisition module configured to acquire a processing parameter determined from sound source information of the input sound signal;

and a third acquisition module configured to process the input sound signal according to the processing parameter to obtain an output sound signal.

With reference to the fourth aspect, in a first implementation manner of the fourth aspect, the obtaining processing parameters determined according to sound source information of the input sound signal includes:

With reference to the fourth aspect, in a second implementation manner of the fourth aspect, the acquiring processing parameters determined according to sound source information of the input sound signal includes:

With reference to the fourth aspect, in a third implementation manner of the fourth aspect, the sound source information includes at least one of: voiceprint information of the input sound signal, orientation information of a sound source of the input sound signal, identification information of a microphone generating the input sound signal, a geographical location of the input sound signal, and environment information of the input sound signal.

With reference to the fourth aspect, in a fourth implementation manner of the fourth aspect, the present disclosure further includes:

a first determining module configured to determine the processing parameter according to a preset correspondence between the sound source information and the processing parameter.

With reference to the fourth aspect, in a fifth implementation manner of the fourth aspect, the present disclosure further includes:

a second determining module configured to determine, according to the sound source information, previous processing parameters for processing a previous input sound signal from a sound source corresponding to the sound source information;

a third determining module configured to determine the processing parameter from the previous processing parameter.

With reference to the fourth aspect, in a sixth implementation manner of the fourth aspect, the apparatus is implemented by a first electronic device comprising or connected to a microphone that generates the input sound signal; or

the apparatus is implemented by a server in communication with the first electronic device; or

the apparatus is implemented by a second electronic device in communication with the first electronic device.

With reference to the sixth implementation manner of the fourth aspect, in a seventh implementation manner of the fourth aspect, when the apparatus is implemented by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the acquiring the processing parameter includes:

the processing parameters are obtained locally at the first electronic device.

With reference to the sixth implementation manner of the fourth aspect, in an eighth implementation manner of the fourth aspect, when the apparatus is implemented by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the acquiring the processing parameter includes:

sending the sound source information to a first external device;

With reference to the eighth implementation manner of the fourth aspect, in a ninth implementation manner of the fourth aspect, the first external device is a server that communicates with the first electronic device.

With reference to the sixth implementation manner of the fourth aspect, in a tenth implementation manner of the fourth aspect, the obtaining the processing parameter when the apparatus is implemented by a server in communication with the first electronic device includes:

With reference to the sixth implementation manner of the fourth aspect, in an eleventh implementation manner of the fourth aspect, the obtaining the processing parameter when the apparatus is implemented by a second electronic device communicating with the first electronic device includes:

With reference to the eleventh implementation manner of the fourth aspect, in a twelfth implementation manner of the fourth aspect, the second external device is the first electronic device or the server.

With reference to the sixth implementation manner of the fourth aspect, in a thirteenth implementation manner of the fourth aspect, the processing parameter is determined according to a correspondence between the sound source information and the processing parameter; or

With reference to the fourth aspect, in a fourteenth implementation manner of the fourth aspect, the processing parameter includes a gain factor.

In a fifth aspect, an embodiment of the present disclosure provides a sound signal processing apparatus.

Specifically, the sound signal processing apparatus includes:

a fourth acquisition module configured to acquire a first sound signal and client information corresponding to the first sound signal;

a fifth acquisition module configured to acquire a processing parameter according to the client information;

and a sixth acquisition module configured to process the first sound signal according to the processing parameter to obtain a second sound signal.

With reference to the fifth aspect, in a first implementation manner of the fifth aspect, the acquiring, according to the client information, a processing parameter includes:

With reference to the fifth aspect, in a second implementation manner of the fifth aspect, the acquiring, according to the client information, a processing parameter includes:

With reference to the fifth aspect, in a third implementation manner of the fifth aspect, the present disclosure further includes:

a fourth determining module configured to determine the processing parameter according to a preset correspondence between the client information and the processing parameter.

With reference to the fifth aspect, in a fourth implementation manner of the fifth aspect, the present disclosure further includes:

a fifth determining module configured to determine, according to the client information, previous processing parameters for processing a previous first sound signal from a client to which the client information corresponds;

a sixth determining module configured to determine the processing parameter from the previous processing parameter.

With reference to the fifth aspect, in a fifth implementation manner of the fifth aspect, the apparatus is implemented by a server in communication with the client.

With reference to the fifth aspect, in a sixth implementation form of the fifth aspect, the processing parameter includes a gain factor; and/or

In a sixth aspect, an embodiment of the present disclosure provides a sound signal processing apparatus.

Specifically, the sound signal processing apparatus includes:

a seventh acquisition module configured to acquire an input sound signal;

an eighth acquisition module configured to acquire a user ID corresponding to the input sound signal;

a ninth acquisition module configured to acquire a processing parameter according to the user ID;

and a tenth acquisition module configured to process the input sound signal according to the processing parameter to obtain an output sound signal.

With reference to the sixth aspect, in a first implementation manner of the sixth aspect, the obtaining a user ID corresponding to the input sound signal includes:

With reference to the sixth aspect, in a second implementation manner of the sixth aspect, the acquiring a processing parameter according to the user ID includes:

With reference to the sixth aspect, in a third implementation manner of the sixth aspect, the acquiring, according to the user ID, a processing parameter includes:

With reference to the sixth aspect, in a fourth implementation manner of the sixth aspect, the present disclosure further includes:

a seventh determining module configured to determine the processing parameter according to a preset correspondence between the user ID and the processing parameter.

With reference to the sixth aspect, in a fifth implementation manner of the sixth aspect, the present disclosure further includes:

an eighth determining module configured to determine, according to the user ID, a previous processing parameter for processing a previous input sound signal from a user corresponding to the user ID;

a ninth determining module configured to determine the processing parameter from the previous processing parameter.

With reference to the sixth aspect, the present disclosure provides in a sixth implementation form of the sixth aspect, the apparatus is implemented by a first electronic device comprising or connected to a microphone that generates the input sound signal; or

With reference to the sixth implementation manner of the sixth aspect, in a seventh implementation manner of the sixth aspect, when the apparatus is implemented by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the acquiring the processing parameter includes:

the processing parameters are obtained locally at the first electronic device.

With reference to the sixth implementation manner of the sixth aspect, in an eighth implementation manner of the sixth aspect, when the apparatus is implemented by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the acquiring the processing parameter includes:

transmitting the user ID to a first external device;

With reference to the eighth implementation manner of the sixth aspect, in a ninth implementation manner of the sixth aspect, the first external device is a server that communicates with the first electronic device.

With reference to the sixth implementation manner of the sixth aspect, in a tenth implementation manner of the sixth aspect, the obtaining the processing parameter when the apparatus is implemented by a server in communication with the first electronic device includes:

With reference to the sixth implementation manner of the sixth aspect, in an eleventh implementation manner of the sixth aspect, the obtaining the processing parameter when the apparatus is implemented by a second electronic device in communication with the first electronic device includes:

With reference to the eleventh implementation manner of the sixth aspect, in a twelfth implementation manner of the sixth aspect, the second external device is the first electronic device or the server.

With reference to the sixth implementation manner of the sixth aspect, in a thirteenth implementation manner of the sixth aspect, the processing parameter is determined according to a corresponding relationship between the user ID and the processing parameter; or

With reference to the sixth aspect, in a fourteenth implementation manner of the sixth aspect, the processing parameter includes a gain factor.

In a seventh aspect, the present disclosure provides an electronic device, including a memory and a processor, where the memory is configured to store one or more computer instructions, where the one or more computer instructions are executed by the processor to implement the method according to any one of the first aspect, the first implementation manner to the fourteenth implementation manner of the first aspect, the second aspect, the first implementation manner to the sixth implementation manner of the second aspect, the third aspect, and the first implementation manner to the fourteenth implementation manner of the third aspect.

In an eighth aspect, an embodiment of the present disclosure provides a computer-readable storage medium on which computer instructions are stored, and the computer instructions, when executed by a processor, implement the method according to any one of the first aspect, the first implementation manner to the fourteenth implementation manner of the first aspect, the second aspect, the first implementation manner to the sixth implementation manner of the second aspect, the third aspect, and the first implementation manner to the fourteenth implementation manner of the third aspect.

According to the technical scheme provided by the embodiments of the disclosure, an input sound signal is acquired, a processing parameter determined according to sound source information of the input sound signal is acquired, and the input sound signal is processed according to the processing parameter to obtain an output sound signal. Because the processing parameter is determined from the sound source information, intelligent automatic gain control can be applied to the input sound signals of different users. Compared with adjusting the gain only according to the amplitude of the input sound signal, the gain coefficient can be brought to a suitable value more quickly, which makes the scheme suitable for application scenarios in which the input sound signal switches rapidly between different users.
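The claimed speed advantage can be illustrated with a toy simulation. The sketch below is not a measurement and does not reproduce any algorithm from the disclosure: it assumes a first-order amplitude-driven gain update, constant per-talker amplitudes, a target output amplitude of 0.5, and a per-source memory of the last gain; the names `count_off_target_frames`, `TARGET`, and `STEP` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

TARGET = 0.5  # assumed target output amplitude
STEP = 0.2    # assumed adaptation rate of the amplitude-driven update


def count_off_target_frames(frames: List[Tuple[str, float]],
                            use_source_info: bool,
                            tolerance: float = 0.05) -> int:
    """Count frames whose output amplitude misses TARGET by more than tolerance.

    Each frame is (source label, input amplitude); amplitudes must be positive.
    """
    gain = 1.0
    last_source: Optional[str] = None
    remembered: Dict[str, float] = {}
    off_target = 0
    for source, amplitude in frames:
        if use_source_info and source != last_source and source in remembered:
            # Source changed: restore the gain this source ended on earlier.
            gain = remembered[source]
        last_source = source
        if abs(gain * amplitude - TARGET) > tolerance:
            off_target += 1
        # Amplitude-driven update, applied in both modes.
        gain += STEP * (TARGET / amplitude - gain)
        remembered[source] = gain
    return off_target


# Two hypothetical talkers, one quiet and one loud, alternating every 30 frames.
frames = ([("quiet", 0.1)] * 30 + [("loud", 0.8)] * 30) * 2
amplitude_only = count_off_target_frames(frames, use_source_info=False)
source_aware = count_off_target_frames(frames, use_source_info=True)
```

In this toy model the source-aware variant only helps from the second turn of each talker onward, because the first turn has no remembered gain to restore; `source_aware` therefore ends up smaller than `amplitude_only` for the alternating sequence above.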

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

Other features, objects, and advantages of the present disclosure will become more apparent from the following detailed description of non-limiting embodiments when taken in conjunction with the accompanying drawings. In the drawings:

FIG. 1 shows a flow chart of a sound signal processing method according to an embodiment of the present disclosure;

FIG. 2 shows a flow chart of a sound signal processing method according to an embodiment of the present disclosure;

FIG. 3 shows a schematic diagram of a sound signal processing method according to an embodiment of the present disclosure;

FIG. 4 shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 5 shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 6A shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 6B shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 7A shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 7B shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 7C illustrates a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

fig. 8 shows a flow chart of a sound signal processing method according to an embodiment of the present disclosure;

fig. 9 shows a flow chart of a sound signal processing method according to an embodiment of the present disclosure;

fig. 10 shows a flow chart of a sound signal processing method according to an embodiment of the present disclosure;

fig. 11 shows a flow chart of a sound signal processing method according to an embodiment of the present disclosure;

fig. 12 shows a schematic diagram of a sound signal processing method according to an embodiment of the present disclosure;

FIG. 13 illustrates a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 14 shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 15A shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 15B shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 16A shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 16B shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

FIG. 16C shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure;

fig. 17 shows a block diagram of a structure of a sound signal processing apparatus according to an embodiment of the present disclosure;

fig. 18 shows a block diagram of a structure of a sound signal processing apparatus according to an embodiment of the present disclosure;

fig. 19 shows a block diagram of a structure of a sound signal processing apparatus according to an embodiment of the present disclosure;

FIG. 20 shows a block diagram of an electronic device according to an embodiment of the present disclosure;

fig. 21 shows a schematic structural diagram of a computer system suitable for implementing a sound signal processing method according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, exemplary embodiments of the present disclosure will be described in detail with reference to the accompanying drawings so that those skilled in the art can easily implement them. Also, for the sake of clarity, parts not relevant to the description of the exemplary embodiments are omitted in the drawings.

In the present disclosure, it is to be understood that terms such as "including" or "having," etc., are intended to indicate the presence of the disclosed features, numbers, steps, behaviors, components, parts, or combinations thereof, and are not intended to preclude the possibility that one or more other features, numbers, steps, behaviors, components, parts, or combinations thereof may be present or added.

It should be further noted that the embodiments and features of the embodiments in the present disclosure may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.

As described above, in the process of a teleconference or a video conference by multiple users, when the electronic device at one end acquires multiple input sound signals of multiple users, the amplitude of the input sound signals varies from person to person, and therefore, when the input sound signals of the multiple users are switched, the adjustment speed of the automatic gain control is not timely, the automatic gain control cannot reach a steady state in time, and the effect of the teleconference or the video conference is affected.

The present disclosure is made to solve, at least in part, the problems in the prior art that the inventors have discovered.

Fig. 1 illustrates a flow chart of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 1, the sound signal processing method includes the following steps S101 to S103:

in step S101, an input sound signal is acquired;

in step S102, acquiring a processing parameter determined according to sound source information of the input sound signal;

in step S103, the input sound signal is processed according to the processing parameter, and an output sound signal is obtained.

According to the embodiment of the present disclosure, the processing parameter includes a gain coefficient, and may also include other parameters for processing the input sound signal.
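Steps S101 to S103 can be illustrated with a minimal Python sketch. It assumes that the processing parameter is a single gain coefficient and that the input sound signal is one frame of floating-point samples; the function name `process_sound_signal` and the two callables `identify_source` and `gain_for_source` are hypothetical placeholders and are not defined by the disclosure.

```python
from typing import Callable, List, Sequence


def process_sound_signal(
    samples: Sequence[float],
    identify_source: Callable[[Sequence[float]], str],
    gain_for_source: Callable[[str], float],
) -> List[float]:
    """Run steps S101 to S103 on one frame of samples (illustrative only)."""
    # S101: `samples` is the acquired input sound signal.
    # S102: obtain the processing parameter (assumed here to be a gain
    # coefficient) determined from the sound source information.
    source_id = identify_source(samples)
    gain = gain_for_source(source_id)
    # S103: process the input sound signal to obtain the output sound signal.
    return [gain * s for s in samples]
```

How the sound source is identified and how the gain is chosen for it are left to the callables, matching the way the disclosure treats them as separate determinations.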

According to the embodiment of the disclosure, during a teleconference or video conference by multiple users, multiple users may input sound through the audio input device, wherein the audio input device may include one or more microphones, for example, multiple users may input sound through one common microphone, or multiple users may input sound through respective microphones, or multiple users may input sound through a microphone array. An input sound signal, which refers to an electrical signal converted from a sound input by a user, may be acquired by the electronic device. The electronic device may be an electronic device that contains or is connected to the audio input device, or the electronic device may be a server or a remote electronic device that communicates with the electronic device that contains or is connected to the audio input device. The server may for example comprise an edge server of a content distribution network or a conference server for providing a network conference service, and the electronic device may for example be any one or combination of: a handheld terminal device, a notebook computer, a cellular phone, a smart phone, a personal digital assistant computer, a tablet computer, a cordless phone, an internet of things device (IOT device), or other terminal devices, which are not specifically limited in this disclosure.

According to the embodiments of the present disclosure, since the input sound signals of the multiple users are different from person to person, in order to quickly adjust the gain control parameter when the input sound signals of the multiple users are switched, the sound source information of the input sound signals can be obtained, where the sound source information includes information for determining the source of the input sound signals, for example, for determining the user who utters the sound corresponding to the input sound signals. Because the amplitudes of the input sound signals corresponding to different users are different, the processing parameters can be determined according to the sound source information of the input sound signals, that is, the corresponding processing parameters are determined according to different sound sources (for example, different users), the corresponding input sound signals are processed according to the determined processing parameters, and the corresponding output sound signals are obtained, so that the corresponding automatic gain control is performed on the input sound signals of different users.

According to an embodiment of the present disclosure, the sound source information includes at least one of: voiceprint information of the input sound signal, orientation information of a sound source of the input sound signal, identification information of a microphone generating the input sound signal, a geographical location of the input sound signal, and environment information of the input sound signal.

According to an embodiment of the present disclosure, the sound source information may include voiceprint information of the input sound signal. The voiceprint information can be acquired from the input sound signal by using a voiceprint recognition model; because different users have different voiceprint characteristics, the source user of the sound can be determined from the voiceprint information of the input sound signal. The present disclosure does not specifically limit the voiceprint recognition model, and any model capable of recognizing voiceprint information falls within the protection scope of the embodiments of the present disclosure, for example, an i-vector model, an x-vector model, or a deep learning model.

According to an embodiment of the present disclosure, the sound source information may include orientation information of a sound source of the input sound signal. A direction of arrival (DOA) estimation algorithm may be used to estimate the direction of arrival of the input sound signal, where the direction of arrival indicates the direction from which the sound corresponding to the input sound signal reaches the microphone array. The orientation information of the target sound source corresponding to the input sound signal can thereby be determined, and the source user of the sound can be determined according to that orientation information.
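As a hedged illustration of how orientation information might be mapped to a source user, the sketch below assumes that a DOA estimate in degrees is already available and that each user sits at a known angle relative to the microphone array. The seat-angle table, the tolerance value, and the function names are assumptions made for this example; the DOA estimation itself is not shown.

```python
from typing import Dict, Optional


def angular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def source_from_doa(
    doa_deg: float,
    seat_angles: Dict[str, float],   # hypothetical: user id -> seat angle
    tolerance_deg: float = 20.0,     # hypothetical matching tolerance
) -> Optional[str]:
    """Return the user whose seat angle is closest to the estimated DOA,
    or None when no seat lies within the tolerance."""
    best = min(
        seat_angles,
        key=lambda uid: angular_distance(doa_deg, seat_angles[uid]),
        default=None,
    )
    if best is None or angular_distance(doa_deg, seat_angles[best]) > tolerance_deg:
        return None
    return best
```

The distance is computed on the circle so that an estimate of 350 degrees is still matched to a seat at 0 degrees.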

According to an embodiment of the present disclosure, the sound source information may include identification information of a microphone that generates the input sound signal, where the identification information of the microphone is used to identify different microphones, and the identification information is not specifically limited in the present disclosure and may be selected according to actual needs. When a plurality of users input sound signals through their own microphones, respectively, since different microphones may have different identification information, it is possible to determine the source user of the sound based on the identification information of the microphone that generates the input sound signal.

According to an embodiment of the present disclosure, the sound source information may include a geographical position of the input sound signal, where the geographical position indicates where the sound source is located, so that different processing parameters may be determined for different geographical positions of the sound source. For example, a larger gain coefficient may be used for a geographical position with high environmental noise, such as a kitchen or a mall, and a smaller gain coefficient may be used for a geographical position with low environmental noise, such as a conference room or an office.

According to an embodiment of the present disclosure, the sound source information may include environment information of the input sound signal, where the environment information represents the environment in which the sound source is located, so that different processing parameters may be determined according to that environment. For example, a larger gain coefficient may be used in a noisy environment and a smaller gain coefficient in a quiet environment.

According to the technical solution provided by the embodiments of the present disclosure, the sound source information can be determined according to one or more of the following: voiceprint information of the input sound signal, orientation information of a sound source of the input sound signal, identification information of a microphone generating the input sound signal, the geographical position of the input sound signal, and environment information of the input sound signal. The source user of the sound corresponding to the input sound signal can thus be conveniently determined, and corresponding processing parameters can be acquired for the input sound signals of different source users; alternatively, corresponding processing parameters can be acquired according to the geographical position or the environment information of the sound source of the input sound signal.

According to an embodiment of the present disclosure, the step S102 of acquiring the processing parameter determined according to the sound source information of the input sound signal includes: when it is determined according to the sound source information that the sound source of the input sound signal is different from the sound source of the most recently processed input sound signal, acquiring the processing parameter determined according to the sound source information of the input sound signal.

According to an embodiment of the present disclosure, after the sound source information of the current input sound signal is determined, the sound source of the current input sound signal may be compared with the sound source of the most recently processed input sound signal. When the two sound sources are different, the user corresponding to the input sound signal has changed between the two signals; because the input sound signals of different users differ considerably, the corresponding processing parameters need to be determined according to the sound source information of the current input sound signal.

According to an embodiment of the present disclosure, the step S102 of acquiring the processing parameter determined according to the sound source information of the input sound signal includes: when it is determined according to the sound source information that the sound source of the input sound signal is the same as the sound source of the most recently processed input sound signal, adjusting the processing parameters according to a preset rule so that the amplitude of the output sound signal meets a preset condition.

According to an embodiment of the present disclosure, after the sound source information of the current input sound signal is determined, the sound source of the current input sound signal may be compared with the sound source of the most recently processed input sound signal. When the two sound sources are the same, the user corresponding to the input sound signal has not changed between the two signals, and the processing parameters may be adjusted according to the preset rule. The preset rule and the preset condition are not specifically limited in the present disclosure and can be selected according to actual needs. For example, the preset condition may be that the amplitude of the output sound signal is within a preset range, and the preset rule may be that the value of the gain coefficient is adjusted so that the amplitude of the output sound signal obtained by amplifying the input sound signal by the gain coefficient is within the preset range.

The embodiment of the present disclosure will be described by taking the input sound signals of three users as an example; it should be understood that this example is illustrative only and does not limit the present disclosure. Assume that the electronic device can obtain input sound signals of user A, user B, and user C, and that the sound source of the input sound signal processed at time t₁ is user A. Suppose that at time t₂ the sound source of the input sound signal is determined, according to the sound source information, to be user B. Because the sound source at time t₂ (user B) differs from the sound source of the input sound signal processed at time t₁ (user A), the processing parameters for the input sound signal of user B can be determined according to the sound source information of the input sound signal at time t₂. Suppose that at time t₃ the sound source of the input sound signal is again determined to be user B. Because the sound sources of the input sound signals at times t₂ and t₃ are both user B, and the input sound signal of the same user in a fixed application scenario generally does not fluctuate greatly, the processing parameters can be adjusted according to the preset rule so that the amplitude of the output sound signal corresponding to the input sound signal meets the preset condition.

According to the technical scheme provided by the embodiment of the disclosure, whether the sound source of the current input sound signal is the same as the sound source of the input sound signal processed last time is compared, and different processing parameter determining modes are adopted, so that the processing parameters can be determined efficiently and quickly according to the input sound signals of different users.
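The two branches described above (a changed sound source versus an unchanged one) can be sketched as follows. This is only one possible reading: the per-source starting gains, the target peak amplitude, and the step-wise update used as the "preset rule" are all assumptions chosen for illustration, and the class name `SourceAwareAgc` is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SourceAwareAgc:
    initial_gain: Dict[str, float]      # hypothetical per-source gains
    target_peak: float = 0.5            # hypothetical preset amplitude target
    step: float = 0.1                   # hypothetical adaptation rate
    last_source: Optional[str] = None
    gain: float = 1.0

    def process(self, source_id: str, frame: List[float]) -> List[float]:
        if source_id != self.last_source:
            # Sound source changed: jump straight to the gain determined
            # from the sound source information.
            self.gain = self.initial_gain.get(source_id, 1.0)
            self.last_source = source_id
        else:
            # Same sound source: nudge the gain toward the value that would
            # bring the frame peak to the target (assumed preset rule).
            peak = max((abs(s) for s in frame), default=0.0)
            if peak > 0.0:
                desired = self.target_peak / peak
                self.gain += self.step * (desired - self.gain)
        return [self.gain * s for s in frame]
```

With starting gains of 1.0 for user A and 4.0 for user B, a switch from A to B applies 4.0 to the first frame from B immediately, and later frames from B only move the gain gradually.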

According to an embodiment of the present disclosure, the sound signal processing method further includes: determining the processing parameters according to a preset correspondence between the sound source information and the processing parameters.

According to an embodiment of the present disclosure, a preset correspondence between one or more pieces of sound source information and the corresponding processing parameters may be established in advance. For example, assume that the electronic device can obtain input sound signals of N users, where N is an integer greater than or equal to 1. If the input sound signal is determined, according to the sound source information, to originate from user i, where i is an integer greater than or equal to 1 and less than or equal to N, the corresponding processing parameters may be parameters in an i-th value range.

Table 1 shows the preset correspondence between the pre-established sound source information and the processing parameters:

Sound source information	Processing parameters
Sound source information corresponding to user 1	First value range
Sound source information corresponding to user 2	Second value range
……	……
Sound source information corresponding to user N	Nth value range

According to an embodiment of the present disclosure, after the sound source information of the current input sound signal is determined, the processing parameter corresponding to the current input sound signal may be determined according to the preset correspondence between the sound source information and the processing parameters. For example, if the sound source information corresponds to user 1, the processing parameter may be a parameter in the first value range; if the sound source information corresponds to user 2, the processing parameter may be a parameter in the second value range; and if the sound source information corresponds to user N, the processing parameter may be a parameter in the Nth value range.

According to the technical scheme provided by the embodiment of the disclosure, through the pre-established preset corresponding relationship between the sound source information and the processing parameters, after the sound source information is determined, the processing parameters corresponding to the sound source information can be rapidly determined through the preset corresponding relationship.
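A preset correspondence of this kind can be sketched as a lookup table that maps each sound source to a value range. In the sketch below the ranges are interpreted as (minimum, maximum) bounds on a gain coefficient, and a proposed gain is clamped into the range of the identified source; the concrete numbers, the identifiers `user_1` and `user_2`, and the clamping step are assumptions for illustration rather than details given in the disclosure.

```python
from typing import Dict, Tuple

# Hypothetical preset correspondence: sound source -> (min gain, max gain).
PRESET_RANGES: Dict[str, Tuple[float, float]] = {
    "user_1": (0.5, 1.5),   # "first value range"
    "user_2": (1.5, 3.0),   # "second value range"
}


def gain_from_preset(
    source_id: str,
    proposed_gain: float,
    default_range: Tuple[float, float] = (1.0, 1.0),
) -> float:
    """Clamp a proposed gain into the value range preset for the source.

    Unknown sources fall back to `default_range` (an assumption)."""
    low, high = PRESET_RANGES.get(source_id, default_range)
    return min(max(proposed_gain, low), high)
```

Because the lookup is a dictionary access, the parameter is available as soon as the sound source is known, without waiting for an amplitude-driven loop to converge.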

Fig. 2 illustrates a flow chart of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 2, the sound signal processing method further includes the following steps S201 to S202:

in step S201, determining previous processing parameters for processing a previous input sound signal from a sound source corresponding to the sound source information according to the sound source information;

in step S202, the processing parameters are determined from the previous processing parameters.

Since the input sound signal of the same user in a fixed application scene generally does not have large fluctuation, the current processing parameters of the input sound signal of the user can be determined according to the previous processing parameters for processing the input sound signal of the same user. According to the embodiments of the present disclosure, according to the sound source information, a previous processing parameter for processing a previous input sound signal from a sound source corresponding to the sound source information may be determined, and a processing parameter corresponding to the current input sound signal may be determined according to the previous processing parameter, that is, the previous processing parameter of the previous input sound signal with the same sound source may be used as a reference for determining the processing parameter corresponding to the current input sound signal, for example, the previous processing parameter may be fine-tuned to obtain the processing parameter corresponding to the current input sound signal, or the previous processing parameter may be used as the processing parameter corresponding to the current input sound signal.

According to the technical scheme provided by the embodiment of the disclosure, the processing parameters corresponding to the current input sound signal can be quickly acquired by referring to the previous processing parameters of the previous input sound signal of the same sound source, so that the currently applicable processing parameters can be quickly determined.
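Steps S201 and S202 can be illustrated by keeping the last gain used for each sound source and reusing it as the starting point the next time that source is heard. The class and method names below are hypothetical, and reusing the previous value unchanged is only one of the two options mentioned above (the other being to fine-tune it).

```python
from typing import Dict


class PreviousGainStore:
    """Remembers the last gain used for each sound source (illustrative)."""

    def __init__(self, default_gain: float = 1.0) -> None:
        self._default = default_gain          # used for sources never seen
        self._previous: Dict[str, float] = {}

    def starting_gain(self, source_id: str) -> float:
        # S201: look up the previous processing parameter for this source.
        # S202: use it directly as the current processing parameter.
        return self._previous.get(source_id, self._default)

    def remember(self, source_id: str, gain: float) -> None:
        # Record the gain after a frame from this source was processed.
        self._previous[source_id] = gain
```

A caller would invoke `starting_gain` when a source becomes active and `remember` after each processed frame, so that every user resumes from the gain last found suitable for that user.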

According to an embodiment of the present disclosure, the method is performed by a first electronic device comprising or connected to a microphone generating the input sound signal; or by a server in communication with the first electronic device; or by a second electronic device in communication with the first electronic device.

Fig. 3 shows a schematic diagram of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 3, user A, user B, and user C may input sound signals through microphones. The microphones generating the input sound signals may be connected to a first electronic device 301, and the first electronic device 301 may communicate with a server 302, with a second electronic device 303 directly, or with the second electronic device 303 via the server 302. It should be understood that this example is illustrative only and does not limit the present disclosure: the number of users, microphones, electronic devices, and servers, as well as the types and connection manners of the microphones, electronic devices, and servers, may be set according to actual needs.

The sound signal processing method in the embodiment of the present disclosure may be executed by the first electronic device 301, and the first electronic device 301 may locally and quickly acquire the processing parameters and output the sound signal, so that the real-time performance of sound signal processing may be improved. Meanwhile, the sound corresponding to the sound signal can be output through the audio output equipment (loudspeaker), so that the user can timely adjust the strength of the subsequent input sound signal according to the volume of the output sound. For example, if the user a finds that the volume of the output sound is relatively small, the intensity of the subsequent input sound signal may be increased, i.e., the volume of the input sound is increased.

The sound signal processing method in the embodiment of the present disclosure may be performed by a server 302 in communication with the first electronic device 301, wherein the server 302 may include a conference server of a web conference or an edge server of a content distribution network. After the server 302 acquires the output sound signal, the output sound signal may be transmitted to the first electronic device 301, thereby reducing the processing load of the near-end electronic device, i.e., the first electronic device 301. Meanwhile, after the server 302 acquires the output sound signal, the output sound signal can be sent to other electronic devices which need to acquire the output sound signal and communicate with the server 302, such as the second electronic device 303, so that the processing load of the remote electronic device is reduced.

The sound signal processing method in the embodiment of the present disclosure may be executed by the second electronic device 303 communicating with the first electronic device 301, and after the second electronic device 303 acquires the output sound signal, the user D and the user E local to the second electronic device 303 may acquire the output sound corresponding to the output sound signal in time, that is, the output sound corresponding to the sound signal input by the user a, the user B, and the user C may be acquired in time.

According to the technical solution provided by the embodiment of the present disclosure, according to the needs of an application scenario, a first electronic device, a second electronic device, or a server may be set as an execution subject of the sound signal processing method according to the embodiment of the present disclosure, so as to reduce the processing load of the electronic device and/or improve the real-time performance of the response of the electronic device, where the electronic device includes the first electronic device, the second electronic device, or the server.

According to an embodiment of the disclosure, when the method is performed by a first electronic device comprising or connected to a microphone generating the input sound signal, the obtaining the processing parameter comprises:

the processing parameters are obtained locally at the first electronic device.

Fig. 4 shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure. As shown in fig. 4, user A, user B, and user C may input sound signals through microphones, and a microphone generating the input sound signal may be connected to the first electronic device 401. It should be understood that this example is illustrative only and does not limit the present disclosure: the number of users, microphones, and first electronic devices 401, as well as the types and connection manners of the microphones and the first electronic device 401, may be set according to actual needs.

According to an embodiment of the present disclosure, for example, after user A inputs a sound signal through a microphone, the first electronic device 401 connected to the microphone may acquire the input sound signal. The first electronic device 401 may determine the sound source information of the input sound signal according to the input sound signal, for example, determine that the input sound signal originates from user A. The first electronic device 401 may determine, according to the sound source information, a processing parameter corresponding to the current input sound signal of user A, process the current input sound signal of user A according to the processing parameter, and acquire an output sound signal. User A, user B, and user C can hear the output sound corresponding to the output sound signal through the audio output device (loudspeaker) of the first electronic device 401.

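Alternatively, when the method is performed by the first electronic device, obtaining the processing parameter comprises sending the sound source information to a first external device and receiving the processing parameter from the first external device.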

According to an embodiment of the present disclosure, the first external device is a server in communication with the first electronic device.

Fig. 5 shows a schematic diagram of obtaining the processing parameters according to an embodiment of the disclosure. As shown in fig. 5, the embodiment of the present disclosure will be described by taking the first external device as a server 502 communicating with the first electronic device 501 as an example, and it should be understood that this example is only used as an example and is not a limitation to the present disclosure.

According to the embodiment of the present disclosure, after the first electronic device 501 acquires the input sound signal, the sound source information of the input sound signal may be determined according to the input sound signal. In order to reduce the processing load of the first electronic device 501, the sound source information may be transmitted to the server 502 communicating with the first electronic device 501, and the server 502 may determine the processing parameters from the sound source information. The server 502, after obtaining the processing parameters, may send the processing parameters to the first electronic device 501. The first electronic device 501, after acquiring the processing parameters, may process the input sound signal according to the processing parameters and acquire the output sound signal.

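According to an embodiment of the present disclosure, when the method is performed by a server in communication with the first electronic device, the server obtains the processing parameter either by determining it from the sound source information of an input sound signal received from the first electronic device, or by receiving it from the first electronic device together with the input sound signal.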

Fig. 6A shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 6A, after the first electronic device 601A acquires the input sound signal, in order to reduce the processing load of the first electronic device 601A, the input sound signal may be transmitted to the server 602A communicating with the first electronic device 601A. The server 602A may determine sound source information from the input sound signal and determine processing parameters from the sound source information. After acquiring the processing parameters, the server 602A may process the input sound signal according to the processing parameters and acquire the output sound signal, so as to send the output sound signal to the electronic device that needs to acquire the output sound signal, for example, the first electronic device 601A, or other electronic devices that need to acquire the output sound signal.

Fig. 6B shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 6B, after the first electronic device 601B acquires the input sound signal, it may determine sound source information according to the input sound signal, and determine a processing parameter corresponding to the input sound signal according to the sound source information. To offload part of the processing load of the first electronic device 601B, the first electronic device 601B may transmit the input sound signal and the processing parameter corresponding to the input sound signal to the server 602B in communication with the first electronic device 601B. After acquiring the input sound signal and the processing parameter, the server 602B may process the input sound signal according to the processing parameter and acquire the output sound signal. The server 602B may then send the output sound signal to an electronic device that needs it, for example, the first electronic device 601B, thereby reducing the processing load of the first electronic device 601B to some extent; or the server 602B may send the output sound signal to other electronic devices that need it, thereby reducing the processing load of those devices to some extent.

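According to an embodiment of the present disclosure, when the method is performed by a second electronic device in communication with the first electronic device, obtaining the processing parameter includes determining the processing parameter locally at the second electronic device, or receiving the processing parameter from a second external device.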

According to an embodiment of the present disclosure, the second external device is the first electronic device or the server.

Fig. 7A shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 7A, after the first electronic device 701A acquires the input sound signal, the input sound signal may be transmitted to the second electronic device 703A in communication with the first electronic device 701A. The second electronic device 703A may determine sound source information from the input sound signal and determine a processing parameter corresponding to the input sound signal from the sound source information. After acquiring the processing parameter, the second electronic device 703A may process the input sound signal according to the processing parameter, and acquire the output sound signal, so that a user local to the second electronic device 703A may acquire the output sound corresponding to the output sound signal.

Fig. 7B shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 7B, the embodiment of the present disclosure will be described by taking the second external device as the first electronic device 701B as an example, and it should be understood that this example is used only as an example and is not a limitation to the present disclosure.

According to the embodiment of the disclosure, after the first electronic device 701B acquires the input sound signal, the sound source information may be determined according to the input sound signal, and the processing parameter corresponding to the input sound signal may be determined according to the sound source information. The first electronic device 701B may transmit the input sound signal and the processing parameter corresponding to the input sound signal to the second electronic device 703B that is in communication with the first electronic device 701B. After acquiring the processing parameters, the second electronic device 703B may process the input sound signal according to the processing parameters, and acquire the output sound signal, so that a user local to the second electronic device 703B may acquire the output sound corresponding to the output sound signal, and at the same time, may reduce the processing loads of the first electronic device 701B and the second electronic device 703B to a certain extent.

Fig. 7C shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 7C, the embodiment of the present disclosure will be described by taking the second external device as the server 702C, and it should be understood that this example is only used as an example and is not a limitation to the present disclosure.

According to an embodiment of the present disclosure, after the first electronic device 701C acquires the input sound signal, the input sound signal may be transmitted to the second electronic device 703C in communication with the first electronic device 701C. The first electronic device 701C may also determine sound source information from the input sound signal and transmit the sound source information to the server 702C in communication with the first electronic device 701C. The server 702C may determine the processing parameter corresponding to the input sound signal from the sound source information, and may send the processing parameter to the second electronic device 703C. After acquiring the input sound signal and the processing parameter, the second electronic device 703C may process the input sound signal according to the processing parameter to obtain the output sound signal, so that a user local to the second electronic device 703C may acquire the output sound corresponding to the output sound signal, and at the same time, the processing loads of the first electronic device 701C, the second electronic device 703C, and the server 702C may be reduced to a certain extent.

According to the technical scheme provided by the embodiment of the disclosure, according to the needs of an application scenario, the first electronic device, the server or the second electronic device may be set as an execution subject for determining the processing parameters according to the sound source information, so as to reduce the processing load of the electronic device to a certain extent and/or improve the real-time performance of the response of the electronic device, where the electronic device includes the first electronic device, the second electronic device or the server.

According to an embodiment of the present disclosure, the processing parameter is determined according to a correspondence between the sound source information and the processing parameter; or

According to an embodiment of the present disclosure, when the first electronic device, the server, or the second electronic device is an execution subject that determines the processing parameter according to the sound source information, the processing parameter may be determined according to a correspondence between the sound source information and the processing parameter, or the processing parameter may be determined according to a previous processing parameter for processing a previously input sound signal from a sound source to which the sound source information corresponds. The specific determination method of the processing parameters is described in detail above, and is not described herein again.

Fig. 8 illustrates a flow chart of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 8, the sound signal processing method includes the following steps S801 to S803:

in step S801, a first sound signal and client information corresponding to the first sound signal are acquired;

in step S802, a processing parameter is acquired according to the client information;

in step S803, the first sound signal is processed according to the processing parameter to obtain a second sound signal.
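Steps S801 to S803 can be illustrated with a minimal Python sketch. It is not the implementation of the disclosure: the `ClientInfo` fields, the `GAIN_BY_CLIENT` table, the default gain, and the choice to treat the processing parameter as a single gain coefficient applied to each sample are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ClientInfo:
    """Client information accompanying a first sound signal (S801)."""
    client_id: str          # identification information of the client
    location: str = ""      # geographic location of the client (optional)
    environment: str = ""   # environment information of the client (optional)


# Hypothetical gain coefficients per client; the values are illustrative only.
GAIN_BY_CLIENT = {"client-A": 1.5, "client-B": 0.8}
DEFAULT_GAIN = 1.0


def get_processing_parameter(info: ClientInfo) -> float:
    """S802: obtain the processing parameter from the client information."""
    return GAIN_BY_CLIENT.get(info.client_id, DEFAULT_GAIN)


def process(first_signal: Sequence[float], gain: float) -> List[float]:
    """S803: apply the gain coefficient to obtain the second sound signal."""
    return [gain * sample for sample in first_signal]


def handle(first_signal: Sequence[float], info: ClientInfo) -> List[float]:
    """Run S802 and S803 on a signal acquired in S801."""
    return process(first_signal, get_processing_parameter(info))
```

For example, `handle([0.5, -0.5], ClientInfo("client-A"))` scales both samples by the gain stored for the hypothetical client `client-A`, while an unknown client falls back to `DEFAULT_GAIN`.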

According to the embodiment of the present disclosure, during a teleconference or video conference by multiple users, each user may input sound through one or more clients, for example, each user may input sound through one fixed client, or each user may input sound through different clients, respectively. The client may be, for example, any one or a combination of the following: a handheld terminal device, a notebook computer, a cellular phone, a smart phone, a personal digital assistant, a tablet computer, a cordless phone, an Internet of Things (IoT) device, or another terminal device, which is not specifically limited in this disclosure.

According to an embodiment of the present disclosure, the method is performed by a server in communication with the client.

According to an embodiment of the present disclosure, a first sound signal, which refers to an electrical signal converted from a sound input by a user, may be acquired by a server communicating with a client. The server may include, for example, an edge server of a content distribution network or a conference server for providing a network conference service, which is not particularly limited in this disclosure.

According to an embodiment of the present disclosure, the client information includes at least one of: identification information of the client, a geographic location of the client, and environment information of the client.

According to an embodiment of the present disclosure, since the first sound signals of the plurality of users are different from person to person, in order to quickly adjust the gain control parameter when the first sound signals of the plurality of users are switched, client information corresponding to the first sound signals may be acquired, where the client information may include at least one of: identification information of the client, geographic location of the client, and environment information of the client. The identification information of the client is used for identifying different clients so as to judge the user emitting the sound corresponding to the first sound signal; the geographical position of the client is used for representing the geographical position of the client which receives the sound corresponding to the first sound signal; the environment information of the client is used for representing the environment information of the client receiving the sound corresponding to the first sound signal.

According to the embodiment of the present disclosure, the processing parameter includes a gain coefficient, and may also include other parameters for processing the first sound signal. Because the amplitudes of the first sound signals corresponding to different users are different, and the client used by each user is relatively fixed, the corresponding processing parameters can be determined according to the client information. In addition, the processing parameters may take into account the geographic location, environment information, and the like of the client that receives the sound corresponding to the first sound signal. For example, a larger gain coefficient may be selected for a geographic location with high environmental noise (e.g., a kitchen, a mall, etc.), and a smaller gain coefficient may be selected for a geographic location with low environmental noise (e.g., a conference room, an office, etc.); or different gain coefficients may apply when the user whose sound corresponds to the first sound signal is in different environments, for example, a larger gain coefficient for a noisy environment and a smaller gain coefficient for a quiet environment. Therefore, the processing parameters can be determined according to the client information corresponding to the first sound signal, that is, the corresponding processing parameters are determined according to different client information, the corresponding first sound signal is processed according to the determined processing parameters, and the corresponding second sound signal is obtained, so that the corresponding automatic gain control can be performed on the first sound signals of different users.
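As a rough illustration of how environment information might influence the gain coefficient, the following Python sketch scales a per-client base gain by an environment-dependent factor. The labels `"noisy"` and `"quiet"` and the scale factors are hypothetical; the disclosure only states that a noisier environment may use a larger gain coefficient and a quieter one a smaller gain coefficient.

```python
# Hypothetical scale factors; only their ordering (noisy > quiet) follows the text.
ENVIRONMENT_SCALE = {"noisy": 1.5, "quiet": 0.75}


def gain_for(base_gain: float, environment: str) -> float:
    """Scale a base gain coefficient by an assumed environment factor."""
    # Unknown environments leave the base gain unchanged.
    return base_gain * ENVIRONMENT_SCALE.get(environment, 1.0)
```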

According to the technical scheme provided by the embodiment of the disclosure, the first sound signal and the client information corresponding to the first sound signal are obtained, the processing parameter is obtained according to the client information, and the first sound signal is processed according to the processing parameter to obtain the second sound signal. According to the method and the device, the processing parameters are determined through the client information, intelligent automatic gain control is performed on the first sound signals of different users corresponding to different client information, and compared with the method and the device which only perform gain adjustment according to the amplitude of the first sound signal, the gain coefficient can be adjusted to a proper value more quickly, so that the method and the device are suitable for application scenarios of quick switching of the first sound signals of different users.

According to an embodiment of the present disclosure, the step S802, namely obtaining a processing parameter according to the client information, includes: and when the client information is different from the client information corresponding to the first sound signal processed last time, acquiring a processing parameter determined according to the client information.

According to the embodiment of the disclosure, after the client information corresponding to the current first sound signal is determined, it may be compared with the client information corresponding to the first sound signal processed last time. When the two differ, it indicates that the user corresponding to the current first sound signal is not the user corresponding to the previous first sound signal. Since the first sound signals of different users differ greatly, the corresponding processing parameters need to be determined according to the client information corresponding to the current first sound signal.

According to an embodiment of the present disclosure, the step S802, namely obtaining a processing parameter according to the client information, includes: and when the client information is the same as the client information corresponding to the first sound signal processed most recently, adjusting the processing parameter according to a preset rule so that the amplitude of the second sound signal meets a preset condition.

According to the embodiment of the present disclosure, after the client information corresponding to the current first sound signal is determined, it may be compared with the client information corresponding to the first sound signal processed last time. When the two are the same, it indicates that the user corresponding to the current first sound signal and the user corresponding to the previous first sound signal are the same. Since the first sound signal of the same user in a fixed application scenario does not fluctuate greatly, the processing parameter may be adjusted according to a preset rule so that the amplitude of the second sound signal corresponding to the first sound signal satisfies a preset condition; for example, the processing parameter may be determined according to a preset automatic gain control rule. The preset rules and the preset conditions are not specifically limited in the present disclosure, and can be selected according to actual needs. For example, the preset condition may be that the amplitude of the second sound signal is within a preset range, and the preset rule may be that the value of the gain coefficient is adjusted so that the amplitude of the second sound signal obtained by amplifying the first sound signal by the gain coefficient is within the preset range.

The embodiment of the present disclosure will be described by taking the first sound signals of three users as an example, and it should be understood that this example is only used as an example and is not a limitation to the present disclosure. For example, suppose that user A inputs sound through client A, user B inputs sound through client B, and user C inputs sound through client C; the server may acquire the first sound signals of user A, user B, and user C, respectively. Assume that the client corresponding to the first sound signal processed at time t₁ is client A, that is, user A. Assume that at time t₂ the client corresponding to the first sound signal is determined, according to the client information, to be client B, that is, user B. Because client B corresponding to the first sound signal at time t₂ is different from client A corresponding to the first sound signal processed at time t₁, the corresponding processing parameters may be determined for the first sound signal of user B according to the client information corresponding to the first sound signal at time t₂. Assume that at time t₃ the client corresponding to the first sound signal is determined, according to the client information, to be client B, that is, user B. Because the client corresponding to the first sound signal at time t₃ and the client corresponding to the first sound signal processed at time t₂ are both client B, that is, user B, and because the first sound signal of the same user in a fixed application scenario generally does not fluctuate greatly, the processing parameters may be adjusted according to the preset rule so that the amplitude of the second sound signal corresponding to the first sound signal meets the preset condition.
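The two branches walked through above (client changed versus client unchanged) can be sketched in Python as follows. This is only one possible reading: the class name `GainSelector`, the amplitude range `[low, high]` standing in for the preset condition, the multiplicative `step` standing in for the preset rule, and the fallback gain of 1.0 are all assumptions and are not taken from the disclosure.

```python
from typing import Dict, Optional, Sequence


class GainSelector:
    """Picks a gain coefficient per frame, following the two branches of S802."""

    def __init__(self, table: Dict[str, float], low: float = 0.3,
                 high: float = 0.7, step: float = 0.1) -> None:
        self.table = table      # hypothetical client-to-gain correspondence
        self.low = low          # assumed lower bound of the preset amplitude range
        self.high = high        # assumed upper bound of the preset amplitude range
        self.step = step        # assumed relative adjustment per frame
        self.last_client: Optional[str] = None
        self.gain = 1.0

    def update(self, client_id: str, frame: Sequence[float]) -> float:
        if client_id != self.last_client:
            # Client changed: take the parameter determined from the client information.
            self.gain = self.table.get(client_id, 1.0)
            self.last_client = client_id
            return self.gain
        # Same client: nudge the gain so the output peak moves toward the range.
        peak = max((abs(s) for s in frame), default=0.0) * self.gain
        if peak > self.high:
            self.gain *= 1.0 - self.step
        elif 0.0 < peak < self.low:
            self.gain *= 1.0 + self.step
        return self.gain
```

In the t₁, t₂, t₃ example, the call at t₂ (client B after client A) would jump directly to the gain stored for client B, while the call at t₃ (client B again) would only make a small adjustment if the output peak fell outside the assumed range.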

According to the technical scheme provided by the embodiment of the disclosure, whether the client information corresponding to the current first sound signal is the same as the client information corresponding to the first sound signal processed most recently is compared, and different modes of determining the processing parameters are adopted, so that the processing parameters can be efficiently and quickly determined for the first sound signals of different users corresponding to different clients.

According to an embodiment of the present disclosure, the sound signal processing method further includes: and determining the processing parameters according to the preset corresponding relation between the client information and the processing parameters.

According to the embodiment of the present disclosure, a preset corresponding relationship between one or more pieces of client information and corresponding processing parameters may be established in advance, for example, it is assumed that a server may obtain first sound signals of N users and N pieces of client information corresponding to the first sound signals of the N users, where N is an integer greater than or equal to 1.

Table 2 shows the preset correspondence between the client information and the processing parameters established in advance:

Client information	Processing parameters
Client information 1 corresponding to user 1	First value range
Client information 2 corresponding to user 2	Second value range
……	……
Client information N corresponding to user N	Nth value range

According to the embodiment of the disclosure, after the client information corresponding to the current first sound signal is determined, the processing parameter corresponding to the current first sound signal may be determined according to the preset corresponding relationship between the client information and the corresponding processing parameter. For example, if the client information is the client information 1 corresponding to the user 1, the corresponding processing parameter may be a parameter in the first value range; if the client information is the client information 2 corresponding to the user 2, the corresponding processing parameter may be a parameter in the second value range; and if the client information is the client information N corresponding to the user N, the corresponding processing parameter may be a parameter in the Nth value range.

According to the technical scheme provided by the embodiment of the disclosure, through the preset corresponding relation between the client information and the processing parameters which is established in advance, after the client information is determined, the processing parameters corresponding to the client information can be rapidly determined through the preset corresponding relation.
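A preset correspondence of this kind can be sketched as a lookup table that maps client information to a value range. In the Python sketch below, the keys, the numeric ranges, and the idea of clamping a candidate gain into the range are assumptions for illustration; the disclosure only states that the processing parameter may be a parameter within the corresponding value range.

```python
from typing import Dict, Tuple

# Hypothetical value ranges (lower bound, upper bound) per client.
VALUE_RANGES: Dict[str, Tuple[float, float]] = {
    "client-1": (0.5, 1.0),   # "first value range"
    "client-2": (1.0, 2.0),   # "second value range"
}


def parameter_for(client_id: str, candidate_gain: float) -> float:
    """Return a gain inside the value range preset for this client."""
    low, high = VALUE_RANGES[client_id]
    # Clamp the candidate so the result always lies within [low, high].
    return min(max(candidate_gain, low), high)
```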

Fig. 9 illustrates a flowchart of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 9, the sound signal processing method further includes the following steps S901 to S902:

in step S901, determining a previous processing parameter for processing a previous first sound signal from a client corresponding to the client information according to the client information;

in step S902, the processing parameters are determined from the previous processing parameters.

Since the first sound signal of the same user in a fixed application scenario generally does not fluctuate much, the current processing parameters of the first sound signal of the same user (i.e. the same client) may be determined according to the previous processing parameters for processing the first sound signal of the same user. According to the embodiment of the present disclosure, a previous processing parameter for processing a previous first sound signal from the client corresponding to the client information may be determined according to the client information, and the processing parameter corresponding to the current first sound signal may be determined according to the previous processing parameter. That is, the previous processing parameter of the previous first sound signal of the same client may be used as a reference for determining the processing parameter corresponding to the current first sound signal. For example, the previous processing parameter may be fine-tuned to obtain the processing parameter corresponding to the current first sound signal, or the previous processing parameter may be used directly as the processing parameter corresponding to the current first sound signal.

According to the technical scheme provided by the embodiment of the disclosure, by referring to the previous processing parameters of the previous first sound signal of the same client, the processing parameters corresponding to the current first sound signal can be quickly acquired, so that the currently applicable processing parameters can be quickly determined.
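Steps S901 and S902 amount to remembering the last processing parameter used for each client and reusing it as a starting point. A minimal Python sketch, assuming the parameter is a single gain coefficient and that the hypothetical `PreviousParameterStore` lives in memory for the duration of a conference:

```python
from typing import Dict


class PreviousParameterStore:
    """Keeps the most recent gain coefficient used for each client."""

    def __init__(self, default_gain: float = 1.0) -> None:
        self._default = default_gain          # assumed fallback for unseen clients
        self._last: Dict[str, float] = {}

    def starting_gain(self, client_id: str) -> float:
        """S901/S902: reuse the previous parameter of the same client, if any."""
        return self._last.get(client_id, self._default)

    def remember(self, client_id: str, gain: float) -> None:
        """Record the gain that was just used for this client."""
        self._last[client_id] = gain
```

The returned value can be used as is or fine-tuned further, matching the two options described above.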

Fig. 10 illustrates a flowchart of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 10, the sound signal processing method includes the following steps S1001 to S1004:

in step S1001, an input sound signal is acquired;

in step S1002, a user ID corresponding to the input sound signal is acquired;

in step S1003, acquiring a processing parameter according to the user ID;

in step S1004, the input sound signal is processed according to the processing parameter, so as to obtain an output sound signal.

According to the embodiment of the disclosure, since the input sound signals of the multiple users are different from person to person, in order to quickly adjust the gain control parameter when the input sound signals of the multiple users are switched, the user ID corresponding to the input sound signals can be acquired, where the user ID includes user identification information for identifying different users corresponding to different input sound signals. Because the input sound signals corresponding to different users have different amplitudes, the processing parameters can be determined according to the user IDs corresponding to the input sound signals, that is, the corresponding processing parameters are determined according to different user IDs, the corresponding input sound signals are processed according to the determined processing parameters, and the corresponding output sound signals are obtained, so that the corresponding automatic gain control is performed on the input sound signals of different users.

According to the technical scheme provided by the embodiment of the disclosure, the input sound signal is acquired, the user ID corresponding to the input sound signal is acquired, the processing parameter is acquired according to the user ID, and the input sound signal is processed according to the processing parameter to obtain the output sound signal. According to the embodiment of the disclosure, the processing parameters are determined through the user ID, so that intelligent automatic gain control is performed on the input sound signals of different users, and compared with the method that gain adjustment is performed only according to the amplitude of the input sound signals, the gain coefficient can be adjusted to a proper value more quickly, so that the method and the device are suitable for application scenarios of quick switching of the input sound signals of different users.

According to an embodiment of the present disclosure, the acquiring a user ID corresponding to the input sound signal includes:

According to an embodiment of the present disclosure, a user ID corresponding to an input sound signal may be determined according to voiceprint information of the input sound signal. The voiceprint recognition model can be used for acquiring the voiceprint information of the input sound signal based on the input sound signal, and different users have different voiceprint characteristics, so that the user ID corresponding to the input sound signal can be judged according to the voiceprint information of the input sound signal. The voiceprint recognition model is not specifically limited in the disclosure, and any model capable of recognizing voiceprint information is within the protection scope of the embodiment of the disclosure, for example, an I-vector, an X-vector, a deep learning model, or the like.

According to the embodiment of the disclosure, the user ID corresponding to the input sound signal can be determined according to the microphone identification information corresponding to the input sound signal, wherein the identification information of the microphone is used for identifying different microphones. When a plurality of users input sound signals through their own microphones, respectively, since different microphones may have different identification information, the user ID corresponding to the input sound signal may be determined based on the identification information of the microphone that generates the input sound signal.

According to the embodiment of the disclosure, the user ID corresponding to the input sound signal can be determined according to the client information corresponding to the input sound signal, where the client information includes identification information of the client for identifying different clients, and the identification information of the client is not specifically limited by the disclosure and can be selected according to actual needs. When a plurality of users input sound signals through their own clients, respectively, since different clients may have different identification information, the user ID corresponding to the input sound signal may be determined based on the identification information of the client that generated the input sound signal.

According to an embodiment of the present disclosure, a user ID corresponding to an input sound signal may be determined according to semantic information corresponding to the input sound signal. Semantic information of a sound corresponding to the input sound signal may be analyzed using a semantic analysis model based on the input sound signal, for example, the semantic information may include identification information of a user, such as a name or a work number, and a user ID corresponding to the input sound signal may be acquired based on the semantic information, and thus, the user ID corresponding to the input sound signal may be determined from the semantic information corresponding to the input sound signal. The semantic analysis model is not specifically limited in the present disclosure, and any model capable of implementing semantic analysis is within the scope of the embodiments of the present disclosure.

According to the technical scheme provided by the embodiment of the disclosure, the user ID corresponding to the input sound signal can be determined according to one or more items of voiceprint information of the input sound signal, microphone identification information corresponding to the input sound signal, client information corresponding to the input sound signal, and semantic information corresponding to the input sound signal, so that corresponding processing parameters are acquired for the input sound signals of different users.
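The following Python sketch shows one way the cues listed above could be combined to resolve a user ID. The order of precedence (microphone, then client, then voiceprint), the cosine-similarity comparison, the threshold of 0.8, and all names are assumptions. The voiceprint embedding is taken as already computed by some voiceprint recognition model; no particular model or library is implied.

```python
import math
from typing import Dict, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def resolve_user_id(
    mic_to_user: Dict[str, str],
    client_to_user: Dict[str, str],
    enrolled: Dict[str, Sequence[float]],
    mic_id: Optional[str] = None,
    client_id: Optional[str] = None,
    embedding: Optional[Sequence[float]] = None,
    threshold: float = 0.8,
) -> Optional[str]:
    """Try microphone ID, then client ID, then voiceprint; None if unresolved."""
    if mic_id in mic_to_user:
        return mic_to_user[mic_id]
    if client_id in client_to_user:
        return client_to_user[client_id]
    if embedding is not None and enrolled:
        # Pick the enrolled user whose voiceprint is most similar to the input.
        best_user = max(enrolled, key=lambda u: cosine(embedding, enrolled[u]))
        if cosine(embedding, enrolled[best_user]) >= threshold:
            return best_user
    return None
```

Semantic information (for example a spoken name or work number) could be added as a further fallback in the same manner.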

According to an embodiment of the present disclosure, a processing parameter is acquired according to the user ID and at least one of: the geographical position of the user and the environment information of the user.

According to the embodiment of the disclosure, different gain coefficients may apply when the user is in different geographic locations, for example, a larger gain coefficient may be selected for a geographic location with higher environmental noise, and a smaller gain coefficient may be selected for a geographic location with lower environmental noise; or different gain coefficients may apply when the user is in different environments, for example, a larger gain coefficient for a noisy environment and a smaller gain coefficient for a quiet environment. Therefore, the processing parameters can be determined according to the geographic location and/or the environment information of the user, that is, the corresponding processing parameters are determined according to the different geographic locations and/or environment information of the user, the corresponding input sound signals are processed according to the determined processing parameters, and the corresponding output sound signals are obtained, so that the corresponding automatic gain control is performed on the input sound signals of users in different geographic locations and/or environments.

According to an embodiment of the present disclosure, the step S1003, obtaining a processing parameter according to the user ID, includes: and when the user ID is different from the user ID corresponding to the input sound signal processed last time, acquiring the processing parameter determined according to the user ID.

According to the embodiment of the present disclosure, after the user ID corresponding to the current input sound signal is determined, it may be compared with the user ID corresponding to the input sound signal processed last time. When the two differ, it indicates that the user corresponding to the current input sound signal is not the user corresponding to the previous input sound signal. Since the input sound signals of different users differ greatly, the corresponding processing parameters need to be determined according to the user ID corresponding to the current input sound signal.

According to an embodiment of the present disclosure, the step S1003, obtaining a processing parameter according to the user ID, includes: and when the user ID is the same as the user ID corresponding to the input sound signal processed last time, adjusting the processing parameter according to a preset rule so that the amplitude of the output sound signal meets a preset condition.

According to the embodiment of the present disclosure, after the user ID corresponding to the current input sound signal is determined, it may be compared with the user ID corresponding to the input sound signal processed last time. When the two are the same, it indicates that the user corresponding to the current input sound signal and the user corresponding to the previous input sound signal are the same. Since the input sound signal of the same user in a fixed application scenario generally does not fluctuate greatly, the processing parameter may be adjusted according to a preset rule so that the amplitude of the output sound signal meets a preset condition. The preset rules and the preset conditions are not specifically limited in the present disclosure, and can be selected according to actual needs. For example, the preset condition may be that the amplitude of the output sound signal is within a preset range, and the preset rule may be that the value of the gain coefficient is adjusted so that the amplitude of the output sound signal obtained by amplifying the input sound signal by the gain coefficient is within the preset range.

The embodiment of the present disclosure will be described by taking the input sound signals of three users as an example, and it should be understood that this example is only used as an example and is not a limitation to the present disclosure. For example, assume that the electronic device can obtain input sound signals of user A, user B, and user C. Assume that the user ID corresponding to the input sound signal processed at time t₁ is user A. Assume that at time t₂ the user ID corresponding to the input sound signal is determined to be user B. Because user B corresponding to the input sound signal at time t₂ is different from user A corresponding to the input sound signal processed at time t₁, the corresponding processing parameters may be determined for the input sound signal of user B according to the user ID corresponding to the input sound signal at time t₂. Assume that at time t₃ the user ID corresponding to the input sound signal is determined to be user B. Because the user ID corresponding to the input sound signal at time t₃ and the user ID corresponding to the input sound signal processed at time t₂ are both user B, and because the input sound signal of the same user in a fixed application scenario generally does not fluctuate greatly, the processing parameters may be adjusted according to the preset rule so that the amplitude of the output sound signal corresponding to the input sound signal meets the preset condition.

According to the technical scheme provided by the embodiment of the disclosure, the user ID corresponding to the current input sound signal is compared with the user ID corresponding to the input sound signal processed last time, and a different way of determining the processing parameters is adopted depending on whether the two are the same, so that the processing parameters can be determined efficiently and quickly for the input sound signals of different users.
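The comparison described above can be sketched as follows. This is a minimal illustration only, not the implementation of the disclosure: the table `PRESET_GAIN`, the range bounds `TARGET_LOW` and `TARGET_HIGH`, the step size `STEP`, and the function name `next_gain` are all hypothetical, and a fixed multiplicative step is merely one possible preset rule.

```python
from typing import Dict, Optional

# Hypothetical per-user starting gains (linear scale); values are illustrative.
PRESET_GAIN: Dict[str, float] = {"A": 1.0, "B": 2.0, "C": 0.5}

# Hypothetical preset range for the peak amplitude of the output signal.
TARGET_LOW, TARGET_HIGH = 0.4, 0.6

# Assumed relative step of the slow adjustment rule.
STEP = 0.1


def next_gain(user_id: str, prev_user_id: Optional[str],
              prev_gain: float, input_peak: float) -> float:
    """Return the gain coefficient to use for the current input signal."""
    if user_id != prev_user_id:
        # The user changed: take the gain associated with the new user ID.
        return PRESET_GAIN.get(user_id, 1.0)
    # Same user as last time: nudge the previous gain toward the preset range.
    out_peak = prev_gain * input_peak
    if out_peak > TARGET_HIGH:
        return prev_gain * (1.0 - STEP)
    if out_peak < TARGET_LOW:
        return prev_gain * (1.0 + STEP)
    return prev_gain
```

With these assumed values, a switch from user A to user B immediately yields the gain stored for user B, while consecutive signals from user B only change the gain in small steps.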

According to an embodiment of the present disclosure, the sound signal processing method further includes: and determining the processing parameters according to the preset corresponding relation between the user ID and the processing parameters.

According to the embodiments of the present disclosure, a preset correspondence between one or more user IDs and the corresponding processing parameters may be established in advance. For example, assume that the electronic device can acquire input sound signals of N users, where N is an integer greater than or equal to 1.

Table 3 shows the pre-established correspondence between user IDs and processing parameters:

User ID	Processing parameters
User ID corresponding to user 1	First value range
User ID corresponding to user 2	Second value range
……	……
User ID corresponding to user N	Nth value range

According to the embodiment of the present disclosure, after determining the user ID corresponding to the current input sound signal, the processing parameter corresponding to the current input sound signal may be determined according to the preset corresponding relationship between the user ID and the corresponding processing parameter. For example, if the user ID is the user ID corresponding to the user 1, the corresponding processing parameter may be a parameter in the first value range; assuming that the user ID is the user ID corresponding to the user 2, the corresponding processing parameter may be a parameter in the second value range; assuming that the user ID is the user ID corresponding to the user N, the corresponding processing parameter may be a parameter in the nth value range.

According to the technical scheme provided by the embodiment of the disclosure, because the correspondence between user IDs and processing parameters is established in advance, the processing parameters corresponding to a user ID can be quickly looked up once the user ID is determined.
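A preset correspondence of the kind shown in Table 3 can be sketched as a simple lookup. The sketch below is illustrative only: the user IDs, the numeric ranges, the fallback `DEFAULT_RANGE`, and the choice of the midpoint as the starting value are assumptions, since the disclosure does not specify how a parameter is picked within a value range.

```python
from typing import Dict, Tuple

GainRange = Tuple[float, float]

# Hypothetical stand-in for Table 3: user ID -> value range of the gain.
GAIN_RANGES: Dict[str, GainRange] = {
    "user_1": (0.8, 1.2),
    "user_2": (1.5, 2.5),
}

# Assumed fallback for a user ID that has no preset entry.
DEFAULT_RANGE: GainRange = (1.0, 1.0)


def lookup_range(user_id: str) -> GainRange:
    """Return the preset value range for the given user ID."""
    return GAIN_RANGES.get(user_id, DEFAULT_RANGE)


def initial_gain(user_id: str) -> float:
    """Pick a starting gain inside the range (midpoint, by assumption)."""
    low, high = lookup_range(user_id)
    return (low + high) / 2.0


def clamp_to_range(gain: float, user_id: str) -> float:
    """Keep an adjusted gain inside the value range preset for the user."""
    low, high = lookup_range(user_id)
    return min(max(gain, low), high)
```

Under these assumptions, `clamp_to_range(3.0, "user_2")` is limited to the upper bound of the second value range, and an unknown user ID falls back to unity gain.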

Fig. 11 illustrates a flow chart of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 11, the sound signal processing method further includes the following steps S1101-S1102:

in step S1101, determining previous processing parameters for processing a previous input sound signal from a user corresponding to the user ID, based on the user ID;

in step S1102, the processing parameters are determined from the previous processing parameters.

Since the input sound signal of the same user in a fixed application scene generally does not fluctuate greatly, the current processing parameters for the input sound signal of a user can be determined according to the previous processing parameters used for processing the input sound signal of the same user. According to the embodiment of the present disclosure, a previous processing parameter for processing a previous input sound signal from the user corresponding to a user ID may be determined according to the user ID, and the processing parameter corresponding to the current input sound signal may be determined according to the previous processing parameter. That is, the previous processing parameter of the previous input sound signal of the same user serves as a reference for determining the processing parameter corresponding to the current input sound signal. For example, the previous processing parameter may be fine-tuned to obtain the processing parameter corresponding to the current input sound signal, or the previous processing parameter may be used directly as the processing parameter corresponding to the current input sound signal.

According to the technical scheme provided by the embodiment of the disclosure, the processing parameters corresponding to the current input sound signals can be quickly acquired by referring to the previous processing parameters of the previous input sound signals of the same user, so that the currently applicable processing parameters can be quickly determined.
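Steps S1101 and S1102 can be sketched as a small per-user store of the last gain used. This is a minimal sketch under stated assumptions: the class name `PreviousGainStore`, the default gain of 1.0 for a user seen for the first time, and the additive `fine_tune` offset are hypothetical and are not taken from the disclosure.

```python
from typing import Dict


class PreviousGainStore:
    """Remembers the last gain used for each user ID (illustrative only)."""

    def __init__(self, default_gain: float = 1.0) -> None:
        self._last: Dict[str, float] = {}
        self._default = default_gain  # assumed value for a first-time user

    def previous(self, user_id: str) -> float:
        """S1101: look up the previous processing parameter by user ID."""
        return self._last.get(user_id, self._default)

    def current(self, user_id: str, fine_tune: float = 0.0) -> float:
        """S1102: derive the current parameter from the previous one.

        With fine_tune == 0.0 the previous parameter is reused unchanged;
        otherwise it is shifted by the given (assumed additive) offset.
        """
        gain = self.previous(user_id) + fine_tune
        self._last[user_id] = gain
        return gain
```

For example, after `store.current("A", 0.5)` a later call `store.current("A")` returns the same value again, even if signals from other users were processed in between.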

Fig. 12 shows a schematic diagram of a sound signal processing method according to an embodiment of the present disclosure. As shown in fig. 12, user A, user B, and user C can input sound signals through microphones. The microphones generating the input sound signals may be connected to a first electronic device 1201, and the first electronic device 1201 may communicate with a server 1202, with a second electronic device 1203, or with the second electronic device 1203 via the server 1202. It should be understood that this example is illustrative only and does not limit the present disclosure; the number of users, microphones, electronic devices, and servers, as well as the types and connection manners of the microphones, electronic devices, and servers, may be set according to actual needs, and the present disclosure does not specifically limit them.

The sound signal processing method in the embodiment of the present disclosure may be executed by the first electronic device 1201, and the first electronic device 1201 may locally and quickly acquire the user ID, the processing parameter, and the output sound signal corresponding to the input sound signal, so that the real-time performance of sound signal processing may be improved. Meanwhile, the sound corresponding to the output sound signal can be output through an audio output device (loudspeaker), so that the user can adjust the intensity of subsequent input sound signals in time according to the volume of the output sound. For example, if user A finds that the volume of the output sound is relatively low, user A may increase the intensity of the subsequent input sound signal, i.e., speak louder.

The sound signal processing method in the embodiments of the present disclosure may be performed by a server 1202 communicating with the first electronic device 1201, wherein the server 1202 may include a conference server of a web conference or an edge server of a content distribution network. After the server 1202 acquires the output sound signal, the output sound signal may be transmitted to the first electronic device 1201, thereby reducing the processing load of the near-end electronic device, i.e., the first electronic device 1201. Meanwhile, after the server 1202 acquires the output sound signal, the output sound signal may also be sent to another electronic device that needs to acquire the output sound signal and communicate with the server 1202, such as the second electronic device 1203, so as to reduce the processing load of the remote electronic device.

The sound signal processing method in the embodiment of the present disclosure may be executed by the second electronic device 1203 communicating with the first electronic device 1201. After the second electronic device 1203 acquires the output sound signal, user D and user E, who are local to the second electronic device 1203, can hear in time the output sound corresponding to the output sound signal, that is, the output sound corresponding to the sound signals input by user A, user B, and user C.

According to an embodiment of the present disclosure, the processing parameters may be obtained locally at the first electronic device.

Fig. 13 shows a schematic diagram of obtaining the processing parameter according to an embodiment of the present disclosure. As shown in fig. 13, user A, user B, and user C may input sound signals through microphones, and a microphone generating the input sound signal may be connected to a first electronic device 1301. It should be understood that this example is illustrative only and does not limit the present disclosure; the number of users, microphones, and first electronic devices 1301, as well as the types and connection manners of the microphones and the first electronic device 1301, may be set according to actual needs, and the present disclosure does not specifically limit them.

According to an embodiment of the present disclosure, for example, after user A inputs a sound signal through a microphone, the first electronic device 1301 connected to the microphone may acquire the input sound signal. The first electronic device 1301 may determine a user ID corresponding to the input sound signal according to the input sound signal, for example, determine that the input sound signal originates from user A. The first electronic device 1301 may determine a processing parameter corresponding to the current input sound signal of user A according to the user ID. The first electronic device 1301 may process the current input sound signal of user A according to the processing parameter and acquire an output sound signal. User A, user B, and user C may hear the output sound corresponding to the output sound signal through an audio output device (speaker) of the first electronic device 1301.

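According to an embodiment of the present disclosure, obtaining the processing parameter may include transmitting the user ID to a first external device.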

Fig. 14 shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 14, the embodiment of the present disclosure will be described by taking a first external device as an example of a server 1402 communicating with a first electronic device 1401, and it should be understood that this example is only used as an example and is not a limitation to the present disclosure.

According to the embodiment of the present disclosure, after the first electronic device 1401 acquires the input sound signal, the user ID corresponding to the input sound signal may be determined according to the input sound signal. In order to reduce the processing load of the first electronic device 1401, a user ID may be transmitted to the server 1402 communicating with the first electronic device 1401, and the server 1402 may determine a processing parameter according to the user ID. The server 1402, after obtaining the processing parameters, may send the processing parameters to the first electronic device 1401. The first electronic device 1401, after acquiring the processing parameters, can process the input sound signal according to the processing parameters and acquire the output sound signal.
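The division of work in fig. 14 can be sketched with two in-process objects. This is a simplified illustration under stated assumptions: the class names `ParameterServer` and `FirstDevice`, the gain table, and the unity-gain fallback are hypothetical, and a direct method call stands in for the network round trip between the first electronic device 1401 and the server 1402.

```python
from typing import Dict, List, Sequence


class ParameterServer:
    """Stands in for server 1402; holds a hypothetical user-ID-to-gain table."""

    def __init__(self, table: Dict[str, float]) -> None:
        self._table = dict(table)

    def processing_parameter(self, user_id: str) -> float:
        # Assumed fallback: unity gain for an unknown user ID.
        return self._table.get(user_id, 1.0)


class FirstDevice:
    """Stands in for device 1401; it applies the gain locally."""

    def __init__(self, server: ParameterServer) -> None:
        self._server = server

    def process(self, user_id: str, samples: Sequence[float]) -> List[float]:
        # Only the user ID is sent; the samples never leave the device.
        gain = self._server.processing_parameter(user_id)
        return [gain * s for s in samples]


device = FirstDevice(ParameterServer({"A": 2.0}))
output = device.process("A", [0.1, -0.2])
```

In this arrangement the device keeps the audio samples local and only the user ID and the returned parameter cross the connection.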

Fig. 15A shows a schematic diagram of obtaining the processing parameter according to an embodiment of the present disclosure. As shown in fig. 15A, after the first electronic device 1501A acquires the input sound signal, in order to reduce the processing load of the first electronic device 1501A, the input sound signal may be transmitted to the server 1502A that communicates with the first electronic device 1501A. The server 1502A may determine a user ID from the input sound signal and determine processing parameters from the user ID. After acquiring the processing parameters, the server 1502A may process the input sound signal according to the processing parameters and acquire the output sound signal, so as to transmit the output sound signal to the electronic device that needs to acquire the output sound signal, for example, the first electronic device 1501A or other electronic devices that need to acquire the output sound signal.

Fig. 15B shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 15B, after the first electronic device 1501B acquires the input sound signal, it may determine a user ID from the input sound signal and determine a processing parameter corresponding to the input sound signal according to the user ID. To reduce a portion of the processing load of the first electronic device 1501B, the first electronic device 1501B may send the input sound signal and the processing parameter corresponding to the input sound signal to a server 1502B in communication with the first electronic device 1501B. After acquiring the input sound signal and the processing parameter, the server 1502B may process the input sound signal according to the processing parameter and acquire the output sound signal, and then transmit the output sound signal to an electronic device that needs it, for example, the first electronic device 1501B, thereby reducing the processing load of the first electronic device 1501B to some extent. Alternatively, the server 1502B may send the output sound signal to other electronic devices that need to acquire the output sound signal, thereby reducing the processing load of those electronic devices to some extent.

Fig. 16A shows a schematic diagram of obtaining the processing parameter according to an embodiment of the present disclosure. As shown in fig. 16A, after the first electronic device 1601A acquires the input sound signal, the input sound signal may be transmitted to the second electronic device 1603A in communication with the first electronic device 1601A. The second electronic device 1603A may determine a user ID from the input sound signal and determine a processing parameter corresponding to the input sound signal from the user ID. After acquiring the processing parameters, the second electronic device 1603A may process the input sound signal according to the processing parameters and acquire the output sound signal, so that a user local to the second electronic device 1603A may acquire the output sound corresponding to the output sound signal.

Fig. 16B shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 16B, the embodiment of the disclosure will be described by taking the second external device as the first electronic device 1601B as an example, and it should be understood that this example is used only as an example and is not a limitation to the disclosure.

According to the embodiment of the present disclosure, after the first electronic device 1601B acquires the input sound signal, it may determine a user ID from the input sound signal and determine a processing parameter corresponding to the input sound signal according to the user ID. The first electronic device 1601B may transmit the input audio signal and the processing parameter corresponding to the input audio signal to the second electronic device 1603B in communication with the first electronic device 1601B. After acquiring the input audio signal and the processing parameter, the second electronic device 1603B may process the input audio signal according to the processing parameter and acquire the output audio signal, so that a user local to the second electronic device 1603B may acquire the output audio corresponding to the output audio signal, and at the same time, the processing load of the first electronic device 1601B and the second electronic device 1603B may be reduced to some extent.

Fig. 16C shows a schematic diagram of obtaining the processing parameter according to an embodiment of the disclosure. As shown in fig. 16C, the embodiment of the present disclosure will be described by taking the second external device as the server 1602C, and it should be understood that this example is used only as an example and is not a limitation to the present disclosure.

According to an embodiment of the present disclosure, after the first electronic device 1601C acquires the input sound signal, the input sound signal may be transmitted to the second electronic device 1603C in communication with the first electronic device 1601C. After the first electronic device 1601C acquires the input sound signal, the user ID may be determined from the input sound signal, and may also be transmitted to the server 1602C in communication with the first electronic device 1601C. The server 1602C may determine a processing parameter corresponding to the input sound signal from the user ID. The server 1602C may send the processing parameter corresponding to the input audio signal to the second electronic device 1603C. After acquiring the input audio signal and the processing parameter, the second electronic device 1603C may process the input audio signal according to the processing parameter and acquire the output audio signal, so that a user local to the second electronic device 1603C may acquire the output audio corresponding to the output audio signal, and at the same time, the processing loads of the first electronic device 1601C, the second electronic device 1603C, and the server 1602C may be reduced to some extent.

According to the technical scheme provided by the embodiment of the disclosure, according to the needs of an application scenario, the first electronic device, the server or the second electronic device may be set as an execution subject for determining the processing parameter according to the user ID, so as to reduce the processing load of the electronic device to a certain extent and/or improve the real-time performance of the response of the electronic device, where the electronic device includes the first electronic device, the second electronic device or the server.
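The arrangements described for figs. 13 to 16C differ in where each of three steps runs. The table below restates them as data; it is a reading aid, and the type `Placement`, the role strings, and the helper `figures_processing_on` are hypothetical names introduced here for illustration.

```python
from typing import Dict, List, NamedTuple


class Placement(NamedTuple):
    identify_user: str        # where the user ID is determined
    determine_parameter: str  # where the processing parameter is determined
    process_signal: str       # where the input sound signal is processed


FIRST, SERVER, SECOND = "first_device", "server", "second_device"

# Keys are figure numbers from the description; role names are illustrative.
PLACEMENTS: Dict[str, Placement] = {
    "13": Placement(FIRST, FIRST, FIRST),
    "14": Placement(FIRST, SERVER, FIRST),
    "15A": Placement(SERVER, SERVER, SERVER),
    "15B": Placement(FIRST, FIRST, SERVER),
    "16A": Placement(SECOND, SECOND, SECOND),
    "16B": Placement(FIRST, FIRST, SECOND),
    "16C": Placement(FIRST, SERVER, SECOND),
}


def figures_processing_on(role: str) -> List[str]:
    """List the figures in which the given role applies the gain."""
    return [fig for fig, p in PLACEMENTS.items() if p.process_signal == role]
```

For instance, `figures_processing_on(SERVER)` selects the two arrangements in which the server produces the output sound signal.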

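According to an embodiment of the present disclosure, the processing parameter is determined according to a correspondence between the user ID and the processing parameter; or the processing parameter is determined according to a previous processing parameter for processing a previous input sound signal from the user to which the user ID corresponds.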

According to an embodiment of the present disclosure, when the first electronic device, the server, or the second electronic device is an execution subject that determines the processing parameter according to the user ID, the processing parameter may be determined according to a correspondence between the user ID and the processing parameter, or the processing parameter may be determined according to a previous processing parameter for processing a previous input sound signal from a user to which the user ID corresponds. The specific determination method of the processing parameters is described in detail above, and is not described herein again.

Fig. 17 shows a block diagram of a structure of a sound signal processing apparatus according to an embodiment of the present disclosure. The apparatus may be implemented as part or all of an electronic device through software, hardware, or a combination of both. As shown in fig. 17, the sound signal processing apparatus 1700 includes a first obtaining module 1710, a second obtaining module 1720, and a third obtaining module 1730.

The first obtaining module 1710 is configured to obtain an input sound signal;

the second obtaining module 1720 is configured to obtain a processing parameter determined according to sound source information of the input sound signal;

the third obtaining module 1730 is configured to process the input sound signal according to the processing parameter, and obtain an output sound signal.
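The three modules of apparatus 1700 can be pictured as a short pipeline. The sketch below is one possible software arrangement and is not prescribed by the disclosure: the class `SoundSignalProcessor` and its field names are hypothetical, the processing parameter is assumed to be a single linear gain, and the determination of the sound source information is assumed to happen inside the `acquire_parameter` callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class SoundSignalProcessor:
    # Role of the first obtaining module 1710: supply the input signal.
    acquire_input: Callable[[], Sequence[float]]
    # Role of the second obtaining module 1720: return the processing
    # parameter determined from the signal's sound source information.
    acquire_parameter: Callable[[Sequence[float]], float]

    def run(self) -> List[float]:
        """Role of the third obtaining module 1730: apply the parameter."""
        signal = self.acquire_input()
        gain = self.acquire_parameter(signal)
        return [gain * x for x in signal]


# Usage with fixed stand-ins for the microphone and the parameter lookup.
processor = SoundSignalProcessor(
    acquire_input=lambda: [0.5, -0.25],
    acquire_parameter=lambda signal: 2.0,
)
result = processor.run()
```

Passing the first two roles in as callables keeps the sketch independent of whether they run on the first electronic device, a server, or a second electronic device.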

According to an embodiment of the present disclosure, the acquiring a processing parameter determined according to sound source information of the input sound signal includes:

According to an embodiment of the present disclosure, further comprising:

a first determining module 1740 configured to determine the processing parameter according to a preset correspondence between the sound source information and the processing parameter.

According to an embodiment of the present disclosure, further comprising:

a second determining module 1750, configured to determine, according to the sound source information, a previous processing parameter for processing a previous input sound signal from a sound source corresponding to the sound source information;

a third determining module 1760 configured to determine the processing parameter from the previous processing parameter.

According to an embodiment of the present disclosure, the apparatus is implemented by a first electronic device comprising or connected to a microphone that generates the input sound signal; or

According to an embodiment of the present disclosure, when the apparatus is implemented by a first electronic device including or connected to a microphone generating the input sound signal, the obtaining the processing parameter includes:

the processing parameters are obtained locally at the first electronic device.

sending the sound source information to a first external device;

According to an embodiment of the present disclosure, when the apparatus is implemented by a server in communication with the first electronic device, the acquiring the processing parameter includes:

According to an embodiment of the present disclosure, when the apparatus is implemented by a second electronic device in communication with the first electronic device, the acquiring the processing parameter includes:

According to an embodiment of the present disclosure, the processing parameter includes a gain factor.

Fig. 18 shows a block diagram of a structure of a sound signal processing apparatus according to an embodiment of the present disclosure. The apparatus may be implemented as part or all of an electronic device through software, hardware, or a combination of both. As shown in fig. 18, the sound signal processing apparatus 1800 includes a fourth obtaining module 1810, a fifth obtaining module 1820 and a sixth obtaining module 1830.

The fourth obtaining module 1810 is configured to obtain a first sound signal and client information corresponding to the first sound signal;

the fifth obtaining module 1820 is configured to obtain processing parameters according to the client information;

the sixth obtaining module 1830 is configured to process the first sound signal according to the processing parameter to obtain a second sound signal.

According to an embodiment of the present disclosure, the obtaining a processing parameter according to the client information includes:

According to an embodiment of the present disclosure, further comprising:

a fourth determining module 1840 configured to determine the processing parameter according to a preset correspondence between the client information and the processing parameter.

According to an embodiment of the present disclosure, further comprising:

a fifth determining module 1850 configured to determine, according to the client information, previous processing parameters for processing a previous first sound signal from a client to which the client information corresponds;

a sixth determining module 1860 configured to determine the processing parameters from the previous processing parameters.

According to an embodiment of the present disclosure, the apparatus is implemented by a server in communication with the client.

Fig. 19 shows a block diagram of a structure of a sound signal processing apparatus according to an embodiment of the present disclosure. The apparatus may be implemented as part or all of an electronic device through software, hardware, or a combination of both. As shown in fig. 19, the sound signal processing apparatus 1900 includes a seventh obtaining module 1910, an eighth obtaining module 1920, a ninth obtaining module 1930 and a tenth obtaining module 1940.

The seventh obtaining module 1910 is configured to obtain an input sound signal;

the eighth obtaining module 1920 is configured to obtain a user ID corresponding to the input sound signal;

the ninth obtaining module 1930 is configured to obtain processing parameters according to the user ID;

the tenth obtaining module 1940 is configured to process the input sound signal according to the processing parameters, resulting in an output sound signal.

According to an embodiment of the present disclosure, the acquiring a processing parameter according to the user ID includes:

According to an embodiment of the present disclosure, further comprising:

a seventh determining module 1950 configured to determine the processing parameters according to a preset correspondence between the user ID and the processing parameters.

According to an embodiment of the present disclosure, further comprising:

an eighth determining module 1960 configured to determine previous processing parameters for processing a previous input sound signal from a user corresponding to the user ID according to the user ID;

a ninth determining module 1970 configured to determine the processing parameter from the previous processing parameter.

According to an embodiment of the present disclosure, when the apparatus is implemented by a first electronic device including or connected to a microphone that generates the input sound signal, the obtaining the processing parameter includes:

the processing parameters are obtained locally at the first electronic device.

transmitting the user ID to a first external device;

According to an embodiment of the present disclosure, when the apparatus is implemented by a server in communication with the first electronic device, the obtaining the processing parameter includes:

According to an embodiment of the present disclosure, when the apparatus is implemented by a second electronic device in communication with the first electronic device, the obtaining the processing parameter includes:

According to an embodiment of the present disclosure, the processing parameter includes a gain factor; and/or

The present disclosure also discloses an electronic device, and fig. 20 shows a block diagram of the electronic device according to an embodiment of the present disclosure.

As shown in fig. 20, the electronic device 2000 includes a memory 2001 and a processor 2002, wherein

the memory 2001 is used to store one or more computer instructions, which are executed by the processor 2002 to implement a method according to embodiments of the present disclosure.

As shown in fig. 21, the computer system 2100 includes a processing unit 2101, which can execute various processes in the above-described embodiments according to a program stored in a Read Only Memory (ROM) 2102 or a program loaded from a storage portion 2108 into a Random Access Memory (RAM) 2103. In the RAM 2103, various programs and data necessary for the operation of the system 2100 are also stored. The processing unit 2101, ROM 2102 and RAM 2103 are connected to each other via a bus 2104. An input/output (I/O) interface 2105 is also connected to bus 2104.

The following components are connected to the I/O interface 2105: an input portion 2106 including a keyboard, a mouse, and the like; an output portion 2107 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker and the like; a storage portion 2108 including a hard disk and the like; and a communication section 2109 including a network interface card such as a LAN card, a modem, or the like. The communication section 2109 performs communication processing via a network such as the internet. A drive 2110 is also connected to the I/O interface 2105 as necessary. A removable medium 2111 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 2110 as necessary, so that a computer program read out therefrom is installed in the storage portion 2108 as necessary. The processing unit 2101 may be implemented as a CPU, a GPU, a TPU, an FPGA, an NPU, or other processing units.

In particular, the above-described methods may be implemented as computer software programs according to embodiments of the present disclosure. For example, embodiments of the present disclosure include a computer program product comprising a computer program tangibly embodied on a computer-readable medium, the computer program comprising program code for performing the above-described method. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 2109, and/or installed from the removable medium 2111.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units or modules described in the embodiments of the present disclosure may be implemented by software or by programmable hardware. The units or modules described may also be provided in a processor, and the names of the units or modules do not in some cases constitute a limitation of the units or modules themselves.

As another aspect, the present disclosure also provides a computer-readable storage medium, which may be a computer-readable storage medium included in the electronic device or the computer system in the above embodiments; or it may be a separate computer readable storage medium not incorporated into the device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the methods described in the present disclosure.

The foregoing description is only exemplary of the preferred embodiments of the disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the invention in the present disclosure is not limited to the specific combination of the above-mentioned features, but also encompasses other embodiments in which any combination of the above-mentioned features or their equivalents is possible without departing from the inventive concept. For example, a technical solution may be formed by replacing the above features with features having similar functions, including but not limited to those disclosed in this disclosure.

Claims

1. A sound signal processing method, comprising:

acquiring an input sound signal;

2. The method according to claim 1, wherein the obtaining of the processing parameter determined according to the sound source information of the input sound signal comprises:

3. The method according to claim 1, wherein the obtaining of the processing parameter determined according to the sound source information of the input sound signal comprises:

4. The method of claim 1, wherein:

the sound source information includes at least one of: voiceprint information of the input sound signal, orientation information of a sound source of the input sound signal, identification information of a microphone generating the input sound signal, a geographical location of the input sound signal, and environment information of the input sound signal.

5. The method of claim 1, further comprising:

6. The method of claim 1, further comprising:

determining the processing parameter from the previous processing parameter.

7. The method of claim 1, wherein:

the method is performed by a first electronic device comprising or connected to a microphone that generates the input sound signal; or

8. The method of claim 7, wherein when the method is performed by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the obtaining the processing parameters comprises:

the processing parameters are obtained locally at the first electronic device.

9. The method of claim 7, wherein when the method is performed by a first electronic device that includes or is connected to a microphone that generates the input sound signal, the obtaining the processing parameters comprises:

sending the sound source information to a first external device;

10. The method of claim 9, wherein the first external device is a server in communication with the first electronic device.

11. The method of claim 7, wherein when the method is performed by a server in communication with the first electronic device, the obtaining the processing parameter comprises:

12. The method of claim 7, wherein when the method is performed by a second electronic device in communication with the first electronic device, the obtaining the processing parameter comprises:

13. The method of claim 12, wherein the second external device is the first electronic device or the server.

14. The method of claim 7, wherein:

the processing parameters are determined according to the corresponding relation between the sound source information and the processing parameters; or

15. The method of claim 1, wherein the processing parameter comprises a gain factor.

16. A sound signal processing method, comprising:

acquiring processing parameters according to the client information;

17. The method of claim 16, wherein obtaining processing parameters according to the client information comprises:

18. The method of claim 16, wherein obtaining processing parameters according to the client information comprises:

19. The method of claim 16, further comprising:

20. The method of claim 16, further comprising:

determining the processing parameter from the previous processing parameter.

21. The method of claim 16, wherein the method is performed by a server in communication with the client.

22. The method of claim 16, wherein:

the processing parameters include gain coefficients; and/or

23. A sound signal processing method, comprising:

acquiring an input sound signal;

acquiring a user ID corresponding to the input sound signal;

acquiring a processing parameter according to the user ID;

24. The method of claim 23, wherein obtaining the user ID corresponding to the input sound signal comprises:

25. The method of claim 23, wherein obtaining processing parameters based on the user ID comprises:

26. The method of claim 23, wherein obtaining processing parameters based on the user ID comprises:

27. The method of claim 23, further comprising:

28. The method of claim 23, further comprising:

determining the processing parameter from the previous processing parameter.

29. The method of claim 23, wherein:

30. The method of claim 29, wherein when the method is performed by a first electronic device that includes or is coupled to a microphone that generates the input sound signal, the obtaining the processing parameters comprises:

the processing parameters are obtained locally at the first electronic device.

31. The method of claim 29, wherein when the method is performed by a first electronic device that includes or is coupled to a microphone that generates the input sound signal, the obtaining the processing parameters comprises:

transmitting the user ID to a first external device;

32. The method of claim 31, wherein the first external device is a server in communication with the first electronic device.

33. The method of claim 29, wherein when the method is performed by a server in communication with the first electronic device, the obtaining the processing parameter comprises:

34. The method of claim 29, wherein when the method is performed by a second electronic device in communication with the first electronic device, the obtaining the processing parameter comprises:

35. The method of claim 34, wherein the second external device is the first electronic device or the server.

36. The method of claim 22, wherein:

the processing parameters are determined according to the corresponding relation between the user ID and the processing parameters; or

37. The method of claim 23, wherein the processing parameters comprise gain factors.

38. A sound signal processing apparatus, comprising:

a first acquisition module configured to acquire an input sound signal;

39. A sound signal processing apparatus, comprising:

40. A sound signal processing apparatus, comprising:

a seventh acquisition module configured to acquire an input sound signal;

41. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer instructions, and the one or more computer instructions are executed by the processor to implement the method steps of any one of claims 1-37.

42. A readable storage medium having computer instructions stored thereon, wherein the computer instructions, when executed by a processor, implement the method steps of any one of claims 1-37.
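
For illustration only, the flow recited in the method claims (acquire an input sound signal, obtain a processing parameter such as a gain factor from a correspondence keyed by sound source information, process the signal, then determine the next parameter from the previous one) might be sketched as follows. This is a minimal sketch and not the claimed implementation. The names `GainStore`, `process_frame`, `DEFAULT_GAIN`, `TARGET_PEAK` and `SMOOTHING` are hypothetical, and the peak-based smoothing rule used to update the gain is an assumption that does not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# All constants below are assumptions chosen for illustration.
DEFAULT_GAIN = 1.0   # gain used for a sound source that has no stored entry
TARGET_PEAK = 0.5    # desired peak amplitude of the output signal
SMOOTHING = 0.5      # weight given to the previous gain when updating


@dataclass
class GainStore:
    """Hypothetical correspondence between a sound-source key and a gain factor."""

    gains: Dict[str, float] = field(default_factory=dict)

    def lookup(self, source_key: str) -> float:
        # Return the stored gain, or the default for an unseen source.
        return self.gains.get(source_key, DEFAULT_GAIN)

    def update(self, source_key: str, frame: List[float]) -> float:
        # Derive the new gain from the previous gain and the frame's peak.
        previous = self.lookup(source_key)
        peak = max((abs(s) for s in frame), default=0.0)
        if peak == 0.0:
            return previous  # silent or empty frame: keep the previous gain
        ideal = TARGET_PEAK / peak
        new_gain = SMOOTHING * previous + (1.0 - SMOOTHING) * ideal
        self.gains[source_key] = new_gain
        return new_gain


def process_frame(store: GainStore, source_key: str, frame: List[float]) -> List[float]:
    """Apply the gain stored for this source, then refresh the stored gain."""
    gain = store.lookup(source_key)
    output = [gain * sample for sample in frame]
    store.update(source_key, frame)
    return output
```

Because the gain is stored per source key (which could stand for a voiceprint, a microphone identifier, or a user ID), switching between speakers starts each one from that speaker's last stored gain rather than re-converging from the previous speaker's level.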