CN107172256B - Earphone call self-adaptive adjustment method and device, mobile terminal and storage medium

Info

Publication number: CN107172256B
Application number: CN201710623664.7A
Authority: CN
Inventors: 杨宗业
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2017-07-27
Filing date: 2017-07-27
Publication date: 2020-05-05
Anticipated expiration: 2037-07-27
Also published as: CN107172256A

Abstract

The embodiment of the invention provides an earphone call self-adaptive adjustment method and device, a mobile terminal and a storage medium. The method comprises: when the mobile terminal is in an earphone call mode, collecting voice signals in the environment; analyzing the voice signals and matching the voiceprint features obtained by analysis against a preset voiceprint feature recognition library to obtain target voiceprint features; determining the voice amplitude of the sound to which the target voiceprint features belong in the voice signals; and adjusting the loudness value and the frequency value of that sound according to the voice amplitude. Compared with AGC self-adaptive gain adjustment, this increases the call volume while effectively avoiding amplification of environmental noise, and so improves call quality.

Description

Earphone call self-adaptive adjustment method and device, mobile terminal and storage medium

Technical Field

The invention relates to the technical field of mobile terminals, in particular to an earphone call self-adaptive adjusting method, an earphone call self-adaptive adjusting device, a mobile terminal and a storage medium.

Background

At present, mobile phones are in very common use, and people are increasingly accustomed to using earphones for voice calls. Under normal conditions, people let the earphone microphone hang down naturally during a call, but the call volume is low when a call is made in this way.

Disclosure of Invention

The embodiment of the invention provides an earphone call self-adaptive adjustment method and device, a mobile terminal and a storage medium, which can solve the following problem in the prior art: when the call volume is increased by AGC self-adaptive gain adjustment, the whole voice signal is amplified, so the environmental noise in the voice signal is inevitably amplified as well and the call quality is reduced.

In order to achieve the above object, a first aspect of the embodiments of the present invention provides a method for adaptively adjusting an earphone call, including:

collecting voice signals in the environment when the mobile terminal is in an earphone call mode;

analyzing the voice signal, and matching the voiceprint features obtained by analysis against a preset voiceprint feature recognition library to obtain target voiceprint features;

determining the voice amplitude of the sound of the target voiceprint feature in the voice signal;

and adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude.

In order to achieve the above object, a second aspect of the embodiments of the present invention provides an adaptive earphone call adjustment device, including:

the acquisition module is used for collecting voice signals in the environment when the mobile terminal is in an earphone call mode;

the analysis matching module is used for analyzing the voice signal and matching the voiceprint features obtained by analysis against a preset voiceprint feature recognition library to obtain target voiceprint features;

the determining module is used for determining the voice amplitude of the sound of the target voiceprint feature in the voice signal;

and the adjusting module is used for adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude.

In order to achieve the above object, a third aspect of embodiments of the present invention provides a mobile terminal, including: a memory, a processor and a computer program stored in the memory and operable on the processor, wherein the processor implements the steps of the adaptive earphone call adjustment method according to the first aspect when executing the computer program.

In order to achieve the above object, a fourth aspect of the embodiments of the present invention provides a storage medium, which is a computer-readable storage medium having a computer program stored thereon, where the computer program is executed by a processor to implement the steps in the adaptive earphone-call adjustment method according to the first aspect.

The embodiment of the invention provides an earphone call self-adaptive adjustment method and device, a mobile terminal and a storage medium. The method comprises: when the mobile terminal is in an earphone call mode, collecting voice signals in the environment; analyzing the voice signals and matching the voiceprint features obtained by analysis against a preset voiceprint feature recognition library to obtain target voiceprint features; determining the voice amplitude of the sound to which the target voiceprint features belong in the voice signals; and adjusting the loudness value and the frequency value of that sound according to the voice amplitude. Compared with the prior art, in the earphone call mode only the sound to which the matched target voiceprint features belong has its loudness value and frequency value adjusted. Compared with AGC self-adaptive gain adjustment, this increases the call volume while effectively avoiding amplification of environmental noise, and so improves call quality.

Drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a block diagram of a mobile terminal;

Fig. 2 is a schematic flowchart of a method for adaptively adjusting an earphone call according to a first embodiment of the present invention;

Fig. 3 is a schematic flowchart of the refinement steps of step 202 in the first embodiment of the present invention;

Fig. 4 is a schematic flowchart of the refinement steps of step 204 in the first embodiment of the present invention;

Fig. 5 is a schematic flowchart of a method for adaptively adjusting an earphone call according to a second embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an adaptive earphone call adjustment device according to a third embodiment of the present invention;

Fig. 7 is a schematic diagram of the detailed structure of the analysis matching module 602 according to the third embodiment of the present invention;

Fig. 8 is a schematic diagram of the detailed structure of the adjusting module 604 according to the third embodiment of the present invention;

Fig. 9 is a schematic structural diagram of an adaptive earphone call adjustment device according to a fourth embodiment of the present invention.

Detailed Description

In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Fig. 1 shows a block diagram of a mobile terminal. The adaptive earphone call adjustment method provided by the embodiment of the present invention may be applied to the mobile terminal 10 shown in Fig. 1, where the mobile terminal 10 may include, but is not limited to, a smart phone, a notebook, a tablet computer, a wearable smart device and other devices that rely on a battery to maintain normal operation and that support network and download functions.

As shown in fig. 1, the mobile terminal 10 includes a memory 101, a memory controller 102, one or more processors 103 (only one shown), a peripheral interface 104, a radio frequency module 105, a key module 106, an audio module 107, and a touch screen 108. These components communicate with each other via one or more communication buses/signal lines 109.

It is to be understood that the structure shown in fig. 1 is only an illustration and does not limit the structure of the mobile terminal. The mobile terminal 10 may also include more or fewer components than shown in FIG. 1, or may have a different configuration than shown in FIG. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.

The memory 101 may be configured to store software programs and modules, such as program instructions/modules corresponding to the method and apparatus for adaptive adjustment of a headset call in the embodiment of the present invention, and the processor 103 executes various functional applications and data processing by running the software programs and modules stored in the memory 101, so as to implement the method and apparatus for adaptive adjustment of a headset call.

Memory 101 may include high speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 101 may further include memory located remotely from the processor 103, which may be connected to the mobile terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof. Access to the memory 101 by the processor 103 and possibly other components may be under the control of the memory controller 102.

The peripheral interface 104 couples various input/output devices to the CPU and to the memory 101. The processor 103 executes various software, instructions within the memory 101 to perform various functions of the mobile terminal 10 and to perform data processing.

In some embodiments, the peripheral interface 104, the processor 103, and the memory controller 102 may be implemented in a single chip. In other examples, they may each be implemented on a separate chip.

The rf module 105 is used for receiving and transmitting electromagnetic waves and for converting between electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices. The rf module 105 may include various existing circuit elements for performing these functions, such as an antenna, an rf transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and so forth. The rf module 105 may communicate with various networks such as the internet, an intranet, or a preset type of wireless network, or may communicate with other devices through a preset type of wireless network. The preset types of wireless networks described above may include cellular telephone networks, wireless local area networks, or metropolitan area networks. The preset type of wireless network may use various communication standards, protocols and technologies, including but not limited to Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), Wideband Code Division Multiple Access (W-CDMA), Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Bluetooth, Wireless Fidelity (WiFi) (e.g., IEEE 802.11a, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), Voice over Internet Protocol (VoIP), Worldwide Interoperability for Microwave Access (WiMAX), other protocols for email, instant messaging, and short messaging, and any other suitable communication protocol.

The key module 106 provides an interface for user input to the mobile terminal, and the user may cause the mobile terminal 10 to perform different functions by pressing different keys.

Audio module 107 provides an audio interface to a user that may include one or more microphones, one or more speakers, and audio circuitry. The audio circuitry receives audio data from the peripheral interface 104, converts the audio data to electrical information, and transmits the electrical information to the speaker. The speaker converts the electrical information into sound waves that the human ear can hear. The audio circuitry also receives electrical information from the microphone, converts the electrical information to voice data, and transmits the voice data to the peripheral interface 104 for further processing. The audio data may be retrieved from the memory 101 or through the radio frequency module 105. In addition, the audio data may also be stored in the memory 101 or transmitted through the radio frequency module 105. In some examples, audio module 107 may also include a headphone jack for providing an audio interface to headphones or other devices.

The touch screen 108 provides both an output and an input interface between the mobile terminal and the user. In particular, the touch screen 108 displays video output to the user, the content of which may include text, graphics, video, and any combination thereof. Some of the output results are for some of the user interface objects. The touch screen 108 also receives user inputs, such as user clicks, swipes, and other gesture operations, for the user interface objects to respond to these user inputs. The technique of detecting user input may be based on resistive, capacitive, or any other possible touch detection technique. Specific examples of touch screen 108 display units include, but are not limited to, liquid crystal displays or light emitting polymer displays.

The adaptive earphone call adjusting method in the embodiment of the invention is described based on the mobile terminal.

In the prior art, the whole voice signal is amplified while the call volume is increased by means of AGC self-adaptive gain adjustment, so that the environmental noise in the voice signal is amplified, and the call quality is reduced.

In order to solve the above problems, the present invention provides an earphone call adaptive adjustment method: in an earphone call mode, a preset voiceprint feature recognition library is used to match a target voiceprint feature in the collected voice signal, and the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are adjusted.

Referring to fig. 2, a flow chart of a method for adaptively adjusting a headset call according to a first embodiment of the present invention is shown, where the method includes:

step 201, collecting voice signals in an environment when the mobile terminal is in an earphone communication mode;

in the embodiment of the present invention, the above-mentioned earphone call adaptive adjustment method is implemented by an earphone call adaptive adjustment device (hereinafter, referred to as an adjustment device), which is a program module, stored in a computer-readable storage medium of a mobile terminal, and executable by a processor.

In the process of communication, if the mobile terminal is in an earphone communication mode, a microphone on the mobile terminal will collect a voice signal in the environment, and the adjusting device will acquire the voice signal collected by the microphone in real time.

If the mobile terminal is in an earphone communication mode, two earphone communication modes are available, wherein the first mode is to make a microphone of the earphone close to the mouth to perform communication, and the second mode is to make the microphone normally droop to perform communication.

Step 202, analyzing the voice signal, and matching the voiceprint features obtained by analysis against a preset voiceprint feature recognition library to obtain target voiceprint features;

In the embodiment of the invention, a voiceprint is the sound wave spectrum, carrying speech information, that can be displayed by an electronic instrument. Producing human speech involves a complex biophysical process between the language center of the brain and the vocal organs. The vocal organs used when speaking include the tongue, larynx, lungs, nasal cavity, etc., and because these organs differ in size and shape from person to person, voiceprint patterns also differ from person to person. Voiceprint features are the feature parameters of a voiceprint, so different voiceprint features can reliably distinguish different sounds.

Each mobile terminal is provided with at least one preset voiceprint feature recognition library, and the preset voiceprint feature recognition library stores the voiceprint features of the user of the mobile terminal.

In the embodiment of the invention, the collected voice signal is analyzed, the voiceprint features obtained by analysis are matched against the preset voiceprint feature recognition library to identify which of them are the voiceprint features of the current caller, and the identified voiceprint features are used as the target voiceprint features.

Step 203, determining the voice amplitude of the sound of the target voiceprint feature in the voice signal;

Step 204, adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude.

In the embodiment of the present invention, the adjusting device determines the speech amplitude of the sound of the target voiceprint feature in the speech signal, wherein the sound of the target voiceprint feature is the sound of the caller, and the speech amplitude is the average value of the amplitudes in the sound wave formed by the sound of the caller or the minimum value of the amplitudes.
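The amplitude measure described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the text only states that the voice amplitude is the average or the minimum of the amplitudes in the caller's sound wave, so taking one peak amplitude per fixed-length frame, and the names `frame_peaks`, `speech_amplitude` and `frame_len`, are assumptions made here.

```python
def frame_peaks(samples, frame_len):
    """Peak absolute amplitude of each consecutive frame of `frame_len` samples.

    Splitting into frames is an assumption; the source does not say how
    individual amplitudes are extracted from the sound wave.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    return [
        max(abs(s) for s in samples[i:i + frame_len])
        for i in range(0, len(samples), frame_len)
    ]


def speech_amplitude(samples, frame_len=4, mode="mean"):
    """Voice amplitude of the caller's sound: mean or minimum of the frame peaks."""
    peaks = frame_peaks(samples, frame_len)
    if not peaks:
        raise ValueError("no samples to measure")
    if mode == "mean":
        return sum(peaks) / len(peaks)
    if mode == "min":
        return min(peaks)
    raise ValueError("mode must be 'mean' or 'min'")


# Hypothetical samples of the sound already attributed to the caller.
caller = [0, 3, -4, 1, 0, -2, 2, 1]
mean_amplitude = speech_amplitude(caller, frame_len=4, mode="mean")
min_amplitude = speech_amplitude(caller, frame_len=4, mode="min")
```

With a frame length of 4 the two frames have peaks 4 and 2, so the mean variant yields 3.0 and the minimum variant yields 2.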

The adjusting device adjusts the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude.

Wherein, the loudness value is used for measuring the volume, and the frequency value is used for measuring the definition of the sound.

It should be noted that after the adjustment of the voice signal is completed, the voice signal may be sent to the mobile terminal used by the party at the other end of the call, so that this party can hear speech that is clear and of suitable volume.

In the embodiment of the invention, when the mobile terminal is in an earphone communication mode, the voice signal in the environment is collected, the voice signal is analyzed, the target voiceprint characteristic is obtained by matching from the voiceprint characteristics obtained by analysis through a preset voiceprint characteristic recognition library, the voice amplitude of the sound of the target voiceprint characteristic in the voice signal is determined, and the loudness value and the frequency value of the sound of the target voiceprint characteristic in the voice signal are adjusted according to the voice amplitude. Compared with the prior art, in the earphone call mode, the target voiceprint characteristics are matched through the preset voiceprint characteristic identification library aiming at the collected voice signals, the loudness value and the frequency value of the sound to which the target voiceprint characteristics belong are adjusted, and compared with an AGC self-adaptive gain adjustment mode, the problem of amplification of environmental noise can be effectively avoided while the call volume is increased, and the call quality is improved.

Referring to fig. 3, a flowchart of a detailed step of step 202 in the first embodiment of the present invention is shown, which includes:

step 301, analyzing the voice signal to obtain voiceprint characteristics of each sound of different sources in the voice signal;

In the embodiment of the present invention, the different sources of sound may be a speaker, a television, an animal, a machine, or any other person, thing or device capable of producing sound.

Step 302, searching the voiceprint feature recognition library, and judging whether voiceprint features matched with the voiceprint features in the voiceprint feature recognition library exist in the voiceprint features of the voices;

in the embodiment of the present invention, a voiceprint feature recognition library is preset in the mobile terminal, and includes voiceprint features of one or more users, and the specific setting mode may be: the user enters a setting interface of the mobile terminal through clicking operation, and selects a voiceprint setting function, so that a display interface of the mobile terminal displays a start button of voiceprint setting, the user speaks any content after clicking the button, or pronounces the content displayed on the display interface, a microphone on the mobile terminal collects the content spoken by the user, analyzes the voiceprint characteristics, judges whether the voiceprint characteristics obtained through analysis meet requirements, stores the voiceprint characteristics into a voiceprint characteristic library if the voiceprint characteristics meet the requirements, so as to complete the setting of the voiceprint characteristics, and displays a prompt message to prompt the user to reset the voiceprint setting if the voiceprint characteristics do not meet the requirements. In this way, the setting of the voiceprint characteristics of one or more users on one mobile terminal can be realized.

Step 303, if the matched voiceprint features exist, determining the matched voiceprint features as the target voiceprint features;

In the embodiment of the invention, the voiceprint features obtained by analysis are matched against the preset voiceprint feature recognition library to identify which of them is the voiceprint feature of the current caller, and the identified voiceprint feature is used as the target voiceprint feature.

In the embodiment of the invention, when the mobile terminal is in an earphone communication mode, a voice signal in an environment is collected, the voice signal is analyzed, voiceprint features of different sounds in the voice signal from different sources are obtained, a voiceprint feature recognition library is searched, whether voiceprint features matched with the voiceprint features in the voiceprint feature recognition library exist in the voiceprint features of the sounds is judged, if the matched voiceprint features exist, the matched voiceprint features are determined as target voiceprint features, the voice amplitude of the sound belonging to the target voiceprint features in the voice signal is determined, and the loudness value and the frequency value of the sound belonging to the target voiceprint features in the voice signal are adjusted according to the voice amplitude. Compared with the prior art, in the earphone call mode, the target voiceprint characteristics are matched through the preset voiceprint characteristic identification library aiming at the collected voice signals, the loudness value and the frequency value of the sound to which the target voiceprint characteristics belong are adjusted, and compared with an AGC self-adaptive gain adjustment mode, the problem of amplification of environmental noise can be effectively avoided while the call volume is increased, and the call quality is improved.
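Steps 301 to 303 can be illustrated with a short sketch. The source does not say how voiceprint features are represented or compared, so the fixed-length feature vectors, the cosine similarity measure, the `min_similarity` cut-off and all function names below are assumptions chosen only to make the matching step concrete.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_target_source(source_features, library, min_similarity=0.9):
    """Return the id of the source whose voiceprint best matches the library.

    source_features: dict mapping a source id to its feature vector (step 301).
    library: list of enrolled feature vectors (the preset recognition library).
    Returns None when no source reaches `min_similarity` (no match in step 302).
    """
    best_id, best_score = None, min_similarity
    for source_id, feature in source_features.items():
        for enrolled in library:
            score = cosine_similarity(feature, enrolled)
            if score >= best_score:
                best_id, best_score = source_id, score
    return best_id


# Hypothetical data: one enrolled user, two sound sources in the environment.
library = [[1.0, 0.0, 0.2]]
sources = {"caller": [0.9, 0.1, 0.2], "television": [0.0, 1.0, 0.0]}
target = find_target_source(sources, library)
no_match = find_target_source({"television": [0.0, 1.0, 0.0]}, library)
```

Here the caller's vector is nearly parallel to the enrolled one, so it is selected as the target, while a signal containing only the television yields no target.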

Referring to fig. 4, a flowchart of a detailed step of step 204 in the first embodiment of the present invention is shown, which includes:

step 401, searching a preset parameter adjustment table, and determining a target loudness value and a target frequency value corresponding to the voice amplitude, wherein the parameter adjustment table includes a mapping relationship between the voice amplitude, the loudness value and the frequency value;

step 402, judging whether the target loudness value is smaller than or equal to a preset threshold value;

In the embodiment of the present invention, a parameter adjustment table is preset in the mobile terminal. The table contains the mapping relationship between voice amplitude, loudness value and frequency value, and is built from a preset standard parameter. For example, for a voice amplitude of 10 the corresponding loudness value and frequency value are both 40, and for a voice amplitude of 30 the corresponding loudness value and frequency value are both 70; the loudness value in the parameter adjustment table needs to be 40 greater than the voice amplitude, where 40 is the standard parameter.

The preset threshold is used to prevent sound breaking (audible distortion) caused by an excessive increase of the loudness value.

Step 403, if the target loudness is less than or equal to the preset threshold, adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the target loudness value and the target frequency value, respectively.

In the embodiment of the invention, the preset threshold is a loudness limit value: if the target loudness value exceeds the preset threshold, adjusting the loudness value of the sound to which the target voiceprint feature belongs to the target loudness value would cause sound breaking.

In the embodiment of the invention, when the local speaker holds the earphone microphone close to the mouth and the target loudness value is less than or equal to the preset threshold, adjusting the loudness value of the sound to which the target voiceprint feature belongs to the target loudness value does not cause sound breaking when the opposite-end party receives the voice signal. Therefore, in this case the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are adjusted to the target loudness value and the target frequency value, respectively. For example, if the preset threshold is 100 and the voice amplitude is 10, the corresponding loudness value and frequency value are both 40; since 40 is less than 100, the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are each adjusted to 40.

Further, step 404 follows step 402 as the alternative branch to step 403, specifically:

step 404, if the target loudness is greater than the preset threshold, adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the preset threshold and the preset frequency value corresponding to the preset threshold, respectively.

In the embodiment of the invention, when the local speaker holds the earphone microphone close to the mouth and the target loudness value is greater than the preset threshold, adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the target loudness value and the target frequency value would cause sound breaking when the opposite-end party receives the voice signal. Therefore, the loudness value and the frequency value are instead adjusted to the preset threshold and the preset frequency value corresponding to the preset threshold, respectively. For example, if the preset threshold is 100 and the voice amplitude is 70, the corresponding loudness value and frequency value are both 110; since 110 is greater than 100, the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are each adjusted to 100. By comparing the target loudness value with the preset threshold, the loudness value and the frequency value of the sound to which the target voiceprint feature belongs can be adjusted accurately: they are enhanced so that the call volume increases, while sound breaking is prevented when the receiving party listens to the voice signal.
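Steps 401 to 404 amount to a table lookup followed by a threshold check, which can be sketched as below. The table entries, the threshold of 100 and the frequency value of 100 paired with it are taken from the worked examples in the text; choosing the nearest table entry for amplitudes that are not listed, and the names `PARAMETER_TABLE`, `target_parameters` and `adjust`, are assumptions added for illustration.

```python
# voice amplitude -> (loudness value, frequency value), from the examples in the text
PARAMETER_TABLE = {10: (40, 40), 30: (70, 70), 70: (110, 110)}

LOUDNESS_THRESHOLD = 100    # preset threshold (loudness limit value)
THRESHOLD_FREQUENCY = 100   # preset frequency value corresponding to the threshold


def target_parameters(amplitude, table=PARAMETER_TABLE):
    """Step 401: look up the target loudness and frequency for a voice amplitude.

    Using the nearest listed amplitude is an assumption; the source only
    describes a mapping table.
    """
    nearest = min(table, key=lambda listed: abs(listed - amplitude))
    return table[nearest]


def adjust(amplitude):
    """Steps 402 to 404: return the (loudness, frequency) actually applied."""
    loudness, frequency = target_parameters(amplitude)
    if loudness <= LOUDNESS_THRESHOLD:
        # Step 403: the target values are safe to apply as they are.
        return loudness, frequency
    # Step 404: cap at the threshold to avoid sound breaking.
    return LOUDNESS_THRESHOLD, THRESHOLD_FREQUENCY


quiet = adjust(10)   # microphone hanging down, low amplitude
loud = adjust(70)    # microphone close to the mouth, high amplitude
```

The two calls reproduce the examples in the text: an amplitude of 10 is adjusted to 40 for both values, and an amplitude of 70 is capped at 100 instead of 110.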

In the embodiment of the invention, if the mobile terminal is in an earphone call mode, there are two ways of making the call. The first is to hold the microphone of the earphone close to the mouth, which is prone to sound breaking. The second is to let the microphone hang down normally, in which case the call volume is low. In the prior art, AGC self-adaptive gain adjustment is used to accommodate both ways, that is, to increase the call volume while preventing sound breaking. However, AGC self-adaptive gain adjustment amplifies the whole voice signal while increasing the call volume, so the environmental noise in the voice signal is also amplified and the call quality is reduced.

Referring to Fig. 5, a flowchart of an earphone call adaptive adjustment method according to a second embodiment of the present invention is shown, including:

step 501, collecting voice signals in an environment when the mobile terminal is in an earphone call mode;

step 502, analyzing the voice signal, and matching the voiceprint features obtained by analysis against a preset voiceprint feature recognition library to obtain target voiceprint features;

step 503, determining the voice amplitude of the sound in the voice signal to which the target voiceprint feature belongs;

step 504, adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude;

step 505, extracting sounds belonging to the voiceprint features except the target voiceprint feature from the voice signal to obtain an interference voice signal;

and step 506, performing noise reduction processing on the interference voice signal.

It is to be understood that steps 501 to 504 are similar to those described in steps 201 to 204 of the first embodiment, and specific reference may be made to the first embodiment, which is not described herein again.

In the embodiment of the present invention, after the sound to which the target voiceprint feature belongs is adjusted, other sounds may also be processed in order to further improve the call quality. Specifically, the adjusting device extracts, from the voice signal, the sounds belonging to voiceprint features other than the target voiceprint feature to obtain an interference voice signal. For example, if the voice signal includes the voice of the caller and the sound of an advertisement played by a television, the voice of the caller is the sound to which the target voiceprint feature belongs, and the adjusting device extracts the sound of the advertisement played by the television from the voice signal as the interference voice signal. Further, the adjusting device performs noise reduction processing on the interference voice signal, so that after the adjusted voice signal is sent to the call object at the other end, the call object hears a voice signal in which the effective signal (namely, the voice of the caller) is clearer and has a proper volume, and the invalid signal (namely, the interference voice signal) is weaker.

The noise reduction process may be implemented in various manners, such as a noise gate noise reduction method, a sampling noise reduction method, a filtering noise reduction method, and so on.
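As an illustration of the first of these options, the following is a minimal sketch of a per-sample noise gate. The `threshold` and `attenuation` parameters and the sample values are assumptions chosen for the example; a practical noise gate would normally act on a smoothed signal envelope rather than on individual samples.

```python
def noise_gate(samples, threshold, attenuation=0.0):
    """Attenuate every sample whose magnitude is below threshold.

    threshold and attenuation are illustrative parameters, not values
    taken from the embodiment.
    """
    return [s if abs(s) >= threshold else s * attenuation for s in samples]

# Hypothetical interference voice signal: low-level samples are gated out.
interference = [0.02, -0.01, 0.30, -0.25, 0.03]
gated = noise_gate(interference, threshold=0.05)
```

Samples at or above the threshold pass unchanged, while quieter samples are scaled by `attenuation`, which defaults to full muting.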

In the embodiment of the invention, after the sound to which the target voiceprint feature belongs in the voice signal is adjusted, the noise reduction processing is further carried out on the interference voice signal in the voice signal, so that the call quality is further improved, and the call experience is improved.

Referring to fig. 6, which is a schematic structural diagram of an adaptive earphone call adjustment device according to a third embodiment of the present invention, the device includes an acquisition module 601, an analysis matching module 602, a determination module 603, and an adjustment module 604, specifically:

the acquisition module 601 is used for acquiring voice signals in an environment when the mobile terminal is in an earphone conversation mode;

During a call, if the mobile terminal is in the earphone call mode, a microphone on the mobile terminal collects the voice signal in the environment, and the acquisition module 601 obtains the voice signal collected by the microphone in real time. It can be understood that the voice signal contains at least the voice of the caller of the mobile terminal; if other sounds exist in the environment, the microphone collects them as well.

The analysis matching module 602 is configured to analyze the voice signal, and to obtain a target voiceprint feature, by matching through a preset voiceprint feature recognition library, from the voiceprint features obtained through analysis;

In the embodiment of the present invention, for the collected voice signal, the analysis matching module 602 analyzes the voice signal, matches the voiceprint features obtained through analysis with a preset voiceprint feature recognition library, recognizes which voiceprint feature is the voiceprint feature of the current caller from the voiceprint features obtained through analysis, and uses the recognized voiceprint feature as the target voiceprint feature.

A determining module 603, configured to determine a speech amplitude of a sound in the speech signal to which the target voiceprint feature belongs;

the adjusting module 604 is configured to adjust a loudness value and a frequency value of a sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude.

In this embodiment of the present invention, the determining module 603 determines a speech amplitude of the sound of the target voiceprint feature in the speech signal, where the sound of the target voiceprint feature is the sound of the caller, and the speech amplitude is an average value of amplitudes in a sound wave formed by the sound of the caller or a minimum value of the amplitudes.
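The two definitions of the voice amplitude given above can be sketched as follows. The source does not say how the individual amplitudes of the sound wave are obtained; this sketch assumes that the signal is split into fixed-length frames and that the peak absolute value of each frame is taken as one amplitude. The function names and the `frame_len` parameter are hypothetical.

```python
def frame_peaks(samples, frame_len):
    """Peak absolute value of each full frame (framing is an assumption)."""
    return [
        max(abs(s) for s in samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def speech_amplitude(samples, frame_len=4, mode="mean"):
    """Mean or minimum of the per-frame peak amplitudes."""
    peaks = frame_peaks(samples, frame_len)
    if not peaks:
        raise ValueError("need at least one full frame")
    if mode == "mean":
        return sum(peaks) / len(peaks)
    if mode == "min":
        return min(peaks)
    raise ValueError("mode must be 'mean' or 'min'")

# Hypothetical caller samples: two frames with peaks 0.4 and 0.6.
caller = [0.1, -0.4, 0.2, 0.0, 0.3, -0.6, 0.1, 0.2]
mean_amplitude = speech_amplitude(caller, mode="mean")
min_amplitude = speech_amplitude(caller, mode="min")
```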

The adjusting module 604 adjusts the loudness value and the frequency value of the sound belonging to the target voiceprint feature in the voice signal according to the voice amplitude.

In the embodiment of the present invention, when the mobile terminal is in the earphone call mode, the acquisition module 601 acquires a voice signal in the environment; the analysis matching module 602 analyzes the voice signal and obtains a target voiceprint feature, by matching through a preset voiceprint feature recognition library, from the voiceprint features obtained through analysis; the determination module 603 determines the voice amplitude of the sound in the voice signal to which the target voiceprint feature belongs; and the adjustment module 604 adjusts the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude. Compared with the prior art, in the earphone call mode the target voiceprint feature is matched from the collected voice signal through the preset voiceprint feature recognition library, and the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are adjusted. Compared with the AGC adaptive gain adjustment mode, this increases the call volume while effectively avoiding the amplification of environmental noise, and improves the call quality.

Referring to fig. 7, a detailed structural diagram of an analysis matching module 602 in the third embodiment of the present invention includes an analysis unit 701, a first search unit 702, and a determining unit 703, specifically:

an analyzing unit 701, configured to analyze the voice signal and obtain voiceprint features of sounds from different sources in the voice signal;

A first searching unit 702, configured to search the voiceprint feature identification library, and determine whether a voiceprint feature matching the voiceprint feature in the voiceprint feature identification library exists in the voiceprint features of each sound;

A determining unit 703, configured to determine, if there is a matched voiceprint feature, the matched voiceprint feature as the target voiceprint feature;

In the embodiment of the present invention, when the mobile terminal is in the earphone call mode, a voice signal in the environment is collected. The parsing unit 701 parses the voice signal and obtains the voiceprint features of the sounds from different sources in the voice signal. The first searching unit 702 searches the voiceprint feature recognition library and determines whether, among the voiceprint features of the sounds, there is a voiceprint feature that matches a voiceprint feature in the voiceprint feature recognition library. If a matched voiceprint feature exists, the determining unit 703 determines the matched voiceprint feature as the target voiceprint feature. The voice amplitude of the sound in the voice signal to which the target voiceprint feature belongs is then determined, and the loudness value and the frequency value of that sound are adjusted according to the voice amplitude. Compared with the prior art, in the earphone call mode the target voiceprint feature is matched from the collected voice signal through the preset voiceprint feature recognition library, and the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are adjusted. Compared with the AGC adaptive gain adjustment mode, this increases the call volume while effectively avoiding the amplification of environmental noise, and improves the call quality.
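The search-and-match behaviour of units 701 to 703 can be sketched as below. The source does not specify how voiceprint features are represented or compared; this sketch assumes fixed-length feature vectors compared by cosine similarity against a hypothetical acceptance threshold `min_similarity`. The identifiers and vector values are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_target_voiceprint(source_features, library, min_similarity=0.9):
    """Return (source_id, library_id) of the best match, or None if no match."""
    best = None
    best_score = min_similarity
    for source_id, feature in source_features.items():
        for library_id, enrolled in library.items():
            score = cosine_similarity(feature, enrolled)
            if score >= best_score:
                best, best_score = (source_id, library_id), score
    return best

# Hypothetical recognition library and per-source features.
library = {"owner": [0.9, 0.1, 0.4]}
sources = {"source_a": [0.88, 0.12, 0.41], "source_b": [0.1, 0.9, 0.2]}
match = find_target_voiceprint(sources, library)
```

When no source reaches the threshold, the function returns `None`, which corresponds to the case in which no matched voiceprint feature exists.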

Referring to fig. 8, a detailed structural diagram of the adjusting module 604 in the third embodiment of the present invention includes a second searching unit 801, a determining unit 802, a first adjusting unit 803, and a second adjusting unit 804, specifically:

a second searching unit 801, configured to search a preset parameter adjustment table, and determine a target loudness value and a target frequency value corresponding to the voice amplitude, where the parameter adjustment table includes a mapping relationship between the voice amplitude, the loudness value, and the frequency value;

a determining unit 802, configured to determine whether the target loudness value is smaller than or equal to a preset threshold;

A first adjusting unit 803, configured to adjust the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the target loudness value and the target frequency value, respectively, if the target loudness value is less than or equal to the preset threshold;

A second adjusting unit 804, configured to adjust the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the preset threshold and the preset frequency value corresponding to the preset threshold, respectively, if the target loudness value is greater than the preset threshold.

In the embodiment of the invention, when the local speaker brings the earphone close to the mouth, the target loudness value may be greater than the preset threshold. If the loudness value and the frequency value of the sound to which the target voiceprint feature belongs were then adjusted to the target loudness value and the target frequency value, the sound breaking phenomenon would occur when the opposite-end speaker receives the voice signal. In this case, the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are therefore adjusted to the preset threshold and the preset frequency value corresponding to the preset threshold, respectively. For example, if the preset threshold is 100, the voice amplitude is 70, and the corresponding loudness value and frequency value are both 110, then because 110 is greater than 100, the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are each adjusted to 100. By determining the relationship between the target loudness value and the preset threshold, the determining unit 802 allows the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to be adjusted accurately, which enhances that sound, improves the call quality, and prevents the sound breaking phenomenon when the receiving party listens to the voice signal.
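The table lookup and threshold comparison performed by units 801 to 804 can be sketched as follows, using the numbers of the example above. Only the entry for a voice amplitude of 70 comes from the text; the other table entry, the dictionary representation and all names are assumptions made for illustration.

```python
PRESET_THRESHOLD = 100
PRESET_FREQUENCY = 100  # frequency value paired with the threshold, per the example

# Hypothetical parameter adjustment table:
# voice amplitude -> (target loudness value, target frequency value).
# Only the entry for amplitude 70 is taken from the text's example.
PARAMETER_TABLE = {
    30: (60, 60),
    70: (110, 110),
}

def adjust(voice_amplitude, table=PARAMETER_TABLE):
    """Return the (loudness, frequency) values to apply to the target sound."""
    target_loudness, target_frequency = table[voice_amplitude]
    if target_loudness <= PRESET_THRESHOLD:
        # First adjusting unit: use the table values directly.
        return target_loudness, target_frequency
    # Second adjusting unit: cap at the threshold to avoid sound breaking.
    return PRESET_THRESHOLD, PRESET_FREQUENCY

capped = adjust(70)      # table gives (110, 110), above the threshold
unchanged = adjust(30)   # table gives (60, 60), within the threshold
```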

Referring to fig. 9, which is a schematic structural diagram of an adaptive earphone call adjustment apparatus according to a fourth embodiment of the present invention, the apparatus includes an acquisition module 601, an analysis matching module 602, a determination module 603, an adjustment module 604, an extraction module 901, and a noise reduction module 902, specifically:

an adjusting module 604, configured to adjust a loudness value and a frequency value of a sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude;

The acquisition module 601, the analysis matching module 602, the determination module 603, and the adjustment module 604 in this embodiment of the present invention are consistent with those described in the third embodiment, and are not described again here.

An extracting module 901, configured to extract, from the voice signal, sounds to which other voiceprint features except the target voiceprint feature belong, so as to obtain an interference voice signal;

and a noise reduction module 902, configured to perform noise reduction processing on the interference speech signal.

In the embodiment of the present invention, after the sound to which the target voiceprint feature belongs is adjusted, other sounds may also be processed in order to further improve the call quality. Specifically, the extracting module 901 extracts, from the voice signal, the sounds belonging to voiceprint features other than the target voiceprint feature to obtain an interference voice signal. For example, if the voice signal includes the voice of the caller and the sound of an advertisement played by a television, the voice of the caller is the sound to which the target voiceprint feature belongs, and the extracting module 901 extracts the sound of the advertisement played by the television from the voice signal as the interference voice signal. Further, the noise reduction module 902 performs noise reduction processing on the interference voice signal, so that after the adjusted voice signal is sent to the call object at the other end, the effective signal (i.e., the voice of the caller) in the voice signal received by the call object is clearer and has a proper volume, and the invalid signal (i.e., the interference voice signal) is weaker.
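The cooperation of the extracting module 901 and the noise reduction module 902 can be sketched as below. The sketch assumes that the sounds of the different sources have already been separated into equal-length sample lists keyed by a source identifier, which the source text does not describe, and it uses a simple fixed attenuation in place of a real noise reduction method. All names and values are hypothetical.

```python
def suppress_interference(sources, target_id, attenuation=0.1):
    """Remix the target source at full level with all other sources attenuated.

    sources maps a source identifier to a list of samples; all lists are
    assumed to have the same length.
    """
    if target_id not in sources:
        raise KeyError(target_id)
    length = len(sources[target_id])
    output = [0.0] * length
    for source_id, samples in sources.items():
        # Non-target sources form the interference voice signal.
        scale = 1.0 if source_id == target_id else attenuation
        for i in range(length):
            output[i] += scale * samples[i]
    return output

# Hypothetical separated sources: the caller and a television advertisement.
separated = {"caller": [0.5, -0.5], "television": [0.2, 0.2]}
cleaned = suppress_interference(separated, target_id="caller")
```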

In the embodiment of the present invention, after the adjusting module 604 adjusts the sound to which the target voiceprint feature belongs in the voice signal, the extracting module 901 extracts the interference voice signal from the voice signal and the noise reduction module 902 performs noise reduction processing on it, so as to further improve the call quality and the call experience.

The embodiment of the present invention further provides a mobile terminal, which includes a memory, a processor, and a computer program stored in the memory and capable of running on the processor, and when the processor executes the computer program, the steps of the earphone call adaptive adjustment method in any one of the first embodiment to the second embodiment are implemented.

The embodiment of the present invention further provides a storage medium, which may specifically be a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the steps in the adaptive earphone call adjustment method in any one of the first to second embodiments are implemented.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the modules is merely a logical division, and in actual implementation there may be other divisions; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be in an electrical, mechanical, or other form.

The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.

In addition, functional modules in the embodiments of the present invention may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.

The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other various media capable of storing program code.

It should be noted that, for simplicity of description, the above method embodiments are described as a series of combined acts, but those skilled in the art should understand that the present invention is not limited by the described order of acts, as some steps may be performed in other orders or simultaneously according to the present invention. Further, those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and that the acts and modules involved are not necessarily required by the present invention.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.

The foregoing describes the embodiments of the present invention. For a person skilled in the art, the specific implementation and the scope of application may vary according to the idea of the embodiments of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. A method for adaptive adjustment of headset calls is characterized by comprising the following steps:

adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude;

the step of adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude comprises:

searching a preset parameter adjusting table, and determining a target loudness value and a target frequency value corresponding to the voice amplitude value, wherein the parameter adjusting table comprises a mapping relation among the voice amplitude value, the loudness value and the frequency value;

judging whether the target loudness value is smaller than or equal to a preset threshold value or not;

if the target loudness is smaller than or equal to the preset threshold value, adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the target loudness value and the target frequency value respectively.

2. The method according to claim 1, wherein the step of parsing the voice signal and obtaining the target voiceprint feature by matching from the parsed voiceprint features through a preset voiceprint feature recognition library comprises:

analyzing the voice signal to obtain the voiceprint characteristics of each sound of different sources in the voice signal;

searching the voiceprint feature recognition library, and judging whether voiceprint features matched with the voiceprint features in the voiceprint feature recognition library exist in the voiceprint features of the voices or not;

and if the matched voiceprint features exist, determining the matched voiceprint features as the target voiceprint features.

3. The method of claim 1, wherein the step of adjusting the loudness value and the frequency value of the sound of the target voiceprint feature in the speech signal according to the speech amplitude further comprises:

if the target loudness is greater than the preset threshold value, the loudness value and the frequency value of the sound to which the target voiceprint feature belongs are respectively adjusted to the preset threshold value and a preset frequency value corresponding to the preset threshold value.

4. A method according to any one of claims 1 to 3, characterized in that the method further comprises:

extracting sounds belonging to other voiceprint features except the target voiceprint feature from the voice signal to obtain an interference voice signal;

and carrying out noise reduction processing on the interference voice signal.

5. An adaptive earphone call adjustment device, comprising:

the adjusting module is used for adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs in the voice signal according to the voice amplitude;

the adjustment module includes:

the second searching unit is used for searching a preset parameter adjusting table and determining a target loudness value and a target frequency value corresponding to the voice amplitude value, wherein the parameter adjusting table comprises a mapping relation among the voice amplitude value, the loudness value and the frequency value;

the judging unit is used for judging whether the target loudness value is smaller than or equal to a preset threshold value or not;

and the first adjusting unit is used for adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the target loudness value and the target frequency value respectively if the target loudness is less than or equal to the preset threshold value.

6. The apparatus of claim 5, wherein the parsing matching module comprises:

the analysis unit is used for analyzing the voice signal and acquiring the voiceprint characteristics of the sounds of different sources in the voice signal;

the first searching unit is used for searching the voiceprint feature identification library and judging whether voiceprint features matched with the voiceprint features in the voiceprint feature identification library exist in the voiceprint features of all the sounds;

and the determining unit is used for determining the matched voiceprint features as the target voiceprint features if the matched voiceprint features exist.

7. The apparatus of claim 5, wherein the adjustment module further comprises:

and the second adjusting unit is used for adjusting the loudness value and the frequency value of the sound to which the target voiceprint feature belongs to the preset threshold value and the preset frequency value corresponding to the preset threshold value respectively if the target loudness is greater than the preset threshold value.

8. The apparatus of any one of claims 5 to 7, further comprising:

the extraction module is used for extracting sounds to which other voiceprint features except the target voiceprint feature belong from the voice signal to obtain an interference voice signal;

and the noise reduction module is used for carrying out noise reduction processing on the interference voice signal.

9. A mobile terminal, comprising: memory, processor and computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the adaptive headset talk adjustment method according to any one of claims 1 to 4 when executing the computer program.

10. A storage medium which is a computer-readable storage medium and on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the headset talk adaptive adjustment method according to any one of claims 1 to 4.