CN112235462A - Voice adjusting method, system, electronic equipment and computer readable storage medium - Google Patents

Info

Publication number
CN112235462A
Authority
CN
China
Prior art keywords
voice
noise
voice information
downlink
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011098893.XA
Other languages
Chinese (zh)
Inventor
倪卫峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Spreadtrum Communications Shanghai Co Ltd
Original Assignee
Spreadtrum Communications Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-10-14
Filing date
2020-10-14
Publication date
2021-01-15
Application filed by Spreadtrum Communications Shanghai Co Ltd
Priority to CN202011098893.XA
Publication of CN112235462A

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M9/00: Arrangements for interconnection not involving centralised switching
    • H04M9/08: Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic

Abstract

The invention discloses a voice adjusting method, a system, electronic equipment and a computer-readable storage medium. The voice adjusting method comprises: collecting voice information during a call; detecting the voice information and, if the voice information is noise voice, acquiring the noise energy of the noise voice; adjusting the voice parameters of the downlink voice information received over the downlink communication link according to the noise energy; and playing the adjusted downlink voice information. The method adapts automatically to different scenes: the loudness of the downlink voice played by the device follows the noise environment, rising when the noise is high and falling when the noise is low. A suitable signal-to-noise ratio is thus always maintained, so that the listener can clearly understand the downlink voice.

Description

Voice adjusting method, system, electronic equipment and computer readable storage medium
Technical Field
The invention belongs to the technical field of mobile communication, and particularly relates to a voice adjusting method, a voice adjusting system, electronic equipment and a computer readable storage medium.
Background
When people are in noisy scenes such as supermarkets, subways, intersections and KTVs, the high environmental noise makes a voice call sound indistinct, and the listener cannot make out what the other party is saying.
Disclosure of Invention
To overcome the defect in the prior art that the call experience in a noisy environment remains very poor even when the volume of the mobile phone is turned up to the maximum, the invention provides a voice adjusting method, a system, electronic equipment and a computer-readable storage medium.
The invention solves the above technical problem through the following technical solution:
A voice adjusting method, comprising:
collecting voice information in the communication process;
detecting the voice information, and if the voice information is noise voice, acquiring noise energy of the noise voice;
adjusting the voice parameters of the downlink voice information received by the downlink communication link according to the noise energy;
and playing the adjusted downlink voice information.
Preferably, the step of detecting the voice information specifically includes:
detecting the voice information based on a voice activity detection algorithm.
Preferably, the step of collecting voice information during a call specifically comprises:
collecting one frame of voice information once every preset time period during the call;
the step of detecting the voice information and, if the voice information is noise voice, acquiring the noise energy of the noise voice specifically comprises:
detecting whether the current frame of voice information is noise voice;
if so, acquiring the noise energy of the current frame of noise voice;
the step of adjusting the voice parameters of the downlink voice information received by the downlink communication link according to the noise energy specifically comprises:
adjusting, according to the noise energy of the current frame of noise voice, the voice parameters of the downlink voice information in the same time period as the current frame of noise voice.
Preferably, after the step of detecting the voice information, the voice adjusting method further includes:
and if the voice information is the call voice, sending the call voice through an uplink communication link.
Preferably, the voice parameters include volume and frequency response.
A voice adjusting system comprises an acquisition module, a detection module, an adjusting module and a playing module;
the acquisition module is used for acquiring voice information in the call process;
the detection module is used for detecting the voice information, and if the voice information is noise voice, noise energy of the noise voice is obtained;
the adjusting module is used for adjusting the voice parameters of the downlink voice information received by the downlink communication link according to the noise energy;
the playing module is used for playing the adjusted downlink voice information.
Preferably, the detection module is configured to detect the voice information based on a voice activity detection algorithm.
Preferably, the acquisition module is configured to acquire one frame of voice information once every preset time period during a call;
the detection module comprises a detection unit and a noise energy acquisition unit;
the detection unit is used for detecting whether the current frame voice information is noise voice, and if so, the noise energy acquisition unit is called;
the noise energy acquisition unit acquires the noise energy of the noise voice of the current frame;
the adjusting module is used for adjusting the voice parameters of the downlink voice information in the same time period as the current frame noise voice according to the noise energy of the current frame noise voice.
Preferably, the voice adjusting system further comprises a sending module;
the detection module is used for calling the sending module when the voice information is a call voice;
and the sending module is used for sending the call voice out through an uplink communication link.
Preferably, the voice parameters include volume and frequency response.
An electronic device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the voice adjusting method when executing the computer program.
A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the above voice adjusting method.
The positive effects of the invention are as follows. The method adapts automatically to different scenes, and the loudness of the downlink voice played by the device follows the noise environment: when the noise is high, the loudness of the downlink voice is increased, and when the noise is low, it is reduced. A suitable signal-to-noise ratio is thus always maintained, so that the listener can clearly understand the downlink voice.
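The source states the goal of keeping a suitable signal-to-noise ratio but gives no formula. One possible reading, offered only as an illustrative assumption, treats the downlink speech level S, the measured noise level N and the applied gain G as decibel quantities and clamps the gain to a permitted range. The symbols S, N, G, G_min, G_max and SNR_target are not from the source.

```latex
\mathrm{SNR}_{\mathrm{dB}} = (S + G) - N,
\qquad
G = \min\bigl(G_{\max},\ \max\bigl(G_{\min},\ \mathrm{SNR}_{\mathrm{target}} + N - S\bigr)\bigr)
```

Under this reading, a 10 dB rise in noise calls for a 10 dB rise in downlink gain until the upper clamp is reached, and a fall in noise lowers the gain in the same way.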
Drawings
Fig. 1 is a flowchart of a voice adjusting method according to embodiment 1 of the present invention.
Fig. 2 is a flowchart illustrating a voice call process according to embodiment 1 of the present invention.
Fig. 3 is a flowchart of a voice adjusting method according to embodiment 2 of the present invention.
Fig. 4 is a block diagram of a voice adjustment system according to embodiment 3 of the present invention.
Fig. 5 is a schematic structural diagram of an electronic device according to embodiment 4 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
A voice adjusting method, as shown in fig. 1, comprising:
step 10, collecting voice information during a call;
step 20, detecting the voice information; if the voice information is noise voice, executing step 30, and if the voice information is call voice, executing step 60; specifically, the voice information is detected based on a voice activity detection algorithm;
step 30, acquiring the noise energy of the noise voice;
step 40, adjusting the voice parameters of the downlink voice information received by the downlink communication link according to the noise energy, where the voice parameters include volume and frequency response;
step 50, playing the adjusted downlink voice information.
After the step of detecting the voice information, the voice adjusting method further includes:
and step 60, sending the call voice through an uplink communication link.
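The branch taken in steps 20, 30 and 60 can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the source does not say which voice activity detection algorithm is used, so `is_noise_frame` is a hypothetical energy-threshold stand-in, and the names `frame_energy`, `route_frame` and the threshold value are assumptions.

```python
from typing import List, Sequence, Tuple, Union


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)


def is_noise_frame(frame: Sequence[float], speech_threshold: float = 0.01) -> bool:
    """Hypothetical stand-in for VAD: frames below the threshold count as noise.

    Assumption: near-field speech is much louder at the microphone than
    far-field noise. A real VAD would use more than raw energy.
    """
    return frame_energy(frame) < speech_threshold


def route_frame(frame: Sequence[float]) -> Tuple[str, Union[float, List[float]]]:
    """Step 20: classify the frame, then branch to step 30 or step 60."""
    if is_noise_frame(frame):
        # Step 30: the noise energy is what drives the downlink adjustment.
        return ("noise", frame_energy(frame))
    # Step 60: call voice is forwarded to the uplink unchanged.
    return ("speech", list(frame))
```

A caller would pass the returned energy to the downlink adjustment (step 40) and hand returned speech frames to the uplink encoder.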
Taking a mobile phone call as an example, fig. 2 shows a flow diagram of the voice call process, which is divided into an uplink path and a downlink path:
Uplink path: voice information (noise or speech) enters the Audio Codec through the MIC (microphone), is amplified by the PGA (programmable gain amplifier) and is digitized by the ADC (analog-to-digital converter). It then enters the DSP (digital signal processing module) in frames of 20 ms each, where the VAD (voice activity detector) determines whether each frame is a noise frame or a voice frame.
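The 20 ms framing described above can be sketched as follows. The source gives only the frame duration; the sample rates used in the usage note are assumed example values, and the function names are hypothetical.

```python
from typing import List, Sequence


def samples_per_frame(sample_rate_hz: int, frame_ms: int = 20) -> int:
    """Number of samples in one frame of frame_ms milliseconds."""
    return sample_rate_hz * frame_ms // 1000


def split_into_frames(
    samples: Sequence[float], sample_rate_hz: int, frame_ms: int = 20
) -> List[List[float]]:
    """Split PCM samples into consecutive full frames.

    A trailing partial frame is dropped; a streaming implementation would
    instead buffer it until the next samples arrive.
    """
    n = samples_per_frame(sample_rate_hz, frame_ms)
    if n <= 0:
        raise ValueError("frame length must be at least one sample")
    return [list(samples[i:i + n]) for i in range(0, len(samples) - n + 1, n)]
```

For example, at an assumed sample rate of 8000 Hz a 20 ms frame holds 160 samples, and at 16000 Hz it holds 320.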
A voice frame carries the voice signal to be transmitted to the other party, generally near-field speech. It passes straight through Noise Estimate, then goes through Tx Process (uplink signal processing), which includes noise suppression, echo suppression, automatic volume adjustment and the like, and is finally encoded by the Voice Encoder and transmitted.
A noise frame carries environmental noise, generally far-field noise. In this case Noise Estimate measures the noise energy and sends the value to the MBDRC (multi-band dynamic range controller) and the EQ (equalizer) on the downlink, whose voice parameters are adjusted automatically to raise or lower the volume and frequency response of the downlink. The modified signal passes through the DAC (digital-to-analog converter) and the PGA (programmable gain amplifier) of the Audio Codec and is then played by the loudspeaker.
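How a measured noise energy might be turned into a downlink volume change can be sketched as below. The source does not disclose the mapping used by the MBDRC or the EQ, so the linear interpolation, every threshold and gain value, and all function names here are assumptions chosen for illustration. Only the volume is modeled; a multi-band version would apply a mapping of this kind per frequency band.

```python
import math
from typing import List, Sequence


def energy_to_db(energy: float, floor_db: float = -100.0) -> float:
    """Convert mean-square energy (full scale = 1.0) to dB, with a floor for silence."""
    if energy <= 0.0:
        return floor_db
    return max(floor_db, 10.0 * math.log10(energy))


def downlink_gain_db(
    noise_db: float,
    quiet_db: float = -60.0,    # assumed level treated as a quiet room
    loud_db: float = -20.0,     # assumed level treated as a very noisy scene
    min_gain_db: float = -6.0,  # assumed gain applied in quiet conditions
    max_gain_db: float = 12.0,  # assumed gain applied in loud conditions
) -> float:
    """Map noise level to a gain by linear interpolation, clamped at both ends."""
    if noise_db <= quiet_db:
        return min_gain_db
    if noise_db >= loud_db:
        return max_gain_db
    t = (noise_db - quiet_db) / (loud_db - quiet_db)
    return min_gain_db + t * (max_gain_db - min_gain_db)


def apply_gain(frame: Sequence[float], gain_db: float) -> List[float]:
    """Scale every sample of a downlink frame by the given gain in dB."""
    scale = 10.0 ** (gain_db / 20.0)
    return [s * scale for s in frame]
```

With these assumed values, more noise yields more gain and less noise yields less, which matches the behaviour the description attributes to the downlink.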
In the figure, Voice Decoder performs voice decoding and Rx Process performs downlink signal processing. In all cases, the loudness emitted by the loudspeaker is limited so that it cannot exceed the maximum value specified by 3GPP.
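The loudness cap can be illustrated with a simple peak limiter. This is a sketch under stated assumptions: the source cites a 3GPP maximum but gives no value, and real limits are defined on acoustic loudness at the receiver rather than on sample values, so the digital `ceiling` below is a hypothetical placeholder and `limit_peak` is an invented name.

```python
from typing import List, Sequence


def limit_peak(frame: Sequence[float], ceiling: float = 0.9) -> List[float]:
    """Scale a frame down so that no sample exceeds the ceiling in magnitude.

    Frames already within the ceiling are returned unchanged.
    """
    if ceiling <= 0.0:
        raise ValueError("ceiling must be positive")
    peak = max((abs(s) for s in frame), default=0.0)
    if peak <= ceiling:
        return list(frame)
    scale = ceiling / peak
    return [s * scale for s in frame]
```

Applying the limiter after the noise-dependent gain keeps the adjustment from pushing the output past the cap, however high the measured noise is.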
This embodiment adapts automatically to different scenes: the loudness of the downlink voice played by the device follows the noise environment. When the noise is high, the loudness of the downlink voice is increased; when the noise is low, it is reduced. A suitable signal-to-noise ratio is thus always maintained, so that the listener can clearly understand the downlink voice.
Example 2
This embodiment is a further improvement on embodiment 1. As shown in fig. 3, step 10 specifically comprises:
step 101, collecting one frame of voice information once every preset time period during a call;
further, step 20 specifically comprises: step 201, detecting whether the current frame of voice information is noise voice, and if so, executing step 301;
step 301, acquiring the noise energy of the current frame of noise voice;
step 401, adjusting, according to the noise energy of the current frame of noise voice, the voice parameters of the downlink voice information in the same time period as the current frame of noise voice, and then executing step 50.
In this embodiment, the voice information is processed by reading one frame of data once every preset time period, for example one 20 ms frame as in embodiment 1, and then adjusting the downlink voice information in the same time period as the noise voice.
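The frame-by-frame pairing of steps 101 to 401 can be sketched as a loop over time-aligned uplink and downlink frames. The detector, the energy measure and the gain mapping are passed in as callables because the source does not fix them. Keeping the most recent gain during speech frames is an assumption, since the source does not say what happens to the downlink then. All names are hypothetical.

```python
from typing import Callable, Iterable, List, Sequence

Frame = Sequence[float]


def adjust_downlink_stream(
    uplink_frames: Iterable[Frame],
    downlink_frames: Iterable[Frame],
    is_noise: Callable[[Frame], bool],
    noise_energy: Callable[[Frame], float],
    gain_for_energy: Callable[[float], float],
    initial_gain: float = 1.0,
) -> List[List[float]]:
    """Adjust downlink frame i using the noise measured in uplink frame i."""
    gain = initial_gain
    adjusted: List[List[float]] = []
    for up, down in zip(uplink_frames, downlink_frames):
        if is_noise(up):                              # step 201
            gain = gain_for_energy(noise_energy(up))  # steps 301 and 401
        # Assumption: on a speech frame the most recent gain is kept.
        adjusted.append([s * gain for s in down])
    return adjusted
```

Here `gain_for_energy` returns a linear scale factor; the adjusted frames would then be passed on for playback (step 50).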
Example 3
A voice adjusting system, as shown in fig. 4, includes an acquisition module 1, a detection module 2, an adjusting module 3 and a playing module 4;
the acquisition module 1 is used for acquiring voice information in the call process;
the detection module 2 is configured to detect the voice information, and if the voice information is a noise voice, obtain noise energy of the noise voice; specifically, the detection module 2 is configured to detect the voice information based on a voice activity detection algorithm.
The adjusting module 3 is used for adjusting the voice parameters of the downlink voice information received by the downlink communication link according to the noise energy; the speech parameters include volume and frequency response.
And the playing module 4 is used for playing the adjusted downlink voice information.
The voice adjusting system also comprises a sending module 5;
the detection module 2 is used for calling the sending module 5 when the voice information is a call voice;
the sending module 5 is configured to send the call voice through an uplink communication link.
In this embodiment, the detection module 2 includes a detection unit 21 and a noise energy obtaining unit 22; the acquisition module 1 is used for acquiring one frame of voice information once every preset time period during a call;
the detecting unit 21 is configured to detect whether the current frame speech information is a noise speech, and if so, invoke the noise energy obtaining unit 22;
the noise energy obtaining unit 22 obtains the noise energy of the noise voice of the current frame;
the adjusting module 3 is configured to adjust a speech parameter of the downlink speech information in the same time period as the current frame noise speech according to the noise energy of the current frame noise speech.
This embodiment adapts automatically to different scenes: the loudness of the downlink voice played by the device follows the noise environment. When the noise is high, the loudness of the downlink voice is increased; when the noise is low, it is reduced. A suitable signal-to-noise ratio is thus always maintained, so that the listener can clearly understand the downlink voice.
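The module division of this embodiment can be sketched as a small class whose fields stand for modules 1 to 5. The interfaces are hypothetical: the source names the modules and what each is used for but not their signatures. Remembering the last noise energy so that the downlink stays adjusted during speech frames is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

Frame = Sequence[float]


@dataclass
class VoiceAdjustingSystem:
    """Wiring of modules 1 to 5; each field is a pluggable callable."""

    acquire: Callable[[], Frame]                   # acquisition module 1
    is_noise: Callable[[Frame], bool]              # detection unit 21
    noise_energy: Callable[[Frame], float]         # noise energy obtaining unit 22
    adjust: Callable[[Frame, float], List[float]]  # adjusting module 3
    play: Callable[[List[float]], None]            # playing module 4
    send: Callable[[Frame], None]                  # sending module 5
    last_energy: Optional[float] = None            # assumption: remembered between frames

    def step(self, downlink_frame: Frame) -> None:
        """Process one uplink frame and play the matching downlink frame."""
        uplink = self.acquire()
        if self.is_noise(uplink):
            self.last_energy = self.noise_energy(uplink)
        else:
            self.send(uplink)
        if self.last_energy is None:
            # No noise has been measured yet, so play the frame unadjusted.
            self.play(list(downlink_frame))
        else:
            self.play(self.adjust(downlink_frame, self.last_energy))
```

Passing the modules in as callables mirrors the remark in the description that the division into units and modules is exemplary rather than mandatory.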
Example 4
An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the speech adjustment method of embodiment 1 or 2 when executing the computer program.
Fig. 5 is a schematic structural diagram of an electronic device provided in this embodiment. Fig. 5 illustrates a block diagram of an exemplary electronic device 90 suitable for use in implementing embodiments of the present invention. The electronic device 90 shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 5, the electronic device 90 may take the form of a general purpose computing device, which may be a server device, for example. The components of the electronic device 90 may include, but are not limited to: at least one processor 91, at least one memory 92, and a bus 93 that connects the various system components (including the memory 92 and the processor 91).
The bus 93 includes a data bus, an address bus, and a control bus.
Memory 92 may include volatile memory, such as Random Access Memory (RAM)921 and/or cache memory 922, and may further include Read Only Memory (ROM) 923.
Memory 92 may also include a program tool 925 having a set (at least one) of program modules 924, such program modules 924 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The processor 91 executes various functional applications and data processing by running a computer program stored in the memory 92.
The electronic device 90 may also communicate with one or more external devices 94 (e.g., keyboard, pointing device, etc.). Such communication may be through an input/output (I/O) interface 95. Also, the electronic device 90 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via a network adapter 96. The network adapter 96 communicates with the other modules of the electronic device 90 via the bus 93. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 90, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems, etc.
It should be noted that although the above detailed description mentions several units/modules or sub-units/modules of the electronic device, this division is merely exemplary and not mandatory. Indeed, according to embodiments of the application, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided among a plurality of units/modules.
Example 5
A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the voice adjusting method of embodiment 1 or 2.
More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, random access memory, read-only memory, erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation, the invention can also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps of the voice adjusting method of embodiment 1 or 2.
The program code for carrying out the invention may be written in any combination of one or more programming languages, and may be executed entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (12)

1. A voice adjusting method, comprising:
collecting voice information in the communication process;
detecting the voice information, and if the voice information is noise voice, acquiring noise energy of the noise voice;
adjusting the voice parameters of the downlink voice information received by the downlink communication link according to the noise energy;
and playing the adjusted downlink voice information.
2. The voice adjusting method of claim 1, wherein the step of detecting the voice information specifically comprises:
detecting the voice information based on a voice activity detection algorithm.
3. The voice adjusting method according to claim 1, wherein the step of collecting voice information during a call specifically comprises:
collecting one frame of voice information once every preset time period during the call;
the detecting the voice information, and if the voice information is a noise voice, the acquiring the noise energy of the noise voice specifically includes:
detecting whether the current frame voice information is noise voice;
if so, acquiring the noise energy of the noise voice of the current frame;
the step of adjusting the voice parameter of the downlink voice information received by the downlink communication link according to the noise energy specifically includes:
and adjusting the voice parameters of the downlink voice information in the same time period with the current frame noise voice according to the noise energy of the current frame noise voice.
4. The voice adjusting method of claim 2, wherein after the step of detecting the voice information, the voice adjusting method further comprises:
if the voice information is call voice, sending the call voice through an uplink communication link.
5. The voice adjusting method of claim 1, wherein the voice parameters include volume and frequency response.
6. A voice adjusting system, comprising an acquisition module, a detection module, an adjusting module and a playing module;
the acquisition module is used for acquiring voice information in the call process;
the detection module is used for detecting the voice information, and if the voice information is noise voice, noise energy of the noise voice is obtained;
the adjusting module is used for adjusting the voice parameters of the downlink voice information received by the downlink communication link according to the noise energy;
the playing module is used for playing the adjusted downlink voice information.
7. The voice adjusting system of claim 6, wherein the detection module is used for detecting the voice information based on a voice activity detection algorithm.
8. The voice adjusting system of claim 6, wherein the acquisition module is used for acquiring one frame of voice information once every preset time period during a call;
the detection module comprises a detection unit and a noise energy acquisition unit;
the detection unit is used for detecting whether the current frame voice information is noise voice, and if so, the noise energy acquisition unit is called;
the noise energy acquisition unit acquires the noise energy of the noise voice of the current frame;
the adjusting module is used for adjusting the voice parameters of the downlink voice information in the same time period as the current frame noise voice according to the noise energy of the current frame noise voice.
9. The voice adjusting system of claim 7, wherein the voice adjusting system further comprises a sending module;
the detection module is used for calling the sending module when the voice information is a call voice;
and the sending module is used for sending the call voice out through an uplink communication link.
10. The voice adjusting system of claim 6, wherein the voice parameters include volume and frequency response.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the voice adjusting method of any one of claims 1 to 5 when executing the computer program.
12. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, carries out the steps of the voice adjusting method of any one of claims 1 to 5.
CN202011098893.XA 2020-10-14 2020-10-14 Voice adjusting method, system, electronic equipment and computer readable storage medium Pending CN112235462A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011098893.XA CN112235462A (en) 2020-10-14 2020-10-14 Voice adjusting method, system, electronic equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN112235462A (en) 2021-01-15

Family

ID=74112944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011098893.XA Pending CN112235462A (en) 2020-10-14 2020-10-14 Voice adjusting method, system, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN112235462A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210115