CN113299303A - Voice data processing method, device, storage medium and system - Google Patents

Publication number
CN113299303A
Authority
CN
China
Prior art keywords
voice
voice information
data processing
transmitted
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110476209.5A
Other languages
Chinese (zh)
Inventor
乔雪娜
陈乐�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Pingdingshan Juxin Network Technology Co ltd
Original Assignee
Pingdingshan Juxin Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pingdingshan Juxin Network Technology Co ltd filed Critical Pingdingshan Juxin Network Technology Co ltd
Priority to CN202110476209.5A priority Critical patent/CN113299303A/en
Publication of CN113299303A publication Critical patent/CN113299303A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • G10K11/1785 Methods, e.g. algorithms; Devices
    • G10K11/17853 Methods, e.g. algorithms; Devices of the filter
    • G10K11/17854 Methods, e.g. algorithms; Devices of the filter the filter being an adaptive filter

Abstract

The embodiments of the invention disclose a voice data processing method, device, storage medium and system. The method comprises the following steps: acquiring current voice information of a local terminal through a voice acquisition device; filtering and denoising the current voice information with an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted; and transmitting the voice information to be transmitted to the opposite terminal. Because the LMS or RLS adaptive filtering algorithm filters and denoises the current voice information, the noise is removed and the voice information transmitted to the opposite terminal contains only the speaker's voice. This removes the influence of noise on voice data transmission and improves the transmission quality of the voice data.

Description

Voice data processing method, device, storage medium and system
Technical Field
The invention relates to the technical field of data processing, and in particular to a voice data processing method, device, storage medium and system.
Background
In existing scenarios such as mobile phone voice calls and internet-based video conferences, noise often reduces the transmission quality of voice data, which degrades the user experience.
Disclosure of Invention
In view of the technical defects in the prior art, embodiments of the present invention provide a method, an apparatus, a storage medium, and a system for processing voice data.
In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a method for processing voice data, including:
acquiring current voice information of a local terminal through voice acquisition equipment;
filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted;
and transmitting the voice information to be transmitted to the opposite terminal.
Wherein, the current voice information comprises speaker voice and environmental noise; the voice information to be transmitted only includes speaker voice.
In some preferred embodiments of the present application, obtaining the voice information to be transmitted specifically includes:
performing analog-to-digital conversion on the current voice information and generating a voice signal to be processed;
and filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.
In a second aspect, an embodiment of the present invention provides a speech data processing apparatus, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method of the first aspect.
In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, where the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method of the first aspect.
In a fourth aspect, an embodiment of the present invention provides another speech data processing apparatus, including:
the acquisition module is used for acquiring the current voice information of the local terminal through the voice acquisition equipment;
the processing module is used for filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted;
and the transmission module is used for transmitting the voice information to be transmitted to the opposite terminal.
In some preferred embodiments of the present application, the apparatus further includes a preprocessing module, configured to perform analog-to-digital conversion on the current speech information and generate a speech signal to be processed;
the processing module is specifically configured to:
and filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.
The preprocessing module comprises an analog-to-digital conversion chip, and the analog-to-digital conversion chip adopts an AD9125 dual-channel 16-bit bandwidth chip.
In a fifth aspect, an embodiment of the present invention further provides a voice data processing system, which includes a home terminal, a voice acquisition device, a voice data processing apparatus, and an opposite terminal. Wherein the voice data processing apparatus is as described above.
In certain preferred embodiments of the present application, the voice capturing device includes a microphone or an array of a plurality of equally spaced microphones.
By implementing the embodiments of the invention, the LMS or RLS adaptive filtering algorithm filters and denoises the current voice information, so that the noise is removed and the voice information transmitted to the opposite terminal contains only the speaker's voice. This removes the influence of noise on voice data transmission and improves the transmission quality of the voice data.
Drawings
In order to more clearly illustrate the detailed description of the invention or the technical solutions in the prior art, the drawings that are needed in the detailed description of the invention or the prior art will be briefly described below.
FIG. 1 is a flow chart of a method for processing voice data according to an embodiment of the present invention;
FIG. 2 is a block diagram of a voice data processing system provided by an embodiment of the present invention;
FIG. 3 is a schematic diagram of an architecture of the speech data processing apparatus of FIG. 2;
FIG. 4 is a schematic diagram of another structure of the speech data processing apparatus in FIG. 2.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should be noted that the voice data processing method provided by the embodiments of the present invention mainly applies to scenarios such as mobile phone calls, voice or video calls initiated through social software such as WeChat, and internet-based video conferences.
Referring to FIG. 1, an embodiment of the present invention provides a method for processing voice data, including:
S1: acquiring the current voice information of the local terminal through the voice acquisition device.
The current voice information comprises the speaker's voice and environmental noise.
S2: performing analog-to-digital conversion on the current voice information to generate a voice signal to be processed.
In this embodiment, an AD9125 dual-channel 16-bit bandwidth chip is used to perform analog-to-digital conversion on the current voice signal. The AD9125 is a dual-channel, 16-bit, high-dynamic-range TxDAC digital-to-analog converter (DAC) with a 1000 MSPS sampling rate, capable of generating multiple carriers up to the Nyquist frequency. Its features are optimized for direct-conversion transmit applications, including complex digital modulation and gain and offset compensation. The DAC output is optimized to interface seamlessly with analog quadrature modulators such as the ADI ADL537x F-MOD series. A 4-wire serial port interface allows many internal parameters to be programmed and read back. The full-scale output current can be programmed over the range 8.7 mA to 31.7 mA.
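The text does not say how samples are represented after step S2. As a rough illustration only, the sketch below quantizes floating-point "analog" samples to signed 16-bit codes, matching the 16-bit resolution mentioned above. The function names, the 16 kHz sample rate, and the normalized full-scale range are assumptions made for the example; nothing here models the AD9125 or any specific converter.

```python
import math


def sample_sine(freq_hz, sample_rate_hz, n_samples, amplitude=0.5):
    """Generate a sine tone as a stand-in for an analog voice waveform (assumption)."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]


def quantize_16bit(samples, full_scale=1.0):
    """Map samples in [-full_scale, full_scale] to signed 16-bit integer codes."""
    codes = []
    for x in samples:
        # Clip to the converter's input range before scaling.
        clipped = max(-full_scale, min(full_scale, x))
        codes.append(int(round(clipped / full_scale * 32767)))
    return codes


# Hypothetical example: 8 samples of a 440 Hz tone sampled at 16 kHz.
analog = sample_sine(freq_hz=440.0, sample_rate_hz=16000, n_samples=8)
digital = quantize_16bit(analog)
```

Inputs outside the full-scale range are clipped rather than wrapped, which is the usual behavior one would want before feeding samples to a denoising filter.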
S3: filtering and denoising the voice signal to be processed with an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.
It should be noted that the LMS (Least Mean Square) algorithm works as follows: the filter weight coefficients for the next moment are estimated from the filter's output error, the input signal, and the filter weight coefficients at the current moment, so that the mean square error between the desired output signal and the actual output signal is ultimately minimized. RLS (Recursive Least Squares) supports real-time online parameter estimation; it is a linear recursive estimator that minimizes the covariance of the current parameter estimate.
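The LMS principle described above can be sketched as a two-input adaptive noise canceller. This is a minimal illustration, not the patented implementation: the text does not specify where the filter's reference input comes from, so the sketch assumes a second channel (`reference`) that picks up noise correlated with the noise in the main channel (`primary`). The tap count, step size, and synthetic signals are all assumptions.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=8, step_size=0.01):
    """Return (cleaned, weights); cleaned[n] is the error e[n] = d[n] - w.x[n]."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - noise_estimate
        # LMS update: next weights = current weights + step * error * input.
        weights = [w + step_size * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights


def mean_square_error(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


# Synthetic test signals (assumptions): a tone as "speech", filtered white noise as interference.
rng = random.Random(0)
n_samples = 4000
speech = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(n_samples)]
reference = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
noise = [
    0.8 * reference[n] + (0.3 * reference[n - 1] if n > 0 else 0.0)
    for n in range(n_samples)
]
primary = [s + v for s, v in zip(speech, noise)]

cleaned, weights = lms_noise_canceller(primary, reference)

# Compare against the clean tone over the last 1000 samples, after convergence.
tail = slice(-1000, None)
error_before = mean_square_error(primary[tail], speech[tail])
error_after = mean_square_error(cleaned[tail], speech[tail])
```

The step size trades convergence speed against steady-state error: a larger value adapts faster but leaves more residual noise and can become unstable.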
S4: transmitting the voice information to be transmitted to the opposite terminal.
The voice data processing method of this embodiment uses the LMS or RLS adaptive filtering algorithm to filter and denoise the current voice information, so that the noise is removed and the voice information transmitted to the opposite terminal contains only the speaker's voice. This removes the influence of noise on voice data transmission and improves the transmission quality of the voice data.
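For the RLS option named in the method, a comparable sketch is given below. It uses the standard exponentially weighted RLS recursion and, as an assumption not stated in the text, a separate noise reference channel. The forgetting factor, initialization constant, tap count, and synthetic signals are illustrative choices only.

```python
import math
import random


def rls_noise_canceller(primary, reference, num_taps=4, forgetting=0.99, delta=0.01):
    """Return (cleaned, weights) using exponentially weighted recursive least squares."""
    weights = [0.0] * num_taps
    # Inverse correlation matrix estimate, initialised to (1/delta) * identity.
    p = [[(1.0 / delta if i == j else 0.0) for j in range(num_taps)]
         for i in range(num_taps)]
    history = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        px = [sum(p[i][j] * history[j] for j in range(num_taps))
              for i in range(num_taps)]
        denom = forgetting + sum(h * v for h, v in zip(history, px))
        gain = [v / denom for v in px]
        # A priori error: primary sample minus the current noise estimate.
        error = d - sum(w * h for w, h in zip(weights, history))
        weights = [w + g * error for w, g in zip(weights, gain)]
        # P <- (P - gain * (P x)^T) / forgetting; P stays symmetric.
        p = [[(p[i][j] - gain[i] * px[j]) / forgetting for j in range(num_taps)]
             for i in range(num_taps)]
        cleaned.append(error)
    return cleaned, weights


# Synthetic test signals (assumptions): a tone as "speech" plus filtered white noise.
rls_rng = random.Random(1)
rls_len = 2000
rls_speech = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(rls_len)]
rls_reference = [rls_rng.gauss(0.0, 1.0) for _ in range(rls_len)]
rls_primary = [
    rls_speech[n] + 0.8 * rls_reference[n]
    + (0.3 * rls_reference[n - 1] if n > 0 else 0.0)
    for n in range(rls_len)
]

rls_cleaned, rls_weights = rls_noise_canceller(rls_primary, rls_reference)


def tail_mse(a, b, count=500):
    return sum((u - v) ** 2 for u, v in zip(a[-count:], b[-count:])) / count


rls_error_before = tail_mse(rls_primary, rls_speech)
rls_error_after = tail_mse(rls_cleaned, rls_speech)
```

RLS typically converges in fewer samples than LMS at the cost of a matrix update per sample, which grows quadratically with the number of taps.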
Based on the same inventive concept, the embodiment of the invention provides a voice data processing system. As shown in FIG. 2, the system includes a home terminal, a voice collecting device, a voice data processing device, and an opposite terminal.
The voice collecting device can be a sound pickup or an array formed by a plurality of microphones which are arranged at equal intervals. For example, in a video training conference in which multiple persons participate, a plurality of equally spaced microphones may be used to collect current speech information.
Further, as shown in FIG. 3, in a preferred embodiment of the present invention, the voice data processing apparatus includes:
the acquisition module 10 is used for acquiring current voice information of a local terminal through voice acquisition equipment;
the processing module 11 is configured to perform filtering and denoising processing on the current voice information by using an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted;
and the transmission module 12 is configured to transmit the voice information to be transmitted to the opposite terminal.
Furthermore, the voice data processing device also comprises a preprocessing module which is used for carrying out analog-to-digital conversion on the current voice information and generating a voice signal to be processed;
the processing module is specifically configured to:
and filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.
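The module structure above (acquisition, preprocessing, processing, transmission) can be sketched as a small pipeline object. This is only an illustration of the data flow: the class name, method names, and stand-in callables are hypothetical, and the denoising step here is a pass-through placeholder rather than an LMS or RLS filter.

```python
class VoiceDataPipeline:
    """Wires the four modules described in the text into one processing pass."""

    def __init__(self, acquire, preprocess, denoise, transmit):
        self.acquire = acquire        # acquisition module
        self.preprocess = preprocess  # preprocessing module (analog-to-digital conversion)
        self.denoise = denoise        # processing module (LMS or RLS in the text)
        self.transmit = transmit      # transmission module

    def run_once(self):
        current_voice = self.acquire()
        signal_to_process = self.preprocess(current_voice)
        voice_to_transmit = self.denoise(signal_to_process)
        self.transmit(voice_to_transmit)
        return voice_to_transmit


# Stand-in modules (assumptions) so the sketch runs without audio hardware or a network.
def fake_acquire():
    return [0.0, 1.0, -1.0]


def fake_preprocess(samples):
    # Quantize normalized samples to signed 16-bit codes.
    return [int(round(max(-1.0, min(1.0, x)) * 32767)) for x in samples]


def placeholder_denoise(codes):
    # Placeholder only: returns the input unchanged instead of filtering it.
    return list(codes)


sent_frames = []
pipeline = VoiceDataPipeline(
    acquire=fake_acquire,
    preprocess=fake_preprocess,
    denoise=placeholder_denoise,
    transmit=sent_frames.append,
)
result = pipeline.run_once()
```

Passing each module in as a callable keeps the stages replaceable, so an LMS or RLS filter can be substituted for the placeholder without touching the other modules.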
Alternatively, in another preferred embodiment of the present invention, as shown in FIG. 4, the voice data processing apparatus may include: one or more processors 101, one or more input devices 102, one or more output devices 103, and memory 104, the processors 101, input devices 102, output devices 103, and memory 104 being interconnected via a bus 105. The memory 104 is used for storing a computer program comprising program instructions, and the processor 101 is configured to invoke the program instructions to perform the method described in the method embodiment above.
It should be understood that, in the embodiment of the present invention, the processor 101 may be a Central Processing Unit (CPU), or it may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general-purpose processor may be a microprocessor or any conventional processor.
The input device 102 may include a keyboard or the like, and the output device 103 may include a display (LCD or the like), a speaker, or the like.
The memory 104 may include read-only memory and random access memory, and provides instructions and data to the processor 101. A portion of the memory 104 may also include non-volatile random access memory. For example, the memory 104 may also store device type information.
In a specific implementation, the processor 101, the input device 102, and the output device 103 described in the embodiments of the present invention may execute the implementation manner described in the embodiments of the voice data processing method provided in the embodiments of the present invention, and are not described herein again.
It should be noted that, for a more specific workflow of the voice data processing system and the device, please refer to the foregoing method embodiment, which is not described herein again.
Accordingly, corresponding to the foregoing method embodiment and the speech data processing apparatus shown in FIG. 4, an embodiment of the present invention provides a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, implement the voice data processing method described above.
The computer readable storage medium may be an internal storage unit of the system according to any of the foregoing embodiments, for example, a hard disk or a memory of the system. The computer readable storage medium may also be an external storage device of the system, such as a plug-in hard drive, Smart Media Card (SMC), Secure Digital (SD) Card, Flash memory Card (Flash Card), etc. provided on the system. Further, the computer readable storage medium may also include both an internal storage unit and an external storage device of the system. The computer-readable storage medium is used for storing the computer program and other programs and data required by the system. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, can be embodied as a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for processing voice data, comprising:
acquiring current voice information of a local terminal through voice acquisition equipment;
filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted;
and transmitting the voice information to be transmitted to the opposite terminal.
2. The speech data processing method of claim 1, wherein the current speech information comprises speaker voice and ambient noise; the voice information to be transmitted only includes speaker voice.
3. The voice data processing method of claim 1, wherein obtaining the voice information to be transmitted specifically comprises:
performing analog-to-digital conversion on the current voice information and generating a voice signal to be processed;
and filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.
4. A speech data processing apparatus comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method according to any one of claims 1 to 3.
5. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-3.
6. A speech data processing apparatus, comprising:
the acquisition module is used for acquiring the current voice information of the local terminal through the voice acquisition equipment;
the processing module is used for filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted;
and the transmission module is used for transmitting the voice information to be transmitted to the opposite terminal.
7. The voice data processing apparatus of claim 6, further comprising a pre-processing module for performing analog-to-digital conversion on the current voice information and generating a voice signal to be processed;
the processing module is specifically configured to:
and filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.
8. The speech data processing apparatus of claim 7, wherein the pre-processing module comprises an analog-to-digital conversion chip, and the analog-to-digital conversion chip is an AD9125 two-channel 16-bit bandwidth chip.
9. A speech data processing system comprising a home terminal, speech acquisition means, speech data processing means and an opposite terminal, wherein the speech data processing means is as claimed in any one of claims 6 to 8.
10. The voice data processing system of claim 9, wherein the voice capture device comprises a microphone or an array of a plurality of equally spaced microphones.
CN202110476209.5A 2021-04-29 2021-04-29 Voice data processing method, device, storage medium and system Pending CN113299303A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110476209.5A CN113299303A (en) 2021-04-29 2021-04-29 Voice data processing method, device, storage medium and system

Publications (1)

Publication Number Publication Date
CN113299303A true CN113299303A (en) 2021-08-24

Family

ID=77321752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110476209.5A Pending CN113299303A (en) 2021-04-29 2021-04-29 Voice data processing method, device, storage medium and system

Country Status (1)

Country Link
CN (1) CN113299303A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination