CN113299303A - Voice data processing method, device, storage medium and system

Info

Publication number: CN113299303A
Application number: CN202110476209.5A
Authority: CN
Inventors: 乔雪娜; 陈乐�
Original assignee: Pingdingshan Juxin Network Technology Co ltd
Current assignee: Pingdingshan Juxin Network Technology Co ltd
Priority date: 2021-04-29
Filing date: 2021-04-29
Publication date: 2021-08-24

Abstract

The embodiment of the invention discloses a voice data processing method, a voice data processing device, a storage medium and a voice data processing system. The method comprises the following steps: acquiring current voice information of a local terminal through voice acquisition equipment; filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted; and transmitting the voice information to be transmitted to the opposite terminal. By implementing the embodiment of the invention, the LMS or RLS adaptive filtering algorithm is adopted to filter and reduce noise of the current voice information, so that the noise is eliminated, and the voice information needing to be transmitted to the opposite end only comprises the voice of a speaker, thereby eliminating the influence of the noise on the voice data transmission and further improving the transmission quality of the voice data.

Description

Voice data processing method, device, storage medium and system

Technical Field

The invention relates to the technical field of data processing, and in particular to a voice data processing method, device, storage medium and system.

Background

In existing scenarios such as mobile phone voice calls and Internet-based video conferences, noise often degrades the transmission quality of voice data, resulting in a poor user experience.

Disclosure of Invention

In view of the technical defects in the prior art, embodiments of the present invention provide a method, an apparatus, a storage medium, and a system for processing voice data.

In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a method for processing voice data, including:

acquiring current voice information of a local terminal through voice acquisition equipment;

filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted;

and transmitting the voice information to be transmitted to the opposite terminal.

Wherein, the current voice information comprises speaker voice and environmental noise; the voice information to be transmitted only includes speaker voice.

In some preferred embodiments of the present application, obtaining the voice information to be transmitted specifically includes:

performing analog-to-digital conversion on the current voice information and generating a voice signal to be processed;

and filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.

In a second aspect, an embodiment of the present invention provides a speech data processing apparatus, including a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is used to store a computer program, and the computer program includes program instructions, and the processor is configured to call the program instructions to execute the method of the first aspect.

In a third aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, where the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the method of the first aspect.

In a fourth aspect, an embodiment of the present invention provides another speech data processing apparatus, including:

the acquisition module is used for acquiring the current voice information of the local terminal through the voice acquisition equipment;

the processing module is used for filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted;

and the transmission module is used for transmitting the voice information to be transmitted to the opposite terminal.

In some preferred embodiments of the present application, the apparatus further includes a preprocessing module, configured to perform analog-to-digital conversion on the current speech information and generate a speech signal to be processed;

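the processing module is specifically configured to filter and denoise the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.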

The preprocessing module comprises an analog-to-digital conversion chip, and the analog-to-digital conversion chip is an AD9125 dual-channel, 16-bit chip.

In a fifth aspect, an embodiment of the present invention further provides a voice data processing system, which includes a local terminal, a voice acquisition device, a voice data processing apparatus, and an opposite terminal. The voice data processing apparatus is as described above.

In certain preferred embodiments of the present application, the voice acquisition device includes a microphone or an array of a plurality of equally spaced microphones.

By implementing the embodiment of the invention, the LMS or RLS adaptive filtering algorithm is adopted to filter and reduce noise of the current voice information, so that the noise is eliminated, and the voice information needing to be transmitted to the opposite end only comprises the voice of a speaker, thereby eliminating the influence of the noise on the voice data transmission and further improving the transmission quality of the voice data.

Drawings

In order to more clearly illustrate the detailed description of the invention or the technical solutions in the prior art, the drawings that are needed in the detailed description of the invention or the prior art will be briefly described below.

FIG. 1 is a flow chart of a method for processing voice data according to an embodiment of the present invention;

FIG. 2 is a block diagram of a voice data processing system provided by an embodiment of the present invention;

FIG. 3 is a schematic diagram of an architecture of the speech data processing apparatus of FIG. 2;

FIG. 4 is another schematic structural diagram of the speech data processing apparatus of FIG. 2.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It should be noted that the voice scenarios to which the voice data processing method provided by the embodiment of the present invention applies mainly include: mobile phone calls, voice or video calls initiated through social software such as WeChat, Internet-based video conferences, and the like.

Referring to fig. 1, an embodiment of the present invention provides a method for processing voice data, including:

S1, acquiring the current voice information of the local terminal through the voice acquisition equipment.

Wherein the current voice information comprises speaker voice and environmental noise.

S2, performing analog-to-digital conversion on the current voice information and generating a voice signal to be processed.

In this embodiment, an AD9125 dual-channel, 16-bit chip is adopted to perform analog-to-digital conversion on the current voice information. The AD9125 is a dual-channel, 16-bit, high dynamic range TxDAC digital-to-analog converter (DAC) that provides a sample rate of 1000 MSPS, permitting multicarrier generation up to the Nyquist frequency. It includes features optimized for direct-conversion transmit applications, including complex digital modulation and gain and offset compensation. The DAC outputs are optimized to interface seamlessly with analog quadrature modulators, such as the ADL537x F-MOD series from Analog Devices. A 4-wire serial port interface allows many internal parameters to be programmed and read back. The full-scale output current can be programmed over a range of 8.7 mA to 31.7 mA.
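As a rough illustration of what step S2 produces, the following sketch maps floating-point samples to signed 16-bit codes and back. It is a generic uniform quantizer written under assumptions, not a model of the AD9125 or of the embodiment's hardware; the names quantize_16bit, dequantize_16bit and the full_scale parameter are hypothetical.

```python
def quantize_16bit(samples, full_scale=1.0):
    """Map float samples to signed 16-bit integer codes (hypothetical helper)."""
    codes = []
    for x in samples:
        # Clip to the assumed input range before scaling.
        clipped = max(-full_scale, min(full_scale, x))
        # Scale to the symmetric 16-bit range [-32767, 32767].
        codes.append(int(round(clipped / full_scale * 32767)))
    return codes


def dequantize_16bit(codes, full_scale=1.0):
    """Convert 16-bit codes back to floats for later digital filtering."""
    return [c / 32767 * full_scale for c in codes]
```

The resulting integer codes stand in for the "voice signal to be processed" that step S3 then filters.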

S3, filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.

It should be noted that the principle of the LMS (Least Mean Square) algorithm is as follows: the filter weight coefficients for the next moment are estimated from the output error of the filter, the input signal and the filter weight coefficients at the current moment, so that the mean square error between the expected output signal and the actual output signal is finally minimized. RLS (Recursive Least Squares) can perform real-time online parameter estimation, and is a linear recursive estimator that minimizes the covariance of the current parameter estimates.
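The LMS principle described above can be sketched as follows. This is a minimal illustration, assuming a two-input noise-cancelling arrangement in which a separate noise-only reference signal is available; the source does not specify the filter structure, and the function names, tap count and step size are hypothetical choices.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=4, step_size=0.005):
    """Two-input LMS noise canceller (assumed structure, not from the source).

    primary:   speaker voice plus environmental noise
    reference: noise-only signal correlated with the noise in `primary`
    Returns the cleaned samples and the final filter weights.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        # Filter output: current estimate of the noise inside the primary signal.
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        # Output error: what remains after removing the estimated noise.
        error = d - noise_estimate
        # LMS update: next weights from current weights, error and input samples.
        weights = [w + 2.0 * step_size * error * h
                   for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights


def make_demo_signals(num_samples=4000, seed=0):
    """Synthetic data: a sine 'voice' plus noise passed through a 2-tap path."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    voice = [0.5 * math.sin(2.0 * math.pi * 0.01 * k)
             for k in range(num_samples)]
    primary = []
    for k in range(num_samples):
        previous = noise[k - 1] if k > 0 else 0.0
        primary.append(voice[k] + 0.8 * noise[k] + 0.3 * previous)
    return voice, primary, noise
```

Calling lms_noise_canceller(primary, noise) on the synthetic data returns the error sequence as the cleaned signal. A smaller step size converges more slowly but leaves less residual noise once the weights have settled.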

S4, transmitting the voice information to be transmitted to the opposite terminal.

The voice data processing method of the embodiment of the invention adopts the LMS or RLS adaptive filtering algorithm to filter and reduce noise of the current voice information, eliminates noise, and enables the voice information needing to be transmitted to the opposite end to only comprise the voice of a speaker, thereby eliminating the influence of the noise on the voice data transmission and further improving the transmission quality of the voice data.
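For the RLS option named in the method, a comparable sketch is given below. It uses the standard exponentially weighted RLS recursion in the same assumed two-input noise-cancelling arrangement (a noise-only reference signal), which the source does not spell out; the function names, forgetting factor and initial scale are hypothetical.

```python
import random


def rls_noise_canceller(primary, reference, num_taps=4,
                        forgetting=0.999, init_scale=100.0):
    """Two-input RLS noise canceller (assumed structure, not from the source)."""
    n = num_taps
    weights = [0.0] * n
    # P approximates the inverse correlation matrix of the reference input.
    P = [[init_scale if i == j else 0.0 for j in range(n)] for i in range(n)]
    history = [0.0] * n  # recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        Px = [sum(P[i][j] * history[j] for j in range(n)) for i in range(n)]
        denom = forgetting + sum(h * p for h, p in zip(history, Px))
        gain = [p / denom for p in Px]
        # A priori error: primary sample minus the current noise estimate.
        error = d - sum(w * h for w, h in zip(weights, history))
        weights = [w + g * error for w, g in zip(weights, gain)]
        # Recursive update of P (P is symmetric, so x^T P equals Px transposed).
        P = [[(P[i][j] - gain[i] * Px[j]) / forgetting for j in range(n)]
             for i in range(n)]
        cleaned.append(error)
    return cleaned, weights


def demo_weights(num_samples=500, seed=1):
    """Identify a 2-tap noise path from synthetic noise-only data."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    primary = [0.8 * noise[k] + (0.3 * noise[k - 1] if k > 0 else 0.0)
               for k in range(num_samples)]
    _, weights = rls_noise_canceller(primary, noise)
    return weights
```

Compared with the LMS update, RLS typically converges in far fewer samples, at the cost of updating a num_taps by num_taps matrix for every sample.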

Based on the same inventive concept, the embodiment of the invention provides a voice data processing system. As shown in fig. 2, the system includes a local terminal, a voice acquisition device, a voice data processing apparatus, and an opposite terminal.

The voice acquisition device can be a microphone or an array formed by a plurality of microphones arranged at equal intervals. For example, in a video training conference in which multiple persons participate, a plurality of equally spaced microphones may be used to collect the current voice information.

Further, as shown in fig. 3, in a preferred embodiment of the present invention, the voice data processing apparatus includes:

the acquisition module 10 is used for acquiring current voice information of a local terminal through voice acquisition equipment;

the processing module 11 is configured to perform filtering and denoising processing on the current voice information by using an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted;

and the transmission module 12 is configured to transmit the voice information to be transmitted to the opposite terminal.
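The three modules listed above can be pictured as a simple pipeline. The sketch below is only one possible software arrangement; the class names, method names and callable parameters are hypothetical, and the denoising function passed in is a stand-in for an LMS or RLS filter.

```python
from typing import Callable, List, Sequence


class AcquisitionModule:
    """Module 10: obtains current voice information from a capture source."""

    def __init__(self, capture: Callable[[], Sequence[float]]):
        self._capture = capture

    def acquire(self) -> List[float]:
        return list(self._capture())


class ProcessingModule:
    """Module 11: applies a denoising function (stand-in for LMS or RLS)."""

    def __init__(self, denoise: Callable[[List[float]], List[float]]):
        self._denoise = denoise

    def process(self, samples: List[float]) -> List[float]:
        return self._denoise(samples)


class TransmissionModule:
    """Module 12: hands the processed voice information to a sender."""

    def __init__(self, send: Callable[[List[float]], None]):
        self._send = send

    def transmit(self, samples: List[float]) -> None:
        self._send(samples)


class VoiceDataProcessingApparatus:
    """Chains acquisition, processing and transmission in that order."""

    def __init__(self, acquisition: AcquisitionModule,
                 processing: ProcessingModule,
                 transmission: TransmissionModule):
        self.acquisition = acquisition
        self.processing = processing
        self.transmission = transmission

    def run_once(self) -> List[float]:
        current = self.acquisition.acquire()
        to_transmit = self.processing.process(current)
        self.transmission.transmit(to_transmit)
        return to_transmit
```

Passing the capture, denoise and send behaviours in as callables keeps each module independent, so a different filter or transport can be swapped in without touching the others.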

Furthermore, the voice data processing device also comprises a preprocessing module which is used for carrying out analog-to-digital conversion on the current voice information and generating a voice signal to be processed;

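the processing module is specifically configured to filter and denoise the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.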

Alternatively, in another preferred embodiment of the present invention, as shown in fig. 4, the voice data processing apparatus may include: one or more processors 101, one or more input devices 102, one or more output devices 103, and a memory 104, the processors 101, input devices 102, output devices 103, and memory 104 being interconnected via a bus 105. The memory 104 is used for storing a computer program comprising program instructions, and the processor 101 is configured to invoke the program instructions to perform the method described in the foregoing method embodiment.

It should be understood that, in the embodiment of the present invention, the processor 101 may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.

The input device 102 may include a keyboard or the like, and the output device 103 may include a display (LCD or the like), a speaker, or the like.

The memory 104 may include read-only memory and random access memory, and provides instructions and data to the processor 101. A portion of the memory 104 may also include non-volatile random access memory. For example, the memory 104 may also store device type information.

In a specific implementation, the processor 101, the input device 102, and the output device 103 described in the embodiments of the present invention may execute the implementation manner described in the embodiments of the voice data processing method provided in the embodiments of the present invention, and are not described herein again.

It should be noted that, for a more specific workflow of the voice data processing system and the device, please refer to the foregoing method embodiment, which is not described herein again.

Accordingly, corresponding to the foregoing method embodiment and the speech data processing apparatus shown in fig. 4, an embodiment of the present invention provides a computer-readable storage medium storing a computer program, the computer program comprising program instructions that, when executed by a processor, implement the voice data processing method described above.

The computer readable storage medium may be an internal storage unit of the system according to any of the foregoing embodiments, for example, a hard disk or a memory of the system. The computer readable storage medium may also be an external storage device of the system, such as a plug-in hard drive, Smart Media Card (SMC), Secure Digital (SD) Card, Flash memory Card (Flash Card), etc. provided on the system. Further, the computer readable storage medium may also include both an internal storage unit and an external storage device of the system. The computer-readable storage medium is used for storing the computer program and other programs and data required by the system. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.

Those of ordinary skill in the art will appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, computer software, or combinations of both, and that the components and steps of the examples have been described above generally in terms of their functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division of the units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of another form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

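1. A method for processing voice data, comprising:

acquiring current voice information of a local terminal through voice acquisition equipment;

filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain voice information to be transmitted;

and transmitting the voice information to be transmitted to the opposite terminal.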

2. The speech data processing method of claim 1, wherein the current speech information comprises speaker voice and ambient noise; the voice information to be transmitted only includes speaker voice.

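3. The voice data processing method of claim 1, wherein obtaining the voice information to be transmitted specifically comprises:

performing analog-to-digital conversion on the current voice information and generating a voice signal to be processed;

and filtering and denoising the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.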

4. A speech data processing apparatus comprising a processor, an input device, an output device and a memory, the processor, the input device, the output device and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method according to any one of claims 1 to 3.

5. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-3.

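6. A speech data processing apparatus, comprising:

the acquisition module is used for acquiring the current voice information of the local terminal through the voice acquisition equipment;

the processing module is used for filtering and denoising the current voice information by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted;

and the transmission module is used for transmitting the voice information to be transmitted to the opposite terminal.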

7. The voice data processing apparatus of claim 6, further comprising a pre-processing module for performing analog-to-digital conversion on the current voice information and generating a voice signal to be processed;

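the processing module is specifically configured to filter and denoise the voice signal to be processed by adopting an LMS or RLS adaptive filtering algorithm to obtain the voice information to be transmitted.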

8. The speech data processing apparatus of claim 7, wherein the pre-processing module comprises an analog-to-digital conversion chip, and the analog-to-digital conversion chip is an AD9125 dual-channel, 16-bit chip.

9. A speech data processing system comprising a home terminal, speech acquisition means, speech data processing means and an opposite terminal, wherein the speech data processing means is as claimed in any one of claims 6 to 8.

10. The voice data processing system of claim 9, wherein the voice capture device comprises a microphone or an array of a plurality of equally spaced microphones.