CN113936698B - Audio data processing method and device and electronic equipment

Info

Publication number: CN113936698B
Application number: CN202111130355.9A
Authority: CN
Inventors: 田征绿; 宋志超
Original assignee: Du Xiaoman Technology Beijing Co Ltd
Current assignee: Beijing Duxiaoman Payment Technology Co ltd; Du Xiaoman Technology Beijing Co Ltd
Priority date: 2021-09-26
Filing date: 2021-09-26
Publication date: 2023-04-28
Anticipated expiration: 2041-09-26
Also published as: CN113936698A

Abstract

The invention discloses a method and device for processing audio data, and an electronic device. The method comprises the following steps: sampling the audio to be processed to obtain a plurality of audio data; acquiring first audio data from the plurality of audio data, and acquiring second audio data from the first audio data; splicing the first audio data and the second audio data to obtain third audio data; and performing noise reduction on the third audio data to obtain the noise-reduced audio. The invention solves the problem of resource waste caused by independent adaptation of different audio noise reduction algorithms in the prior art.

Description

Audio data processing method and device and electronic equipment

Technical Field

The present invention relates to the field of communications technologies, and in particular, to a method and an apparatus for processing audio data, and an electronic device.

Background

With the development of communication technology, people use audio data to transmit information more and more in work and life, wherein noise reduction processing on the audio data is a key for guaranteeing audio transmission quality in the transmission process of the audio data.

In the prior art, different audio noise reduction methods are often adopted according to different application scenes, and because the sampling rate, the sampling precision, the type and the length of the input audio data are different in the different audio noise reduction methods, in the actual audio data noise reduction processing process, the audio data are generally required to be split according to the different audio noise reduction methods and then are independently processed.

However, the audio data is split for different audio noise reduction methods and then is processed separately, and because of repeated steps and the defect of inextensibility, a large memory space is usually required to complete all the noise reduction processing of the audio data, so that the problem of memory space resource waste is caused.

In view of the above problems, no effective solution has been proposed at present.

Disclosure of Invention

The embodiment of the invention provides a processing method and device of audio data and electronic equipment, which at least solve the technical problem of resource waste caused by independent adaptation of different audio noise reduction algorithms in the prior art.

According to an aspect of an embodiment of the present invention, there is provided a method for processing audio data, including: sampling the audio to be processed to obtain a plurality of audio data, acquiring first audio data from the plurality of audio data, and acquiring second audio data from the first audio data, so that the first audio data and the second audio data are spliced to obtain third audio data, and further the third audio data are subjected to noise reduction to obtain the noise-reduced audio.

Further, the processing method of the audio data further comprises the following steps: sequentially storing a plurality of audio data into a first buffer area, sequentially reading the first audio data from the first buffer area, reading second audio data from the first audio data, and storing the second audio data into a second buffer area.

Further, the processing method of the audio data further comprises the following steps: and when the second audio data is not stored in the second buffer area, reading the first audio data with the first length from the first buffer area to obtain the second audio data, wherein the first length is the maximum length of the data stored in the second buffer area, and determining that the second audio data is the third audio data.

Further, the processing method of the audio data further comprises the following steps: when second audio data is stored in the second buffer area, detecting the second length of the second audio data, calculating the difference between the first length and the second length to obtain a third length, wherein the first length is the maximum length of the data stored in the second buffer area, reading the first audio data of the first length from the plurality of audio data, simultaneously, reading the first audio data of the third length from the first audio data of the first length, further, performing splicing processing on the second audio data and the first audio data of the third length to obtain third audio data, reading the first audio data of the second length from the first audio data of the first length, and storing the first audio data of the second length in the first storage area.

Further, the processing method of the audio data further comprises the following steps: and when the first audio data does not exist in the first buffer area and the length of the second audio data stored in the second buffer area is smaller than the first length, performing noise reduction processing on the second audio data, and updating audio to be processed based on the second audio data after the noise reduction processing.

Further, the processing method of the audio data further comprises the following steps: and carrying out noise reduction processing on at least part of audio in the third audio data to obtain noise-reduced audio, and determining residual audio data in the third audio data, wherein the residual audio data are audio data which are not subjected to noise reduction processing in the third audio data, so that when the length of the residual audio data is greater than or equal to the first length, the residual audio data are subjected to noise reduction processing.

Further, the processing method of the audio data further comprises the following steps: and determining a noise reduction algorithm for carrying out noise reduction processing on the third audio data, determining a frame length corresponding to the noise reduction algorithm, and further determining a first length according to the frame length, wherein the first length is the maximum length of the data stored in the second cache region.

According to another aspect of the embodiment of the present invention, there is also provided an apparatus for processing audio data, including: the sampling module is used for sampling the audio to be processed to obtain a plurality of audio data; an acquisition module for acquiring first audio data from the plurality of audio data and acquiring second audio data from the first audio data; the splicing module is used for carrying out splicing processing on the first audio data and the second audio data to obtain third audio data; and the processing module is used for carrying out noise reduction processing on the third audio data to obtain noise-reduced audio.

According to another aspect of the embodiment of the present invention, there is also provided an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of processing audio data.

According to another aspect of the embodiments of the present invention, there is also provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the above-described audio data processing method.

According to another aspect of embodiments of the present invention, there is also provided a computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the above-mentioned method of processing audio data.

In the embodiment of the invention, a mode of carrying out noise reduction processing on a plurality of audio data through two buffer areas is adopted, the audio to be processed is sampled to obtain a plurality of audio data, first audio data are obtained from the plurality of audio data, second audio data are obtained from the first audio data, and therefore the first audio data and the second audio data are spliced to obtain third audio data, and noise reduction processing is carried out on the third audio data to obtain the noise-reduced audio.

In the above process, the present disclosure reads, stores and splices a plurality of audio data respectively by using two different buffer areas, so that unified noise reduction processing can be performed on the plurality of audio data, which avoids the problem of memory resource waste caused by that each audio data needs to be processed separately when different audio data appear, thereby reducing the complexity of the noise reduction processing of the audio data and realizing the effect of saving the memory resource space.

Therefore, the scheme provided by the disclosure achieves the purpose of uniformly denoising a plurality of audio data, thereby realizing the technical effect of saving the memory resource space, and further solving the problem of resource waste caused by independent adaptation of different audio denoising algorithms in the prior art.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:

FIG. 1 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 2 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 3 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 4 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 5 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 6 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 7 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 8 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 9 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 10 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 11 is a flow chart of an alternative method of processing audio data according to an embodiment of the invention;

FIG. 12 is a schematic diagram of an alternative apparatus for processing audio data according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

According to an embodiment of the present invention, there is provided an embodiment of a method of processing audio data, it being noted that the steps shown in the flowcharts of the drawings may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from that herein.

In addition, it should be further noted that an electronic device having an audio data processing function may be used as the execution subject of the method provided in the embodiments of the present disclosure, where the electronic device includes, but is not limited to: computer devices such as notebook computers, desktop computers and servers, and audio processing devices such as smartphones and microphones.

Fig. 1 is a flowchart of an alternative audio data processing method according to an embodiment of the present invention, as shown in fig. 1, the method includes the steps of:

Step S102, sampling the audio to be processed to obtain a plurality of audio data.

In step S102, the audio to be processed may include a plurality of audio data, where a sampling rate, a sampling precision, a type, and a length may be different among the plurality of audio data. Audio data includes, but is not limited to: audio data such as voice data generated in a voice conversation and voice data generated in a video conversation.

Step S104, acquiring first audio data from the plurality of audio data, and acquiring second audio data from the first audio data.

In step S104, two storage areas may be set up before the first audio data is acquired. As shown in fig. 3, the first storage area holds more data than the second storage area: the first storage area may be a 2-frame cache area, and the second storage area may be a cache area for the frame of data to be processed, so the length of the data stored in the first storage area is twice that of the second storage area. (A frame is the unit of audio processing; it is generally measured in time, with common lengths of 5 ms, 10 ms, 20 ms, etc.) The second storage area may be referred to as cache (computer cache) 1, and the first storage area may be referred to as cache2. Cache1 may be used to splice a complete frame of audio data, that is, to splice the first audio data and the second audio data into the third audio data, on which the electronic device can then perform noise reduction processing. Cache2 may be used to cooperate with cache1 during noise reduction by caching at least part of the audio data to be processed, that is, the first audio data. In addition, cache2 may be used to hold audio data when, during noise reduction, the sum of the less-than-one-frame remainder of the previous batch and the remainder of the current batch exceeds the length of one frame.
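As an illustration of the sizing described above, the following minimal sketch converts a frame duration into a number of sampling points and derives the capacities of the two storage areas from it. The function names `frame_samples` and `buffer_capacities` are hypothetical and do not come from the patent; only the 2:1 ratio between the storage areas is taken from the text.

```python
def frame_samples(sample_rate_hz: int, frame_ms: int) -> int:
    """Number of sampling points in one frame of the given duration."""
    return sample_rate_hz * frame_ms // 1000


def buffer_capacities(frame_len: int) -> tuple[int, int]:
    """Return (first storage area, second storage area) capacities in samples."""
    # Assumption taken from the text: the first storage area holds two frames
    # and the second storage area holds the one frame being processed.
    return 2 * frame_len, frame_len


frame_len = frame_samples(16000, 10)  # a 10 ms frame sampled at 16000 Hz
first_cap, second_cap = buffer_capacities(frame_len)
```

Under these assumptions the two storage areas together hold three frames, whatever frame duration is chosen.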

Optionally, the electronic device may obtain the first audio data from the plurality of audio data. Fig. 4 shows the initial state of audio data processing, obtained after the electronic device samples the audio to be processed, in which the sampling points of the plurality of audio data are 1-15 and A, B, C. The first audio data is at least part of the plurality of audio data and may be stored in the first buffer area; as shown in fig. 5, the electronic device may store the three sampling points A, B, C of the plurality of audio data in the first buffer area. It should be noted that, because audio data processing is a cyclic process, a general intermediate state of the processing may be used as the start state, and the actual start and end are only special cases of this general state. Inputting audio data and denoising it batch by batch is likewise repeated, so the state in which the previous batch has finished processing is also the state before the next batch is processed. Thus, fig. 5 takes the state after one normal internal loop of the audio data processing as the initial state of the flow, i.e., 1-15 has all the noise reduction processing completed.

Further, before performing the noise reduction process on the plurality of audio data, the electronic device may acquire the second audio data from the first audio data and store it in the second buffer area. It should be noted that the second audio data may be data that the electronic device reads directly from the plurality of audio data into the second buffer area, or data that the electronic device reads from the first audio data in the first buffer area. For example, as shown in fig. 6, the electronic device reads A, B, C in the first buffer area into the second buffer area.

It should be noted that, by acquiring the second audio data from the first audio data, a plurality of audio data of different lengths can be processed uniformly by moving them through the first buffer area in order, which saves memory space resources.

Step S106, performing splicing processing on the first audio data and the second audio data to obtain third audio data.

In step S106, the electronic device may splice the first audio data and the second audio data in the second buffer area. The second audio data is the audio data stored in the second buffer area; it may be data that the electronic device reads directly from the plurality of audio data into the second buffer area, or data that the electronic device reads from the first audio data in the first buffer area. After reading from the first storage area, the electronic device may splice the first audio data and the second audio data that it has read to obtain the third audio data. As shown in fig. 7, A, B, C in the second buffer area is first audio data read from the first buffer area by the electronic device, and 1 and 2 are data read directly from the plurality of audio data into the second buffer area; the electronic device splices the two to obtain the third audio data.

Further, as shown in fig. 7, the electronic device reads the audio data in a fixed order. For example, in fig. 7, the electronic device first reads A, B, C in the first buffer area, then reads audio data 1 and 2 from the plurality of audio data and stores them in the remaining length of the second buffer area. The electronic device also buffers the audio data still to be processed, for example 3, 4 and 5, in the first buffer area, so that this audio data is not overwritten by the noise reduction result from the second buffer area before it has itself been noise reduced. Each time new audio data is acquired from the plurality of audio data, its length is the maximum length of the data stored in the second buffer area.

Through the above process, the plurality of audio data are read, stored and spliced using two different cache areas, so that the plurality of audio data can be subjected to unified noise reduction processing. The two cache areas form a compact and complete structure, which saves memory space resources.

Step S108, noise reduction processing is carried out on the third audio data, and noise-reduced audio is obtained.

Optionally, as shown in fig. 8, the electronic device performs noise reduction processing on the third audio data in the second buffer area, and the processing result, produced according to a preset processing function, may be stored in the second buffer area. Before the noise reduction processing, as shown in fig. 2, an operator may set, on the electronic device, the noise reduction processing function and a custom conversion method for the data type of the audio sampling points. The embodiment of the disclosure uses two caches with a total length of three frames, that is, the first cache area and the second cache area. The audio data to be processed is read into the two cache areas in a cyclic reading order and is noise reduced segment by segment until all of it has been processed, after which the audio data is output.

Optionally, as shown in fig. 9, after performing the noise reduction processing on the third audio data in the second buffer area, the electronic device copies the noise-reduced result back into the plurality of audio data, so that the positions that held 1, 2, 3, 4, 5 in the plurality of audio data now hold A', B', C', 1', 2'.

Further, as shown in fig. 10, the electronic device reads audio data 3, 4, 5, which is in the first buffer area and has not yet been subjected to the noise reduction processing, into the second buffer area, fills the remaining length of the second buffer area with audio data that has not yet been read from the plurality of audio data to be processed, and then performs the noise reduction processing on the audio data in the second buffer area again.

Through the process, the data conversion function and the noise reduction function are set, the noise reduction processing process of the audio data is abstracted, the audio data is suitable for being uniformly realized on various platforms, and the reading process of the audio data is uniformly processed by using two cache areas with different sizes, so that the effect of saving the memory space is realized.
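The cycle walked through in figs. 5 to 10 can be sketched as follows. This is a minimal illustration rather than the patented implementation: `process_batch` and `mark_denoised` are hypothetical names, `mark_denoised` is a placeholder that only tags each sampling point instead of performing real noise reduction, and the sketch assumes that the batch length is a multiple of the frame length and that the carried-over data is shorter than one frame.

```python
def mark_denoised(frame):
    """Placeholder for a real noise reduction function: tags each sample."""
    return [f"{s}'" for s in frame]


def process_batch(batch, carry, frame_len, denoise):
    """Denoise `batch` in place, frame by frame; return the new carry-over.

    `carry` plays the role of the data left in the first buffer area by the
    previous batch (A, B, C in the figures).
    """
    assert len(carry) < frame_len and len(batch) % frame_len == 0
    first_buffer = list(carry)
    read_pos = write_pos = 0
    while read_pos < len(batch):
        # Move the pending samples into the second buffer area.
        second_buffer, first_buffer = first_buffer, []
        # Read one frame of new samples from the batch.
        chunk = batch[read_pos:read_pos + frame_len]
        read_pos += frame_len
        # Fill the second buffer area; stash the rest in the first buffer
        # area so the in-place write below cannot overwrite it.
        need = frame_len - len(second_buffer)
        second_buffer += chunk[:need]
        first_buffer = chunk[need:]
        # Denoise the spliced frame and copy the result back into the batch.
        batch[write_pos:write_pos + frame_len] = denoise(second_buffer)
        write_pos += frame_len
    return first_buffer


batch = list(range(1, 16))  # sampling points 1-15
new_carry = process_batch(batch, ["A", "B", "C"], 5, mark_denoised)
```

In this sketch the output lags the input by the length of the carry-over, so the last three sampling points of the batch are returned as the carry-over for the next batch instead of being written back.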

Based on the schemes defined in step S102 to step S108, the present disclosure performs noise reduction processing on a plurality of audio data through two buffer areas: the audio to be processed is sampled to obtain a plurality of audio data; first audio data, which is at least part of the plurality of audio data, is sequentially read from the first buffer area; second audio data is acquired from the first audio data; the first audio data and the second audio data are spliced to obtain third audio data; and noise reduction processing is performed on the third audio data to obtain the noise-reduced audio.

It is easy to note that in the above process, the present disclosure reads, stores, and splices a plurality of audio data respectively by using two cache areas with different sizes, so that unified noise reduction processing can be performed on the plurality of audio data, which avoids the problem of memory resource waste caused by that each audio data needs to be separately processed when different audio data appear, and further reduces the complexity of the noise reduction processing of the audio data, and achieves the effect of saving the memory resource space.

In an alternative embodiment, the electronic device sequentially stores the plurality of audio data in the first buffer area, sequentially reads the first audio data from the first buffer area, further reads the second audio data from the first audio data, and stores the second audio data in the second buffer area.

Optionally, the electronic device may sequentially store a plurality of audio data in the first buffer area, as shown in fig. 4, where the electronic device samples the audio to be processed to obtain an initial state of audio data processing, where sampling points of the plurality of audio data are 1-15 and A, B, C. The electronic device sequentially stores A, B, C in the first buffer area, so that in a subsequent reading process, the electronic device can sequentially read the first audio data from the first buffer area. The electronic device may also read second audio data from the first audio data and store the second audio data in the second buffer area. For example, as shown in fig. 6, the electronic device reads A, B, C in the first buffer area into the second buffer area.

Through the process, the plurality of audio data are sequentially stored in the first buffer area and the second buffer area according to the reading sequence, so that the accuracy of the audio data with different lengths in the noise reduction processing process and the integrity of the audio data are ensured.

In an alternative embodiment, when the electronic device does not store the second audio data in the second buffer area, the electronic device reads the first audio data with the first length from the first buffer area to obtain the second audio data, where the first length is the maximum length of the data stored in the second buffer area, and determines that the second audio data is the third audio data.

Optionally, when the electronic device detects that no data is stored in the second buffer area, it may read first audio data from the first buffer area, where the length read is the maximum length of the data stored in the second buffer area, that is, the first length. For example, when the maximum length of the data stored in the second buffer area is 5 and no data is stored in the second buffer area, the electronic device may directly read first audio data of length 5 from the first buffer area to obtain the second audio data and, without any splicing, determine the second audio data to be the third audio data.
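The empty-buffer case above amounts to taking one full frame without splicing. A minimal sketch, assuming the first buffer area is modelled as a Python list and using the hypothetical name `take_frame_when_empty`:

```python
def take_frame_when_empty(first_buffer, first_length):
    """Second buffer area is empty: read `first_length` samples directly.

    Returns (third_audio, rest_of_first_buffer). No splicing is needed,
    so the data read is used as the third audio data as it stands.
    """
    third_audio = first_buffer[:first_length]
    rest = first_buffer[first_length:]
    return third_audio, rest


third_audio, rest = take_frame_when_empty([1, 2, 3, 4, 5, 6, 7], 5)
```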

In an alternative embodiment, when the electronic device stores the second audio data in the second buffer area, the electronic device detects the second length of the second audio data, calculates a difference between the first length and the second length to obtain a third length, where the first length is the maximum length of the second buffer area stored data, reads the first audio data of the first length from the plurality of audio data, reads the first audio data of the third length from the first audio data of the first length, and performs a splicing process on the second audio data and the first audio data of the third length to obtain the third audio data, further reads the first audio data of the second length from the first audio data of the first length, and stores the first audio data of the second length in the first storage area.

Optionally, as shown in fig. 6, when the electronic device detects that second audio data, for example A, B, C, is stored in the second buffer area, it detects that the second length of this second audio data is 3. The electronic device then calculates the difference between the first length and the second length: with a first length of 5 in fig. 6, the difference is 5 - 3 = 2, that is, the third length is 2. As shown in fig. 7, the electronic device then reads first audio data of length 5, for example 1-5, from the plurality of audio data, divides it, and splices the part of length 2 (1, 2) after the second audio data (A, B, C) to obtain the third audio data. At the same time, the remaining part of the first audio data (3, 4, 5) is stored in the first storage area.

Through the above process, a plurality of audio data of different lengths can be processed uniformly by moving them through the first buffer area in order, which saves memory space resources.
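The length arithmetic of this embodiment can be checked with a short sketch. The function name `splice_frame` and the list-based representation are assumptions made for illustration; the example values are those of figs. 6 and 7.

```python
def splice_frame(second_buffer, audio, first_length):
    """Splice a full frame when the second buffer area already holds data.

    Returns (third_audio, remainder), where `remainder` is the part of the
    newly read data that goes to the first storage area.
    """
    second_length = len(second_buffer)
    third_length = first_length - second_length  # room left in the frame
    chunk = audio[:first_length]                 # first audio data, first length
    third_audio = second_buffer + chunk[:third_length]
    remainder = chunk[third_length:]             # has length second_length
    return third_audio, remainder


third_audio, remainder = splice_frame(["A", "B", "C"], list(range(1, 16)), 5)
```

With a first length of 5 and a second length of 3, the third length is 2, so the spliced frame is A, B, C, 1, 2 and the remainder 3, 4, 5 has the second length.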

In an alternative embodiment, when the first audio data does not exist in the first buffer area, and the length of the second audio data stored in the second buffer area is smaller than the first length, the electronic device performs noise reduction processing on the second audio data, and performs update processing on audio to be processed based on the second audio data after the noise reduction processing, where the first length is the maximum length of the data stored in the second buffer area.

Optionally, as shown in fig. 11, the first buffer area holds no first audio data and the second buffer area holds only audio data 16, 17 and 18, whose length of 3 is smaller than the first length, that is, the maximum length of the data stored in the second buffer area. In this case, the electronic device may directly perform noise reduction processing on 16, 17 and 18 in the second buffer area and update the audio to be processed once the processing is complete.

It should be noted that when the audio data is processed to the end, the second buffer area may hold second audio data shorter than the first length while the first buffer area holds no first audio data. Through the above process, this remaining audio data is still subjected to noise reduction processing, which ensures that the noise reduction of the audio data is complete.
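The end-of-stream condition above can be expressed as a small guard. This is a sketch under assumptions: `flush_tail` and `mark_denoised` are hypothetical names, and `mark_denoised` is a placeholder that tags each sample rather than a real noise reduction function. Whether a real algorithm would need the short frame padded to full length is not stated in the text and is not modelled here.

```python
def mark_denoised(frame):
    """Placeholder for a real noise reduction function: tags each sample."""
    return [f"{s}'" for s in frame]


def flush_tail(first_buffer, second_buffer, first_length, denoise):
    """Denoise a trailing partial frame, or return None if not applicable."""
    # Applies only when nothing is pending in the first buffer area and the
    # second buffer area holds fewer samples than one full frame.
    if not first_buffer and 0 < len(second_buffer) < first_length:
        return denoise(second_buffer)
    return None


tail = flush_tail([], [16, 17, 18], 5, mark_denoised)
not_yet = flush_tail([19, 20], [16, 17, 18], 5, mark_denoised)
```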

In an optional embodiment, the electronic device performs noise reduction processing on at least part of audio in the third audio data to obtain noise-reduced audio, determines remaining audio data in the third audio data, where the remaining audio data is audio data in the third audio data that is not noise-reduced, and performs noise reduction processing on the remaining audio data when a length of the remaining audio data is greater than or equal to a first length, where the first length is a maximum length of the second buffer area storage data.

Optionally, when the electronic device processes multiple batches of audio data, the length of the remaining audio data may be greater than or equal to the first length, for example when the data remaining after noise reduction of two batches exceeds the length of one frame. In that case one more frame of noise-reduced audio data must be produced before an overflow occurs, that is, before the space of the input audio data becomes insufficient to hold the combined length of the output audio data and the remaining audio data. This situation may only occur at the end of processing the previous batch of data. The electronic device then performs noise reduction processing on the remaining audio data, which subsequently returns the flow to its initial state.

Through the process, the problem that the space of the input audio data is insufficient to receive the sum of the lengths of the output audio data and the residual audio data is avoided, and the stability of the noise reduction processing process of the audio data is improved.

In an optional embodiment, the electronic device determines a noise reduction algorithm for performing noise reduction processing on the third audio data, determines a frame length corresponding to the noise reduction algorithm, and further determines a first length according to the frame length, where the first length is a maximum length of the data stored in the second buffer area.

Optionally, the electronic device first determines a noise reduction algorithm for noise reduction of the third audio data, where the noise reduction algorithm includes, but is not limited to: the Opus (an audio format for real-time voice transmission standardized by the Internet Engineering Task Force) noise reduction algorithm, which is suitable for low-latency online use; WebRTC (a system that lets web browsers carry real-time voice or video conversations); the FFmpeg (a set of open-source computer programs that can record, convert, and stream digital audio and video) noise reduction algorithm; and the NLM algorithm (non-local means noise reduction, an algorithm that computes a weighted average of the actual data using the similarity of neighboring data and the distance between two points). Moreover, the frame lengths used by the various algorithms differ. For example, the Opus noise reduction method uses a sampling rate of 8000 Hz or 16000 Hz with an input frame of 480, while the WebRTC speech noise reduction algorithm uses only a sampling rate of 16000 Hz with an input frame of 160. Therefore, the electronic device needs to determine the frame length of the noise reduction algorithm and, from it, the length of the data stored in the second buffer area, that is, the first length. As shown in fig. 3, if the frame length corresponding to the current noise reduction algorithm is 5, the length of the second buffer area is also 5.

It should be noted that, through the above process, the lengths of the stored data in the two different buffer areas are set according to the noise reduction algorithm, so that unified noise reduction processing can be performed on a plurality of audio data, and the effect of saving the memory space resources is achieved.
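The selection of the first length from the noise reduction algorithm can be sketched as a simple lookup. The sketch below is in Python; the table `FRAME_LENGTHS` and the function `first_length_for` are hypothetical names, and the table contains only the two frame lengths quoted in the description (480 for Opus, 160 for WebRTC). It does not call any real Opus or WebRTC API.

```python
# Frame lengths quoted in the description; other algorithms would need
# their own entries (values for them are not given in the text).
FRAME_LENGTHS = {
    "opus": 480,
    "webrtc": 160,
}


def first_length_for(algorithm: str) -> int:
    """Return the first length (capacity of the second buffer area)."""
    try:
        return FRAME_LENGTHS[algorithm.lower()]
    except KeyError:
        raise ValueError(f"no frame length known for {algorithm!r}") from None


# Usage: size the second buffer area for the chosen algorithm.
second_buffer_capacity = first_length_for("WebRTC")
```

Keeping the frame length in one place means the buffering and splicing logic does not change when a different noise reduction algorithm is selected.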

As can be seen from the above, according to the present disclosure, two buffer areas of different sizes are used to read, store, and splice a plurality of audio data, so that unified noise reduction processing can be performed on the plurality of audio data. This avoids the waste of memory resources that arises when each kind of audio data has to be processed separately, reduces the complexity of the noise reduction processing of the audio data, and saves memory space.

Example 2

There is further provided, according to an embodiment of the present disclosure, an embodiment of a processing apparatus for audio data, where fig. 12 is a schematic diagram of an apparatus for processing audio data according to embodiment 2 of the present disclosure, the apparatus including: the sampling module 1201 is configured to sample audio to be processed to obtain a plurality of audio data; an acquisition module 1203 configured to acquire first audio data from the plurality of audio data, and acquire second audio data from the first audio data; the splicing module 1205 is configured to splice the first audio data and the second audio data to obtain third audio data; the processing module 1207 is configured to perform noise reduction processing on the third audio data, so as to obtain noise-reduced audio.

It should be noted that the sampling module 1201, the acquisition module 1203, the splicing module 1205, and the processing module 1207 correspond to steps S102 to S108 in the above embodiment. The examples and application scenarios implemented by the four modules are the same as those of the corresponding steps, but are not limited to what is disclosed in embodiment 1 above.

Optionally, the acquisition module further includes a first storage module, a reading module, and a second storage module. The first storage module is used for sequentially storing a plurality of audio data into the first buffer area; the reading module is used for sequentially reading the first audio data from the first buffer area; and the second storage module is used for reading the second audio data from the first audio data and storing the second audio data into the second buffer area.

Optionally, the above-mentioned splicing module further includes a first reading module and a determining module. The first reading module is used for reading first audio data of a first length from the first buffer area to obtain second audio data when no second audio data is stored in the second buffer area, wherein the first length is the maximum length of the data stored in the second buffer area; and the determining module is used for determining the second audio data to be the third audio data.

Optionally, the above-mentioned splicing module further includes a detection module, a calculation module, a second reading module, a third reading module, a first splicing module, and a fourth reading module. The detection module is used for detecting a second length of the second audio data when the second audio data is stored in the second buffer area; the calculation module is used for calculating the difference between the first length and the second length to obtain a third length, wherein the first length is the maximum length of the data stored in the second buffer area; the second reading module is used for reading first audio data of the first length from the plurality of audio data; the third reading module is used for reading first audio data of the third length from the first audio data of the first length; the first splicing module is used for splicing the second audio data and the first audio data of the third length to obtain third audio data; and the fourth reading module is used for reading first audio data of the second length from the first audio data of the first length and storing the first audio data of the second length into the first storage area.
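The splicing step performed by these modules can be sketched in Python as follows. The function name `splice_frame` and its parameters are hypothetical. The sketch returns the leftover samples to the caller rather than choosing where they are stored, since that choice belongs to the surrounding buffer management.

```python
from typing import List, Tuple


def splice_frame(second_audio: List[float],
                 first_audio: List[float],
                 first_length: int) -> Tuple[List[float], List[float]]:
    """Build one frame (third audio data) from leftover and new samples.

    Returns (third_audio, leftover), where leftover is the part of the
    newly read chunk that did not fit into the frame.
    """
    second_length = len(second_audio)             # detected second length
    if second_length > first_length:
        raise ValueError("second audio data exceeds the first length")
    third_length = first_length - second_length   # difference of lengths
    chunk = first_audio[:first_length]            # first audio data, first length
    head = chunk[:third_length]                   # first audio data, third length
    third_audio = second_audio + head             # spliced frame
    leftover = chunk[third_length:]               # up to second_length samples
    return third_audio, leftover
```

When `second_audio` is empty, the third length equals the first length, so the chunk read from the first audio data becomes the third audio data directly, which matches the case handled by the first reading module and the determining module.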

Optionally, the audio data processing device further includes an updating module, which is used for performing noise reduction processing on the second audio data when no first audio data exists in the first buffer area and the length of the second audio data stored in the second buffer area is smaller than the first length, and for updating the audio to be processed based on the noise-reduced second audio data.

Optionally, the processing module further includes a first processing module, a first determining module, and a second processing module. The first processing module is used for performing noise reduction processing on at least part of the audio in the third audio data to obtain noise-reduced audio; the first determining module is used for determining remaining audio data in the third audio data, wherein the remaining audio data is the audio data in the third audio data that has not been subjected to noise reduction processing; and the second processing module is used for performing noise reduction processing on the remaining audio data when the length of the remaining audio data is greater than or equal to the first length.

Optionally, the audio data processing device further includes a second determining module, a third determining module, and a fourth determining module. The second determining module is used for determining a noise reduction algorithm for performing noise reduction processing on the third audio data; the third determining module is used for determining a frame length corresponding to the noise reduction algorithm; and the fourth determining module is used for determining a first length according to the frame length, wherein the first length is the maximum length of the data stored in the second buffer area.

Example 3

According to another aspect of the embodiments of the present disclosure, there is also provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the audio data processing method of embodiment 1 described above.

Example 4

According to another aspect of the embodiments of the present disclosure, there is also provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the audio data processing method in the above-described embodiment 1.

Example 5

According to another aspect of the embodiments of the present disclosure, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the audio data processing method in embodiment 1 described above.

The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.

In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis. For any part that is not described in detail in a given embodiment, reference may be made to the related descriptions of the other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed technical content may be implemented in other manners. The above-described apparatus embodiments are merely exemplary. For example, the division of the units may be a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or take other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media capable of storing program code.

The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of the present invention.

Claims

1. A method of processing audio data, comprising:

sampling the audio to be processed to obtain a plurality of audio data;

acquiring first audio data from the plurality of audio data, and acquiring second audio data from the first audio data;

performing splicing processing on the first audio data and the second audio data to obtain third audio data;

and carrying out noise reduction processing on the third audio data to obtain noise-reduced audio.

2. The method of claim 1, wherein obtaining first audio data from the plurality of audio data and obtaining second audio data from the first audio data comprises:

sequentially storing the plurality of audio data into a first buffer area;

sequentially reading the first audio data from the first buffer area;

and reading the second audio data from the first audio data and storing the second audio data into a second buffer area.

3. The method of claim 2, wherein the splicing the first audio data and the second audio data to obtain third audio data comprises:

when the second audio data is not stored in the second cache area, reading first audio data with a first length from the first cache area to obtain the second audio data, wherein the first length is the maximum length of the second cache area storage data;

and determining the second audio data as the third audio data.

4. The method of claim 2, wherein the splicing the first audio data and the second audio data to obtain third audio data comprises:

detecting a second length of the second audio data when the second audio data is stored in the second buffer area;

calculating a difference value between the first length and the second length to obtain a third length, wherein the first length is the maximum length of the data stored in the second cache area;

reading first audio data of the first length from the plurality of audio data;

reading the first audio data of the third length from the first audio data of the first length;

performing splicing processing on the second audio data and the first audio data with the third length to obtain third audio data;

and reading the first audio data of the second length from the first audio data of the first length, and storing the first audio data of the second length into a first storage area.

5. The method according to claim 2, wherein the method further comprises:

and when the first audio data does not exist in the first buffer area and the length of the second audio data stored in the second buffer area is smaller than the first length, performing noise reduction processing on the second audio data, and updating the audio to be processed based on the second audio data after the noise reduction processing.

6. The method of claim 2, wherein performing noise reduction processing on the third audio data to obtain noise-reduced audio comprises:

carrying out noise reduction treatment on at least part of audio in the third audio data to obtain the noise-reduced audio;

determining remaining audio data in the third audio data, wherein the remaining audio data is audio data which is not subjected to noise reduction processing in the third audio data;

and when the length of the residual audio data is greater than or equal to the first length, carrying out noise reduction processing on the residual audio data.

7. The method according to claim 2, wherein the method further comprises:

determining a noise reduction algorithm for performing noise reduction processing on the third audio data;

determining the frame length corresponding to the noise reduction algorithm;

and determining a first length according to the frame length, wherein the first length is the maximum length of the data stored in the second cache area.

8. An apparatus for processing audio data, comprising:

the sampling module is used for sampling the audio to be processed to obtain a plurality of audio data;

an acquisition module for acquiring first audio data from the plurality of audio data and second audio data from the first audio data;

the splicing module is used for carrying out splicing processing on the first audio data and the second audio data to obtain third audio data;

and the processing module is used for carrying out noise reduction processing on the third audio data to obtain noise-reduced audio.

9. An electronic device, comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of processing audio data according to any one of claims 1 to 7.

10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the method of processing audio data according to any one of claims 1 to 7.