CN113571093A

CN113571093A - Audio data processing method

Info

Publication number: CN113571093A
Application number: CN202110849986.XA
Authority: CN
Inventors: 陈元有; 周婵贞; 钟小艳
Original assignee: Individual
Current assignee: Individual
Priority date: 2021-07-27
Filing date: 2021-07-27
Publication date: 2021-10-29

Abstract

The application provides an audio data processing method and relates to the technical field of audio data processing. In the method, audio data to be processed, sent by a communicatively connected audio acquisition device, is first acquired; the audio data to be processed comprises multiple audio frames to be processed, and the audio acquisition device is deployed in a target area and collects information from a sound source in the target area to obtain the audio frames to be processed. Next, whether the multiple audio frames to be processed need to be screened is determined based on a predetermined judgment rule. Then, if it is determined that the multiple audio frames to be processed need to be screened, they are screened based on a pre-configured screening rule to obtain corresponding target audio frames. Based on this method, the problem of poor audio data processing effect in the prior art can be solved.

Description

Audio data processing method

Technical Field

The application relates to the technical field of audio data processing, in particular to an audio data processing method.

Background

Audio data has many fields of application, so after audio data is collected by an audio acquisition device, it can be transmitted to whichever device needs it. However, the inventors found through research that in the prior art the collected audio data is used directly, which results in a poor audio data processing effect.

Disclosure of Invention

In view of the above, an objective of the present application is to provide an audio data processing method to solve the problem of poor audio data processing effect in the prior art.

In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:

an audio data processing method applied to an audio data processing device, the audio data processing method comprising:

obtaining audio data to be processed sent by a communicatively connected audio acquisition device, wherein the audio data to be processed comprises multiple audio frames to be processed, and the audio acquisition device is deployed in a target area and used for collecting information from a sound source in the target area to obtain the audio frames to be processed;

determining, based on a predetermined judgment rule, whether the multiple audio frames to be processed included in the audio data to be processed need to be screened;

and if the multiple audio frames to be processed included in the audio data to be processed need to be screened, screening the multiple audio frames to be processed based on a pre-configured screening rule to obtain corresponding target audio frames.

In a possible embodiment, in the audio data processing method, the step of acquiring audio data to be processed sent by an audio acquisition device connected in communication includes:

judging whether original audio data sent by audio acquisition equipment in communication connection is received, wherein the original audio data is obtained by carrying out information acquisition on a sound source in the target area on the basis of the audio acquisition equipment and comprises multiple frames of original audio frames;

if the original audio data sent by the audio acquisition equipment are received, verifying the original audio data to determine the legality of the original audio data;

if the legality of the original audio data does not meet a preset legal condition, discarding the original audio data;

if the legality of the original audio data meets a preset legal condition, determining the original audio data as audio data to be processed, and taking each frame of the original audio frame included in the original audio data as an audio frame to be processed.

In a possible embodiment, in the audio data processing method, if the original audio data sent by the audio acquisition device is received, the step of performing verification processing on the original audio data to determine the validity of the original audio data includes:

judging whether an audio data packet comprising the original audio data carries target identification information or not;

if the audio data packet does not carry the target identification information, determining that the legality of the original audio data does not meet a preset legal condition;

if the audio data packet carries the target identification information, identifying the target identification information to obtain equipment identity information of the audio acquisition equipment;

and verifying the original audio data based on the equipment identity information and a predetermined legal equipment identity set, wherein if the equipment identity information belongs to the legal equipment identity set, the legality of the original audio data is determined to meet a preset legal condition, and if the equipment identity information does not belong to the legal equipment identity set, the legality of the original audio data is determined not to meet the preset legal condition.

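In a possible embodiment, in the audio data processing method, if the original audio data sent by the audio acquisition device is received, the step of performing verification processing on the original audio data to determine the validity of the original audio data includes:

if the original audio data sent by the audio acquisition equipment is received, carrying out data volume statistical processing on the original audio data to obtain a corresponding audio data volume;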

determining a size relation between the audio data volume and a predetermined data volume threshold interval, wherein the data volume threshold interval is generated based on data volume threshold configuration operation of the audio data processing equipment responding to a corresponding user;

if the audio data volume belongs to the data volume threshold interval, determining that the legality of the original audio data meets a preset legal condition;

and if the audio data volume does not belong to the data volume threshold interval, determining that the legality of the original audio data does not meet a preset legal condition.

In a possible embodiment, in the audio data processing method, the step of determining whether the multiple frames of audio frames to be processed included in the audio data to be processed need to be subjected to the screening processing based on a predetermined judgment rule includes:

carrying out data volume statistical processing on the received original audio data sent by the audio acquisition equipment to obtain corresponding audio data volume;

determining a size relation between the audio data amount and a first predetermined data amount threshold, wherein the first data amount threshold is generated based on a first data amount threshold configuration operation of the audio data processing equipment in response to a corresponding user;

and if the audio data volume is larger than the first data volume threshold, determining that the multiple audio frames to be processed included in the audio data to be processed need to be screened.

In a possible embodiment, in the audio data processing method, the step of determining whether the multiple frames of audio frames to be processed included in the audio data to be processed need to be subjected to a screening process based on a predetermined judgment rule further includes:

and if the audio data volume is less than or equal to the first data volume threshold, determining that the multi-frame audio frame to be processed included in the audio data to be processed does not need to be subjected to screening processing.

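In a possible embodiment, in the audio data processing method, the step of determining whether the multiple frames of audio frames to be processed included in the audio data to be processed need to be subjected to a screening process based on a predetermined judgment rule further includes:

if the audio data volume is less than or equal to the first data volume threshold, determining the number of frames of the multiple audio frames to be processed included in the audio data to be processed to obtain a corresponding first frame number, and determining the time length of the multiple audio frames to be processed to obtain a corresponding first duration;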

determining a corresponding first acquisition frequency based on the first frame number and the first time length;

determining a magnitude relation between the first acquisition frequency and a predetermined acquisition frequency threshold, wherein the acquisition frequency threshold is generated based on acquisition frequency threshold configuration operation of the audio data processing equipment in response to a corresponding user;

if the first acquisition frequency is less than or equal to the acquisition frequency threshold, determining that the multi-frame audio frame to be processed included in the audio data to be processed does not need to be subjected to screening processing;

and if the first acquisition frequency is greater than the acquisition frequency threshold, determining that the multi-frame to-be-processed audio frame included in the to-be-processed audio data needs to be screened.

In a possible embodiment, in the audio data processing method, if it is determined that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data need to be filtered, the step of filtering the multiple frames of to-be-processed audio frames based on a preset filtering rule to obtain corresponding target audio frames includes:

if it is determined that the plurality of frames of audio frames to be processed included in the audio data to be processed need to be screened, obtaining the audio frames screened during historical screening to obtain corresponding historical audio frames;

and screening the plurality of frames of audio frames to be processed based on the historical audio frames to obtain corresponding target audio frames.

In a possible embodiment, in the audio data processing method, if it is determined that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data need to be subjected to the filtering processing, the step of obtaining the audio frames that are filtered out in the historical filtering processing to obtain corresponding historical audio frames includes:

and if it is determined that the multiple audio frames to be processed included in the audio data to be processed need to be screened, obtaining the audio frames selected in the most recent historical screening process to obtain the corresponding historical audio frames.

In a possible embodiment, in the audio data processing method, the step of performing a filtering process on the multiple frames of audio frames to be processed based on the historical audio frames to obtain corresponding target audio frames includes:

sorting the multiple audio frames to be processed in order of acquisition time, from earliest to latest, to obtain a corresponding sequence of audio frames to be processed;

and segmenting the audio frame sequence to be processed to obtain a plurality of corresponding audio frame sequence segments to be processed, and performing de-duplication screening on the audio frames to be processed included in each audio frame sequence segment to be processed to obtain corresponding target audio frames.

According to the audio data processing method provided by the application, after the audio data to be processed sent by the communicatively connected audio acquisition device is obtained, whether the audio data to be processed needs to be screened is first determined based on the predetermined judgment rule; only when screening is determined to be needed is the audio data to be processed screened based on the pre-configured screening rule to obtain the corresponding target audio frames. Compared with conventional technical schemes that either never perform screening or always perform it directly, adding a mechanism that judges whether to screen alleviates, to a certain extent, the poor audio data processing effect of the prior art: performing no screening can leave too much data, while screening directly can distort the data considerably.

In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

Fig. 1 is a schematic flowchart of an audio data processing method according to an embodiment of the present application.

Fig. 2 is a schematic flow chart of step 130 in fig. 1.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

An embodiment of the application provides an audio data processing device. The audio data processing device may comprise a memory and a processor.

In detail, the memory and the processor are electrically connected directly or indirectly to realize data transmission or interaction. For example, they may be electrically connected to each other via one or more communication buses or signal lines. The memory can have stored therein at least one software function (computer program) which can be present in the form of software or firmware. The processor may be configured to execute the executable computer program stored in the memory, so as to implement the audio data processing method provided by the embodiment of the present application (as described later).

Alternatively, the memory may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), a System on Chip (SoC), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

It will be appreciated that in an alternative example, the audio data processing device may be a server with data processing capabilities. For example, in one alternative example, the audio data processing device may be configured to:

obtaining audio data to be processed sent by a communicatively connected audio acquisition device, wherein the audio data to be processed comprises multiple audio frames to be processed, and the audio acquisition device is deployed in a target area and used for collecting information from a sound source in the target area to obtain the audio frames to be processed; determining, based on a predetermined judgment rule, whether the multiple audio frames to be processed included in the audio data to be processed need to be screened; and if the multiple audio frames to be processed need to be screened, screening them based on a pre-configured screening rule to obtain corresponding target audio frames.

As shown in fig. 1, an embodiment of the present application further provides an audio data processing method, which is applicable to the audio data processing device. The method steps defined by the flow of the audio data processing method can be implemented by the audio data processing device.

The specific process shown in FIG. 1 will be described in detail below.

Step 110: acquiring audio data to be processed sent by a communicatively connected audio acquisition device.

In this embodiment, the audio data processing device may first acquire to-be-processed audio data sent by an audio acquisition device connected in communication.

The audio data to be processed may include multiple frames of audio frames to be processed, and the audio acquisition device may be deployed in a target area and configured to acquire information of a sound source (person, object, device, or the like) in the target area to obtain the audio frames to be processed.

Step 120: determining, based on a predetermined judgment rule, whether the multiple audio frames to be processed included in the audio data to be processed need to be screened.

In this embodiment, after the audio data to be processed is acquired based on step 110, the audio data processing apparatus may determine whether the multiple frames of audio frames to be processed included in the audio data to be processed need to be subjected to a screening process based on a predetermined determination rule.

If it is determined that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data need to be filtered, step 130 may be executed. If it is determined that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data do not need to be subjected to screening processing, the multiple frames of to-be-processed audio frames may all be used as target audio frames.

Step 130: screening the multiple audio frames to be processed based on a pre-configured screening rule to obtain corresponding target audio frames.

In this embodiment, after determining that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data need to be subjected to the filtering processing based on step 120, the audio data processing apparatus may perform the filtering processing on the multiple frames of to-be-processed audio frames based on a pre-configured filtering rule, so that the corresponding target audio frame may be obtained.

Based on the method, after the audio data to be processed sent by the communicatively connected audio acquisition device is acquired, whether the audio data to be processed needs to be screened is first determined based on the predetermined judgment rule; only when screening is determined to be needed is the audio data to be processed screened based on the pre-configured screening rule to obtain the corresponding target audio frames. Compared with conventional technical schemes that either never perform screening or always perform it directly, adding a mechanism that judges whether to screen alleviates, to a certain extent, the poor audio data processing effect of the prior art: performing no screening can leave too much data, while screening directly can distort the data.
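The relationship between steps 110 to 130 can be illustrated with a minimal sketch. The function name `process_audio` and the two callables `needs_screening` and `screen` are hypothetical placeholders for the judgment rule and the screening rule; the application does not prescribe any particular programming interface.

```python
from typing import Callable, List, Sequence

Frame = bytes  # hypothetical representation of one audio frame to be processed


def process_audio(
    frames: Sequence[Frame],
    needs_screening: Callable[[Sequence[Frame]], bool],
    screen: Callable[[Sequence[Frame]], List[Frame]],
) -> List[Frame]:
    """Return the target audio frames for already-acquired frames (step 110)."""
    # Step 120: apply the predetermined judgment rule.
    if needs_screening(frames):
        # Step 130: apply the pre-configured screening rule.
        return screen(frames)
    # No screening needed: every frame to be processed is a target frame.
    return list(frames)
```

Keeping the judgment rule and the screening rule as separate callables mirrors the point made above: the decision whether to screen is taken before any frame is dropped.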

In the first aspect, it should be noted that, in step 110, the audio data to be processed sent by the communicatively connected audio capture device may be acquired based on the following steps:

first, judging whether original audio data sent by a communicatively connected audio acquisition device is received, wherein the original audio data is obtained by the audio acquisition device collecting information from a sound source in the target area and comprises multiple original audio frames;

second, if the original audio data sent by the audio acquisition device is received, verifying the original audio data to determine the legality of the original audio data;

third, if the legality of the original audio data does not meet a preset legal condition, discarding the original audio data;

and fourth, if the legality of the original audio data meets the preset legal condition, determining the original audio data as the audio data to be processed, and taking each original audio frame included in the original audio data as an audio frame to be processed.
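The four steps above amount to a receive, verify, then accept-or-discard gate. A minimal sketch follows, assuming a hypothetical `is_legal` callable that stands in for whichever verification is configured, and using `None` to signal both "nothing received" and "discarded".

```python
from typing import Callable, List, Optional


def accept_raw_audio(
    raw_frames: Optional[List[bytes]],
    is_legal: Callable[[List[bytes]], bool],
) -> Optional[List[bytes]]:
    """Return the audio frames to be processed, or None if there is nothing to process."""
    if raw_frames is None:
        # First step: no original audio data was received.
        return None
    if not is_legal(raw_frames):
        # Third step: the preset legal condition is not met, so discard the data.
        return None
    # Fourth step: each original audio frame becomes an audio frame to be processed.
    return list(raw_frames)
```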

It will be appreciated that in an alternative example, the original audio data may be subjected to a verification process to determine the validity of the original audio data based on the following steps:

first, if the original audio data sent by the audio acquisition device is received, it is determined whether an audio data packet including the original audio data carries target identification information (that is, in a legal process, the audio acquisition device packages the original audio data together with a piece of target identification information to form the audio data packet, and a legal audio acquisition device can acquire or be configured with the correct target identification information in advance);

secondly, if the audio data packet does not carry the target identification information, determining that the legality of the original audio data does not meet a preset legal condition;

then, if the audio data packet carries the target identification information, performing identification processing on the target identification information to obtain equipment identity information of the audio acquisition equipment (the equipment identity information can be unique information such as equipment fingerprints);

and finally, verifying the original audio data based on the equipment identity information and a predetermined legal equipment identity set, wherein if the equipment identity information belongs to the legal equipment identity set, the legality of the original audio data is determined to meet a preset legal condition, and if the equipment identity information does not belong to the legal equipment identity set, the legality of the original audio data is determined not to meet the preset legal condition.
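The identity-based verification can be sketched as follows. The `AudioPacket` structure and the `decode_identity` helper are assumptions made for illustration only; the application does not specify the packet layout or how the target identification information maps to the equipment identity information.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass
class AudioPacket:
    """Hypothetical audio data packet: payload plus optional identification."""
    payload: bytes
    target_id: Optional[str] = None  # target identification information, if carried


def decode_identity(target_id: str) -> str:
    """Hypothetical identification step: normalise the identifier into a device identity."""
    return target_id.strip().lower()


def packet_is_legal(
    packet: AudioPacket,
    legal_identities: Set[str],
    decode: Callable[[str], str] = decode_identity,
) -> bool:
    """Check the packet against the predetermined legal equipment identity set."""
    if packet.target_id is None:
        # The packet carries no target identification information.
        return False
    device_identity = decode(packet.target_id)
    # Legal only if the identity belongs to the legal equipment identity set.
    return device_identity in legal_identities
```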

It is understood that, in another alternative example, the original audio data may also be subjected to a verification process based on the following steps to determine the validity of the original audio data:

firstly, if the original audio data sent by the audio acquisition equipment is received, carrying out data volume statistical processing on the original audio data to obtain corresponding audio data volume;

secondly, determining a size relation between the audio data volume and a predetermined data volume threshold interval, wherein the data volume threshold interval is generated based on data volume threshold configuration operation performed by the audio data processing device in response to a corresponding user, and it can be understood that a legal audio acquisition device can acquire or configure a correct data volume threshold interval in advance;

then, if the audio data volume belongs to the data volume threshold interval, determining that the legality of the original audio data meets a preset legal condition;

and finally, if the audio data volume does not belong to the data volume threshold interval, determining that the legality of the original audio data does not meet a preset legal condition.
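The data-volume check can be sketched in a few lines. This sketch assumes that the audio data volume is measured as the total number of bytes across frames and that the threshold interval is closed at both ends; the application fixes neither choice.

```python
from typing import Sequence


def volume_is_legal(frames: Sequence[bytes], low: int, high: int) -> bool:
    """Check whether the audio data volume falls inside the threshold interval."""
    # Data volume statistics: total size in bytes (an assumption of this sketch).
    volume = sum(len(frame) for frame in frames)
    # Closed interval [low, high], as configured by the user (assumed closed).
    return low <= volume <= high
```

For instance, with `low=4` and `high=8`, two frames of three bytes each (six bytes in total) would pass, while a single two-byte frame would be discarded.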

In the second aspect, it should be noted that, when step 120 is executed, it may be determined whether the filtering process needs to be performed on the multiple frames of audio frames to be processed included in the audio data to be processed based on the following steps:

firstly, carrying out data volume statistical processing on received original audio data (as described above) sent by the audio acquisition equipment to obtain corresponding audio data volume;

secondly, determining the size relation between the audio data volume and a first data volume threshold value which is predetermined, wherein the first data volume threshold value is generated based on the first data volume threshold value configuration operation of the audio data processing equipment responding to a corresponding user;

then, if the audio data amount is greater than the first data amount threshold, it is determined that the multiple frames of audio frames to be processed included in the audio data to be processed need to be subjected to screening processing.

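It is understood that, in an alternative example, when step 120 is executed, whether the multiple frames of audio frames to be processed included in the audio data to be processed need to be subjected to the filtering process may further be determined based on the following step:

if the audio data volume is less than or equal to the first data volume threshold, determining that the multiple frames of audio frames to be processed included in the audio data to be processed do not need to be subjected to screening processing.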

It is understood that, in another alternative example, when the step 120 is executed, it may further be determined whether the multiple frames of audio frames to be processed included in the audio data to be processed need to be subjected to the filtering process based on the following steps:

first, if the audio data volume is less than or equal to the first data volume threshold, determining the number of frames of the multiple audio frames to be processed included in the audio data to be processed to obtain a corresponding first frame number, and determining the time length of the multiple audio frames to be processed (i.e. the difference between the acquisition time of the first audio frame to be processed and the acquisition time of the last audio frame to be processed) to obtain a corresponding first duration;

second, determining a corresponding first acquisition frequency based on the first frame number and the first duration (for example, the first frame number is divided by the first duration to obtain the first acquisition frequency);

thirdly, determining the magnitude relation between the first acquisition frequency and a predetermined acquisition frequency threshold, wherein the acquisition frequency threshold is generated based on the acquisition frequency threshold configuration operation of the audio data processing equipment responding to a corresponding user;

fourthly, if the first acquisition frequency is less than or equal to the acquisition frequency threshold (indicating that the repetition degree between adjacent audio frames is possibly not high), determining that the multi-frame audio frames to be processed included in the audio data to be processed do not need to be screened;

and fifthly, if the first acquisition frequency is greater than the acquisition frequency threshold (indicating that the repetition degree between adjacent audio frames is possibly high), determining that the multi-frame audio frames to be processed included in the audio data to be processed need to be screened.
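The two-stage judgment (data volume first, acquisition frequency second) can be sketched as below. The frame representation, the byte-based volume measure, and the handling of fewer than two frames or a zero duration are assumptions of this sketch, since the application does not address those cases.

```python
from typing import Sequence, Tuple

TimedFrame = Tuple[float, bytes]  # (acquisition time in seconds, frame data); hypothetical layout


def needs_screening(
    frames: Sequence[TimedFrame],
    volume_threshold: int,
    frequency_threshold: float,
) -> bool:
    """Decide whether the audio frames to be processed need to be screened."""
    # Stage 1: compare the audio data volume with the first data volume threshold.
    volume = sum(len(data) for _, data in frames)
    if volume > volume_threshold:
        return True
    # Stage 2: first frame number divided by first duration.
    if len(frames) < 2:
        return False  # assumption: frequency is undefined, so do not screen
    times = [t for t, _ in frames]
    duration = max(times) - min(times)  # first frame to last frame
    if duration <= 0:
        return False  # assumption: avoid division by zero
    frequency = len(frames) / duration
    # A high acquisition frequency suggests adjacent frames may repeat each other.
    return frequency > frequency_threshold
```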

In the third aspect, it should be noted that, when step 130 is executed, the multiple frames of audio frames to be processed may be subjected to a filtering process based on a pre-configured filtering rule based on the following steps to obtain corresponding target audio frames (as shown in steps 131 and 132 of fig. 2):

step 131, if it is determined that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data need to be screened, obtaining the audio frames screened during historical screening to obtain corresponding historical audio frames;

and step 132, screening the multiple frames of audio frames to be processed based on the historical audio frames to obtain corresponding target audio frames.

It is understood that in an alternative example, when performing step 131, the historical audio frames may be obtained based on the following steps:

if it is determined that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data need to be subjected to screening processing, obtaining the audio frames selected in the most recent historical screening process to obtain the corresponding historical audio frames (which may be one frame or multiple frames).
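One way to make the most recent screening result available is a small history store, sketched below. The class name `ScreeningHistory` and its methods are hypothetical, and the sketch assumes that the historical audio frames are the frames retained by the previous screening run.

```python
from typing import List, Sequence


class ScreeningHistory:
    """Hypothetical store of the frames selected by each past screening run."""

    def __init__(self) -> None:
        self._runs: List[List[bytes]] = []

    def record(self, selected_frames: Sequence[bytes]) -> None:
        """Remember the frames selected by a completed screening run."""
        self._runs.append(list(selected_frames))

    def latest(self) -> List[bytes]:
        """Return the frames from the most recent run (one or more), or an empty list."""
        return list(self._runs[-1]) if self._runs else []
```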

It is understood that, in an alternative example, when the step 132 is executed, the multiple frames of audio frames to be processed may be subjected to a filtering process based on the historical audio frames to obtain corresponding target audio frames based on the following steps:

firstly, based on the acquisition time of each audio frame to be processed, sorting the multiple audio frames to be processed in order of time from earliest to latest to obtain a corresponding audio frame sequence to be processed, that is, in the audio frame sequence to be processed, the acquisition time of each audio frame to be processed is earlier than that of the next audio frame to be processed;

secondly, the audio frame sequence to be processed is segmented to obtain a plurality of corresponding audio frame sequence segments to be processed, and the audio frames to be processed included in each audio frame sequence segment to be processed are subjected to de-duplication screening processing to obtain corresponding target audio frames.
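The sort, segment, and de-duplicate flow can be sketched as follows. This is a deliberately simplified illustration, not the procedure of the application: it assumes frames are tuples of integer samples, computes audio energy as the sum of squared samples, splits the sequence wherever the energy difference between adjacent frames exceeds a hypothetical `energy_gap`, and treats a frame as a duplicate only when its samples exactly match a historical frame or a frame already kept in the same segment.

```python
from typing import List, Sequence, Tuple

Samples = Tuple[int, ...]
TimedFrame = Tuple[float, Samples]  # (acquisition time, samples); hypothetical layout


def energy(samples: Samples) -> float:
    """Audio energy as the sum of squared samples (an assumption of this sketch)."""
    return float(sum(s * s for s in samples))


def sort_and_segment(frames: Sequence[TimedFrame], energy_gap: float) -> List[List[TimedFrame]]:
    """Sort frames by acquisition time, then split on large energy jumps."""
    ordered = sorted(frames, key=lambda frame: frame[0])  # earliest to latest
    segments: List[List[TimedFrame]] = []
    for frame in ordered:
        if segments and abs(energy(frame[1]) - energy(segments[-1][-1][1])) <= energy_gap:
            segments[-1].append(frame)  # close in energy: same segment
        else:
            segments.append([frame])    # first frame or large jump: new segment
    return segments


def screen(frames: Sequence[TimedFrame], history: Sequence[TimedFrame],
           energy_gap: float) -> List[TimedFrame]:
    """Return target frames after per-segment de-duplication against the history."""
    historical = {samples for _, samples in history}
    targets: List[TimedFrame] = []
    for segment in sort_and_segment(frames, energy_gap):
        seen = set(historical)  # de-duplication is restarted for every segment
        for t, samples in segment:
            if samples not in seen:
                seen.add(samples)
                targets.append((t, samples))
    return targets
```

Exact-match comparison is the crudest possible notion of repetition; a similarity measure over several dimensions, as discussed elsewhere in this description, would replace the set lookup in a fuller implementation.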

It will be appreciated that in an alternative example, the target audio frame may be derived based on the following steps:

firstly, segmenting the audio frame sequence to be processed based on the audio energy difference between the audio frames to be processed to obtain a plurality of corresponding audio frame sequence segments to be processed (for example, two adjacent audio frames to be processed with an audio energy difference value larger than a threshold are segmented into two adjacent audio frame sequence segments to be processed, wherein the audio energy calculation method may refer to the related prior art);

secondly, calculating, for each audio frame to be processed in each audio frame sequence segment to be processed, the average value of the audio similarity between that frame and the historical audio frame over multiple dimensions (it is understood that the multiple dimensions include at least the audio energy, and may also include amplitude, etc.);

thirdly, for each audio frame sequence segment to be processed, calculating the sum of the audio similarity average values of its frames to obtain a corresponding audio similarity representative value, and, among the plurality of audio frame sequence segments to be processed, taking the pre-configured first number of segments with the largest audio similarity representative values as first audio frame sequence segments, thereby obtaining the first number of first audio frame sequence segments;

fourthly, calculating the average value of the audio similarity representative values, determining the audio frame sequence segment to be processed whose audio similarity representative value has the smallest difference from that average value, and taking the audio similarity representative value corresponding to that segment as a target representative value;

fifthly, sorting the audio frames to be processed included in the first number of first audio frame sequence segments in order of acquisition time from earliest to latest, and re-dividing them to obtain at least one second audio frame sequence segment, wherein the re-division is performed such that the audio similarity representative value corresponding to each second audio frame sequence segment is greater than or equal to the target representative value;

sixthly, taking other audio frame sequence segments to be processed except the first number of first audio frame sequence segments as third audio frame sequence segments;

seventhly, respectively determining a first characterization coefficient corresponding to the audio similarity representative value of each second audio frame sequence segment and each third audio frame sequence segment based on a pre-configured target corresponding relationship, wherein the audio similarity representative value and the first characterization coefficient have a positive correlation, and the first characterization coefficient is smaller than 1 and larger than 0;

eighthly, respectively calculating the average value of the audio energy of the audio frames to be processed, which are included in each second audio frame sequence segment, to obtain a corresponding energy average value, and respectively calculating the average value of the audio energy of the audio frames to be processed, which are included in each third audio frame sequence segment, to obtain a corresponding energy average value;

a ninth step of performing product calculation on the energy mean value based on the first characterization coefficient (for example, multiplying the energy mean value corresponding to a second audio frame sequence segment by the corresponding first characterization coefficient), and determining screening ratio information corresponding to each second audio frame sequence segment and each third audio frame sequence segment based on the obtained product (where the product and the screening ratio information have a positive correlation, and if the product is larger, the screening ratio information is larger);

tenth, determining whether each second audio frame sequence segment and each third audio frame sequence segment have repeated audio frames to be processed;

eleventh, for each second audio frame sequence segment and each third audio frame sequence segment in which there are repeated audio frames to be processed, screening the repeated audio frames to be processed based on the corresponding screening ratio information (for example, a screening number is determined according to the number of the repeated audio frames to be processed and the screening ratio information, the integer closest to the screening number is then determined, and that integer number of the repeated audio frames to be processed is selected as target audio frames, wherein at least one of the repeated audio frames to be processed needs to be selected as a target audio frame).
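As a non-limiting illustration, the first four steps above (energy-based segmentation, audio similarity representative values, selection of the first number of segments, and the target representative value) may be sketched as follows. The energy definition (sum of squared samples), the similarity measure `1 / (1 + |a - b|)`, the choice of energy and peak amplitude as the two dimensions, and the use of all segments when averaging in the fourth step are assumptions made for the example.

```python
def frame_energy(samples):
    # Sum of squared samples: one common energy definition (assumed here).
    return sum(s * s for s in samples)


def segment_by_energy(frames, threshold):
    """Step 1: cut between adjacent frames whose energy difference exceeds threshold."""
    segments = []
    for samples in frames:
        previous = segments[-1][-1] if segments else None
        if previous is not None and \
                abs(frame_energy(samples) - frame_energy(previous)) <= threshold:
            segments[-1].append(samples)
        else:
            segments.append([samples])
    return segments


def mean_similarity(samples, history):
    """Step 2: average similarity to a historical frame over two dimensions."""
    dims = [
        (frame_energy(samples), frame_energy(history)),                    # energy
        (max(abs(s) for s in samples), max(abs(s) for s in history)),      # amplitude
    ]
    # Hypothetical similarity measure: 1.0 when equal, decreasing with distance.
    return sum(1.0 / (1.0 + abs(a - b)) for a, b in dims) / len(dims)


def representative_value(per_frame_means):
    """Step 3: sum of the per-frame similarity averages of one segment."""
    return sum(per_frame_means)


def pick_first_segments(rep_values, first_number):
    """Step 3: indices of the first_number segments with the largest values."""
    ranked = sorted(range(len(rep_values)),
                    key=lambda i: rep_values[i], reverse=True)
    return sorted(ranked[:first_number])


def target_representative_value(rep_values):
    """Step 4: the representative value closest to the mean of all values."""
    mean = sum(rep_values) / len(rep_values)
    return min(rep_values, key=lambda v: abs(v - mean))
```

Frames are plain tuples of samples here; a real implementation would typically carry acquisition times alongside the samples so that the fifth step can re-sort them.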

It is understood that the audio frames to be processed in each second audio frame sequence segment and each third audio frame sequence segment that contains no repeated audio frames to be processed, as well as the audio frames to be processed other than the repeated audio frames to be processed in the remaining segments, may all be taken as target audio frames.
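As a non-limiting illustration, the seventh to eleventh steps and the rule for non-repeated frames may be sketched as follows. The mapping `rep / (rep + scale)` standing in for the pre-configured target correspondence, the normalisation of the product by a hypothetical `max_product`, and the treatment of each group of identical frames separately are assumptions made for the example.

```python
from collections import Counter


def characterization_coefficient(rep_value, scale=1.0):
    # Hypothetical mapping: increases with rep_value and lies strictly
    # between 0 and 1 for any positive rep_value.
    return rep_value / (rep_value + scale)


def screening_ratio(energy_mean, coefficient, max_product):
    # The ratio grows with the product of energy mean and coefficient;
    # dividing by max_product and capping at 1.0 is an assumption.
    product = energy_mean * coefficient
    return min(1.0, product / max_product)


def screen_segment(frames, ratio):
    """Keep all unique frames and a ratio-based number of each repeated frame.

    frames must be hashable (for example tuples of samples).
    """
    counts = Counter(frames)
    emitted = Counter()
    kept = []
    for frame in frames:
        if counts[frame] == 1:
            kept.append(frame)  # non-repeated frames are always targets
            continue
        # Nearest integer to count * ratio, but never fewer than one frame.
        # Note: Python's round() resolves exact .5 ties to the even integer.
        quota = max(1, round(counts[frame] * ratio))
        if emitted[frame] < quota:
            emitted[frame] += 1
            kept.append(frame)
    return kept
```

For example, with a ratio of 0.5 a frame that occurs four times is kept twice, while with a ratio of 0.1 it is still kept once because of the lower bound of one frame.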

In summary, according to the audio data processing method provided by the present application, after the audio data to be processed sent by the communicatively connected audio acquisition device is acquired, whether the audio data to be processed needs to be screened is first determined based on a predetermined judgment rule, and only when it is determined that screening is needed is the audio data to be processed screened based on the pre-configured screening rule to obtain the corresponding target audio frames. Compared with conventional technical schemes that either never screen or always screen directly, the added mechanism for judging whether to screen alleviates the poor audio data processing effect of the prior art, which arises either because no screening is performed (for example, the data volume is excessive) or because direct screening can significantly distort the data.
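As a non-limiting illustration, the judgment mechanism summarised above may be sketched as a gate placed in front of the screening step. The sketch uses the data-volume rule recited in claim 5 (screening is needed when the audio data volume exceeds a first data volume threshold); measuring volume as the total byte length of the frames, and the `screen` callable, are assumptions made for the example.

```python
def process(frames, first_data_volume_threshold, screen):
    """Screen the frames only when the judgment rule says screening is needed."""
    # Judgment rule: compare the total data volume with the threshold.
    volume = sum(len(frame) for frame in frames)
    if volume > first_data_volume_threshold:
        # Screening is needed: apply the pre-configured screening rule.
        return screen(frames)
    # Screening is not needed: every frame is passed through unchanged.
    return list(frames)
```

Any screening rule can be supplied as `screen`, so the gate stays independent of how the target audio frames are actually selected.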

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part. The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. 
Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims

1. An audio data processing method applied to an audio data processing apparatus, the audio data processing method comprising:

2. The audio data processing method according to claim 1, wherein the step of acquiring the audio data to be processed sent by the communicatively connected audio acquisition device comprises:

3. The audio data processing method according to claim 2, wherein the step of performing a verification process on the original audio data to determine the validity of the original audio data if the original audio data sent by the audio capturing device is received comprises:

4. The audio data processing method according to claim 2, wherein the step of performing a verification process on the original audio data to determine the validity of the original audio data if the original audio data sent by the audio capturing device is received comprises:

5. The audio data processing method according to claim 1, wherein the step of determining whether the plurality of frames of audio frames to be processed included in the audio data to be processed need to be subjected to the filtering processing based on a predetermined judgment rule comprises:

and if the audio data volume is larger than the first data volume threshold, determining that the multi-frame audio frame to be processed included in the audio data to be processed needs to be screened.

6. The audio data processing method according to claim 5, wherein the step of determining whether the plurality of frames of audio frames to be processed included in the audio data to be processed need to be subjected to the filtering processing based on a predetermined judgment rule further comprises:

7. The audio data processing method according to claim 5, wherein the step of determining whether the plurality of frames of audio frames to be processed included in the audio data to be processed need to be subjected to the filtering processing based on a predetermined judgment rule further comprises:

8. The audio data processing method according to any one of claims 1 to 6, wherein if it is determined that the multiple frames of audio frames to be processed included in the audio data to be processed need to be filtered, the step of filtering the multiple frames of audio frames to be processed based on a pre-configured filtering rule to obtain corresponding target audio frames includes:

9. The audio data processing method according to claim 8, wherein if it is determined that the multiple frames of to-be-processed audio frames included in the to-be-processed audio data need to be filtered, the step of obtaining the filtered audio frames during the historical filtering process to obtain corresponding historical audio frames includes:

10. The audio data processing method according to claim 8, wherein the step of performing a filtering process on the plurality of frames of audio frames to be processed based on the historical audio frames to obtain corresponding target audio frames comprises: