CN107506409B - Method for processing multi-audio data

Info

Publication number: CN107506409B
Application number: CN201710673700.0A
Authority: CN
Inventors: 王红娟; 董毅; 付宪瑞; 王玉奎
Original assignee: Inspur Financial Information Technology Co Ltd
Current assignee: Inspur Financial Information Technology Co Ltd
Priority date: 2017-08-09
Filing date: 2017-08-09
Publication date: 2021-01-08
Anticipated expiration: 2037-08-09
Also published as: CN107506409A

Abstract

The invention discloses a method for processing multi-audio data, which stores multiple audio data in a single audio file and comprises the following steps: S1: selecting sound source devices and setting an independent acquisition channel for each sound source device; S2: setting sound source information and frequency-related data for the acquired audio data; S3: setting the number of sound sources and the basic information of each sound source in an audio file header; S4: setting, for each sound source, the number of occurrences, the frequency, and the position and duration of each occurrence; S5: storing the sound source file header information, starting encoding, and writing the header information into the audio data stream; S6: writing the related sound source data at the frequency set in the encoder of the audio file. A multi-source file stored by this method is smaller than the same audio stored as separate files in the traditional way, and audio data indexes are easier to establish, so the method plays an important role in audio recording, management, and related fields.

Description

Method for processing multi-audio data

Technical Field

The invention relates to a method for processing multi-audio data, and belongs to the technical field of electronic equipment.

Background

As waveform data, audio has two main parameters in conventional acquisition: frequency and volume. Frequency is often used as the main parameter for identifying the characteristics of a sound source, and volume is an important characteristic representing sound intensity.

During sound collection and encoding, the sound waves of different sound sources are superimposed on one another at the collection device, and the result is a single audio file in which the various sound sources are mixed. In subsequent processing, the relevant algorithms often have to filter out clutter according to specific frequency characteristics before they can search for the target data, and in this process identification is difficult and its accuracy is often low. This is because multiple sound sources often overlap during audio acquisition and storage: although the data of different sound sources differ in frequency, in many cases they overlap, and it is very difficult to filter specific information out of these overlapping frequencies. In particular, when the volume of other sound sources is higher than that of the target, the target is usually masked by the background volume and cannot be detected.

Therefore, searching a batch of audio files for a target requires a large investment of labor and time, and it is difficult to find an effective automatic fast search method to replace the manual work.

Disclosure of Invention

The present invention aims to solve the above problems in the prior art by providing a method for processing multi-audio data.

The above object is achieved by the following technical solution: a method for processing multi-audio data, in which multiple audio data are stored in a single audio file.

Preferably, the method comprises the steps of:

S1: selecting sound source devices and setting an independent acquisition channel for each sound source device;

S2: setting sound source information and frequency-related data for the acquired audio data;

S3: setting the number of sound sources and the basic information of each sound source in an audio file header;

S4: setting, for each sound source, the number of occurrences, the frequency, and the position and duration of each occurrence;

S5: storing the sound source file header information, starting encoding, and writing the header information into the audio data stream;

S6: writing the related sound source data at the frequency set in the encoder of the audio file and, when a plurality of groups of sound source data exist in a certain frequency band, writing them sequentially in sound source order, in the form sound source 1 | sound source 2 | sound source 3.
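The header described in steps S3 and S4 can be pictured with a small sketch. The patent does not specify a byte layout, so the field widths, the little-endian encoding, and the names `Occurrence`, `SourceInfo`, `pack_header`, and `unpack_header` below are all illustrative assumptions rather than the inventors' format.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class Occurrence:
    position: int  # start of the occurrence (unit assumed: samples)
    duration: int  # length of the occurrence (unit assumed: samples)


@dataclass
class SourceInfo:
    source_id: int
    frequency_hz: int
    occurrences: list = field(default_factory=list)


def pack_header(sources):
    """Serialize the source count (S3) and each source's occurrence table (S4)."""
    out = bytearray(struct.pack("<H", len(sources)))
    for src in sources:
        # Hypothetical record: id (2 bytes), frequency (4), occurrence count (2).
        out += struct.pack("<HIH", src.source_id, src.frequency_hz,
                           len(src.occurrences))
        for occ in src.occurrences:
            out += struct.pack("<II", occ.position, occ.duration)
    return bytes(out)


def unpack_header(data):
    """Inverse of pack_header: rebuild the list of SourceInfo records."""
    (count,) = struct.unpack_from("<H", data, 0)
    offset = struct.calcsize("<H")
    sources = []
    for _ in range(count):
        source_id, frequency_hz, n_occ = struct.unpack_from("<HIH", data, offset)
        offset += struct.calcsize("<HIH")
        occurrences = []
        for _ in range(n_occ):
            position, duration = struct.unpack_from("<II", data, offset)
            offset += struct.calcsize("<II")
            occurrences.append(Occurrence(position, duration))
        sources.append(SourceInfo(source_id, frequency_hz, occurrences))
    return sources
```

Under these assumptions, step S5 would amount to writing the bytes returned by `pack_header` at the start of the audio data stream before any source data is encoded.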

Preferably, in step S3, audio of different frequencies is regarded as different sound sources regardless of whether the capturing devices are the same.

Preferably, in step S4, the sound source frequency is the time period in which the audio of the frequency effectively appears, and background data outside that frequency is ignored.

Preferably, in step S4, the sound source frequency is a unique identifier of the sound source, so as to create an index.

Preferably, in step S6, when a plurality of groups of sound source data exist in a certain frequency band, they are written in sound source order, in the form sound source 1 | sound source 2 | sound source 3.
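The ordering rule of step S6 can be illustrated with a short sketch. It assumes each chunk of source data is already tagged with a band label and a numeric source identifier; the function name `write_band_records` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def write_band_records(chunks):
    """Group chunks by frequency band and join them in sound source order.

    chunks: iterable of (band, source_id, payload_bytes) tuples.
    Returns {band: payload of source 1 | source 2 | source 3 ...}.
    """
    by_band = defaultdict(list)
    for band, source_id, payload in chunks:
        by_band[band].append((source_id, payload))
    return {
        band: b"".join(payload for _, payload in
                       sorted(items, key=lambda item: item[0]))
        for band, items in by_band.items()
    }
```

Sorting by `source_id` is one way to obtain a deterministic sequence; the patent only states that the data are written in sound source order.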

Preferably, a method for processing multi-audio data stores multiple audio data in a single audio file; the method comprises the following steps:

S6: writing the related sound source data at the frequency set in the encoder of the audio file and, when a plurality of groups of sound source data exist in a certain frequency band, writing them sequentially in sound source order, in the form sound source 1 | sound source 2 | sound source 3; in step S3, audio of different frequencies is regarded as different sound sources regardless of whether the capturing devices are the same; in step S4, the sound source frequency is the time period in which the audio of the frequency effectively appears, and background data outside that frequency is ignored; in step S4, the sound source frequency is a unique identifier of the sound source, used to create an index.

The technical solution of the invention has the following advantages: audio data from multiple sound sources can be combined freely, and audio data of multiple sound sources and multiple frequencies can be stored in the same audio file at the same time; during playback, the sound sources can be switched freely and any sound source can be mixed in or dropped, which enables real-time mixing; and a voiceprint feature retrieval index can be set for the data of each sound source, so that retrieval of the audio data is more efficient. A multi-source file stored by this method is smaller than the same audio stored as separate files in the traditional way, and audio data indexes are easier to establish. The method plays an important role in audio recording, management, and related fields, and is suitable for industrial adoption.

Drawings

Fig. 1 is a schematic view of different sound source data in the spatial dimensions according to the present invention.

Fig. 2 is a representation of the data stream of two sets of audio data of different frequencies in a multi-dimensional data space according to the present invention.

Detailed Description

The objects, advantages, and features of the present invention are illustrated and explained by the following non-limiting description of preferred embodiments. The embodiments are merely typical examples of applying the technical solution of the present invention, and any technical solution formed by equivalent replacement or equivalent transformation falls within the scope of protection claimed by the present invention.

The invention discloses a method for processing multi-audio data, which stores multiple audio data in a single audio file.

Specifically, the method for processing the multi-audio data comprises the following steps:

S2: setting sound source information, frequency, and other related data for the acquired audio data;

S4: setting, for each sound source, the number of occurrences, the frequency, the position and duration of each occurrence, and so on. Audio of different frequencies is regarded as different sound sources regardless of whether the acquisition device is the same. The sound source frequency is the time period in which the audio of the frequency effectively appears, and background data outside that frequency is ignored; the sound source frequency is the unique identifier of the sound source, used to create an index. Specifically, in this embodiment, the number of occurrences, the positions, and the durations are obtained during the actual acquisition process, and the frequency range is the normal frequency range. The frequency is used to distinguish human voices, and the frequency range of the human voice is 300 Hz to 3400 Hz.
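As a rough illustration of using frequency to decide whether a signal falls in the 300 Hz to 3400 Hz voice range quoted above, the sketch below estimates the frequency of a clean test tone from its zero crossings. Zero-crossing counting is a swapped-in technique chosen for brevity, not something the patent describes, and it is only meaningful for simple, nearly periodic signals; the names `tone`, `estimate_frequency`, and `in_voice_band` are hypothetical.

```python
import math

VOICE_BAND_HZ = (300.0, 3400.0)  # range quoted in the embodiment


def tone(freq_hz, sample_rate=8000, seconds=1.0):
    """Generate a pure sine tone (test signal only)."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_frequency(samples, sample_rate):
    """Estimate the frequency of a clean tone from its sign changes."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:])
        if (a < 0 <= b) or (b < 0 <= a)
    )
    duration = len(samples) / sample_rate
    # A periodic tone crosses zero twice per cycle.
    return crossings / (2 * duration)


def in_voice_band(freq_hz):
    """True if the frequency lies inside the assumed voice band."""
    low, high = VOICE_BAND_HZ
    return low <= freq_hz <= high
```

A real recording would need a more robust estimator (for example, a spectral method), but the band check itself stays the same.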

In this embodiment, the selection of the multiple groups of sound source data is not limited and is made according to actual needs during operation. "A certain frequency band" refers to the voice of the same person, whose frequency stays the same: unless a high or low sound is produced intentionally, the frequency of one person's voice generally remains within one frequency band. Here, frequency means not only the frequency of the audio itself but also includes the concept of speech rate, that is, how fast the person speaks.

In a common audio data storage mode, the sound source with the highest volume at a given time node is always collected as the valid data, while sound sources with lower volume are drowned out or survive only in gaps at different frequencies as background sound. In contrast, the sound source data storage mode provided by this method uses a multi-dimensional approach to combine audio data of multiple sound sources and multiple frequencies freely: audio from different sound sources, or from the same sound source, can be separated in different dimensions and mixed in real time during playback, which gives the method great flexibility. Fig. 1 is a schematic diagram of different sound source data in the spatial dimensions, where x is the audio data stream, which can also be understood as time, y is the audio, and z is the different audio; the audio data streams and volumes are acquired separately according to the audio and separated in the y dimension. Audio of different frequencies is regarded as different sound sources, and audio from different sound sources or from the same sound source can be separated in different dimensions.
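One way to picture the multi-dimensional arrangement is to tag every stored segment with a coordinate on each axis and then filter along whichever axis is of interest. The sketch below assumes that x is the stream position, y is the audio frequency, and z is the sound source identifier; this reading of the axes, and the names `Segment` and `select`, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: int         # x: position in the data stream (or time)
    frequency_hz: int  # y: frequency of the audio (assumed reading)
    source_id: int     # z: sound source identifier (assumed reading)
    samples: tuple     # the audio payload of this segment


def select(segments, source_id=None, band=None):
    """Return segments matching a source and/or an inclusive frequency band.

    The result is ordered by stream position so it can be played back.
    """
    result = []
    for seg in segments:
        if source_id is not None and seg.source_id != source_id:
            continue
        if band is not None and not (band[0] <= seg.frequency_hz <= band[1]):
            continue
        result.append(seg)
    return sorted(result, key=lambda seg: seg.start)
```

Because each segment keeps its own coordinates, separating one source or one band is a filter rather than a signal-processing problem, which is the flexibility the paragraph above describes.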

Two groups of audio data with different sound sources and different frequencies exist as independent data streams in the multi-dimensional data space. When they are encoded into one independent audio file, they are stored as shown in Fig. 2: audio 1 and audio 2 differ in frequency and sound source; specifically, the number of occurrences, the start position and length of each occurrence, and the frequency of audio 1 all differ from those of audio 2.

The existing audio data storage method combines an audio format header with the audio data, and multi-source audio stored in that way is difficult to separate once mixed. In contrast, the multi-dimensional audio encoding and storage mode proposed by this method is based on sound source and frequency: the audio information of each sound source and frequency is stored in a distributed manner within the same audio data stream, so audio data indexes can be created very efficiently.
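Since the sound source frequency serves as the unique identifier, an index can be built directly from the header entries without decoding the audio. The sketch below assumes the header has already been read into `(frequency_hz, occurrences)` pairs, where each occurrence is a `(position, duration)` tuple; `build_index` and `lookup` are hypothetical names.

```python
def build_index(header_entries):
    """Map each source frequency to its sorted (position, duration) spans."""
    index = {}
    for frequency_hz, occurrences in header_entries:
        index.setdefault(frequency_hz, []).extend(occurrences)
    for spans in index.values():
        spans.sort()  # order spans by position in the stream
    return index


def lookup(index, frequency_hz):
    """Return the spans where the given source frequency appears."""
    return index.get(frequency_hz, [])
```

A search for one source then reduces to a dictionary lookup followed by reads at the listed positions.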

With this method, audio data of multiple sound sources and multiple frequencies can be stored in the same audio file at the same time, the sound sources can be switched freely during playback to achieve real-time mixing, and a voiceprint feature retrieval index can be set for the data of each sound source, so that retrieval of the audio data is more efficient.

The invention has many embodiments, and all technical solutions formed by equivalent replacement or equivalent transformation fall within the scope of protection of the invention.
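Switching or mixing sources at playback can be sketched as summing only the selected sources into an output buffer. The representation below, a mapping from source identifier to `(start, samples)` pieces, and the function name `mix` are assumptions; the patent does not describe the mixing arithmetic.

```python
def mix(tracks, selected, length):
    """Sum the selected sources sample by sample into one output buffer.

    tracks: {source_id: [(start, samples), ...]}
    selected: iterable of source ids to include in the mix
    length: number of samples in the output buffer
    """
    out = [0.0] * length
    for source_id in selected:
        for start, samples in tracks.get(source_id, []):
            for i, value in enumerate(samples):
                if 0 <= start + i < length:  # ignore samples outside the buffer
                    out[start + i] += value
    return out
```

Changing the `selected` list between calls corresponds to switching sources or dropping one from the mix; a practical player would also need to scale or clip the summed values.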

Claims

1. A method for processing multi-audio data, comprising: storing a plurality of audio data in a single audio file; the method comprises the following steps:

S6: writing the related sound source data at the frequency set in the encoder of the audio file and, when a plurality of groups of sound source data exist in a certain frequency band, writing them sequentially in sound source order, in the form sound source 1 | sound source 2 | sound source 3;

in step S3, audio of different frequencies is regarded as different sound sources regardless of whether the capturing devices are the same;

in step S4, the sound source frequency is the time period in which the audio of the frequency effectively appears, and background data outside that frequency is ignored; in step S4, the sound source frequency is a unique identifier of the sound source, used to create an index; in step S6, when a plurality of groups of sound source data exist in one frequency band, they are written in sound source order, in the form sound source 1 | sound source 2 | sound source 3.