CN104392744A

CN104392744A - Method and apparatus for recording voice frequency

Info

Publication number: CN104392744A
Application number: CN201410427431.6A
Authority: CN
Inventors: 陈正超; 石毅; 蒋鸿伟
Original assignee: Guiyang Longmaster Information and Technology Co ltd
Current assignee: GUIYANG YUWAN SCIENCE & TECHNOLOGY CO., LTD.
Priority date: 2014-08-27
Filing date: 2014-08-27
Publication date: 2015-03-04

Abstract

The invention discloses a method and a device for recording audio. The method comprises: performing audio recording to obtain original recording data; acquiring the original recording data and adjusting its number of channels to obtain intermediate recording data, and acquiring accompaniment data; and mixing the intermediate recording data with the acquired accompaniment data to obtain target audio data. By adjusting the number of channels of the original recording data and then mixing the channel-adjusted recording data with the accompaniment data, the invention addresses the problem that many smartphones cannot record multi-channel audio, substantially improves the quality of the resulting audio data, and avoids the recorded sound and the accompaniment being separated into different channels.

Description

Method and device for recording audio

Technical field

The present invention relates to the technical field of audio processing, and in particular to a method and a device for recording audio.

Background

There are currently many smartphone manufacturers, and they produce a wide variety of phones whose models and performance differ considerably. When multi-channel audio (such as two-channel audio) needs to be recorded and played back on a smartphone, many smartphones either cannot record multi-channel audio at all, or produce recordings in which the music and the voice are separated into different channels.

For example, when implementing a karaoke function on a mobile platform, the basic prerequisite is to record multi-channel audio (such as two-channel audio) and obtain the karaoke file the user needs (such as an mp3). In existing multi-channel recording implementations on mobile platforms, multi-channel audio recorded directly on some smartphones (that is, by calling the operating system's API to record multi-channel audio) has problems such as no sound or a poor two-channel effect.

Summary of the invention

The technical problem to be solved by the present invention is to provide a method and a device for recording audio that can record high-quality audio and avoid the recorded sound and the accompaniment being separated into different channels.

To solve the above technical problem, a method for recording audio according to the present invention comprises:

performing audio recording to obtain original recording data;

acquiring the original recording data, adjusting the number of channels of the acquired original recording data to obtain intermediate recording data, and acquiring accompaniment data;

mixing the intermediate recording data with the acquired accompaniment data to obtain target audio data.

Further, adjusting the number of channels of the acquired original recording data comprises:

The original recording data is mono. If mono is to be adjusted to m channels (m>1), then, frame by frame, each sample in each frame of the original recording data is traversed, and each sample of the original recording data is assigned to m consecutive samples in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with m channels is obtained.

The original recording data has n channels (n>1). If the n channels are to be adjusted to q*n channels (q>1), then, frame by frame, each sample group in each frame of the original recording data is traversed, where a sample group contains as many samples as there are channels and each sample corresponds to one channel; each sample group of the original recording data is assigned to q consecutive sample groups in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with q*n channels is obtained.

The original recording data has l channels (l>1). If the l channels are to be adjusted to p channels (p>1), where p and l are not related by an integer multiple, then, frame by frame, each sample group in each frame of the original recording data is traversed, where a sample group contains as many samples as there are channels and each sample corresponds to one channel; the samples in each sample group of the original recording data are averaged, and the resulting mean is assigned to p consecutive samples in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with p channels is obtained.

Further, the accompaniment data has the same number of channels as the intermediate recording data.

Further, during audio recording, if the recording device is playing the accompaniment at the same time, the accompaniment played by the recording device is removed from the recorded audio data, and the audio data with the accompaniment removed is used as the original recording data.

Further, the method also comprises:

after adjusting the number of channels of the acquired original recording data to obtain the intermediate recording data, performing volume adjustment on the intermediate recording data; and after acquiring the accompaniment data, performing volume adjustment on the accompaniment data, where the volume adjustment comprises multiplying each sample in the data by a volume percentage.

Further, a device for recording audio comprises a recording unit, a channel adjustment unit, an accompaniment data acquisition unit and a mixing unit, wherein:

the recording unit is configured to perform audio recording to obtain original recording data;

the channel adjustment unit is configured to acquire the original recording data and adjust the number of channels of the acquired original recording data to obtain intermediate recording data;

the accompaniment data acquisition unit is configured to acquire accompaniment data;

the mixing unit is configured to mix the intermediate recording data with the acquired accompaniment data to obtain target audio data.

Further, the channel adjustment unit adjusting the number of channels of the acquired original recording data comprises:

The original recording data is mono. If mono is to be adjusted to m channels (m>1), then, frame by frame, each sample in each frame of the original recording data is traversed, and each sample of the original recording data is assigned to m consecutive samples in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with m channels is obtained; or,

The original recording data has n channels (n>1). If the n channels are to be adjusted to q*n channels (q>1), then, frame by frame, each sample group in each frame of the original recording data is traversed, where a sample group contains as many samples as there are channels and each sample corresponds to one channel; each sample group of the original recording data is assigned to q consecutive sample groups in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with q*n channels is obtained; or,

In summary, the present application adjusts the number of channels of the original recording data and then mixes the channel-adjusted recording data with the accompaniment data to obtain the target audio data. This addresses the problem that many smartphones cannot record multi-channel audio, significantly improves the quality of the resulting audio data, and avoids the recorded sound and the accompaniment being separated into different channels.

Brief description of the drawings

Fig. 1 is a flowchart of the audio recording method of the present application;

Fig. 2 is an architecture diagram of the audio recording device of the present application.

Detailed description

The present invention is described in detail below with reference to the drawings and in conjunction with embodiments. It should be noted that, where no conflict arises, the embodiments in the present application and the features in those embodiments may be combined with one another.

As shown in Fig. 1, the audio recording method of the present application comprises:

Step 101: perform audio recording to obtain original recording data;

In the present application, audio recording starts in response to a user operation. The original recording data is held in the recording buffer of the operating system.

During audio recording, if the recording device is not playing accompaniment at the same time, the sound actually recorded is exactly the sound the user wants to record, and the recorded audio data is used as the original recording data. If the recording device is playing accompaniment at the same time, the accompaniment played by the recording device must be removed from the recorded audio data, and the audio data with the accompaniment removed is used as the original recording data. In the present application, the accompaniment played by the recording device can be removed from the recorded audio data by means of echo cancellation.

In the present application, noise suppression may also be performed at the same time during audio recording.

Step 102: acquire the original recording data, adjust the number of channels of the acquired original recording data to obtain intermediate recording data, and acquire accompaniment data;

To speed up recording in the present application, a separate mixing thread is started, so that mixing and encoding begin as soon as recording starts; by the time recording finishes, mixing and encoding are essentially finished as well. The mixing thread acquires the original recording data and continuously acquires accompaniment data from the accompaniment buffer (musicBuffer) that holds the accompaniment data.

To improve mixing efficiency, the present application creates a recording thread that reads the original recording data from the operating system's recording buffer and passes it to the audio manager through a callback; the audio manager then writes the original recording data into the recording buffer (recordBuffer). recordBuffer is a self-growing buffer: it allocates 1024 bytes by default and, whenever the storage space is insufficient, allocates a further 1024 bytes, growing step by step.
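The self-growing behaviour of recordBuffer described above can be sketched as follows. This is only a minimal illustration: the class name `GrowingBuffer` and its methods are hypothetical and do not come from the application; only the 1024-byte growth step is taken from the text.

```python
class GrowingBuffer:
    """Byte buffer that grows in fixed 1024-byte steps (hypothetical name)."""

    CHUNK = 1024  # growth step in bytes, as stated in the text

    def __init__(self):
        self._data = bytearray(self.CHUNK)  # storage allocated so far
        self._size = 0                      # number of bytes actually in use

    @property
    def capacity(self) -> int:
        return len(self._data)

    def write(self, payload: bytes) -> None:
        # Allocate further 1024-byte chunks until the payload fits.
        while self._size + len(payload) > len(self._data):
            self._data.extend(bytes(self.CHUNK))
        self._data[self._size:self._size + len(payload)] = payload
        self._size += len(payload)

    def read_all(self) -> bytes:
        # Return everything written so far and mark the buffer as empty.
        out = bytes(self._data[:self._size])
        self._size = 0
        return out
```

Writing 1500 bytes into a fresh buffer triggers one growth step, so the capacity becomes 2048 bytes.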

The mixing thread in the present application obtains the original recording data from recordBuffer. Because both the recording thread and the mixing thread access recordBuffer, every access must be protected by a lock.
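The locking requirement can be illustrated with a small producer/consumer sketch. The names `record_buffer`, `record_lock`, `recording_thread`, `take_recorded_data` and `run_demo` are assumptions made for this example; the application does not give the actual implementation.

```python
import threading

record_buffer = bytearray()      # stands in for recordBuffer
record_lock = threading.Lock()   # guards every access to record_buffer


def recording_thread(chunks):
    # Producer: append each recorded chunk while holding the lock.
    for chunk in chunks:
        with record_lock:
            record_buffer.extend(chunk)


def take_recorded_data() -> bytes:
    # Consumer (mixing thread): copy out and clear under the same lock.
    with record_lock:
        data = bytes(record_buffer)
        record_buffer.clear()
    return data


def run_demo() -> bytes:
    chunks = [bytes([i]) * 4 for i in range(3)]
    producer = threading.Thread(target=recording_thread, args=(chunks,))
    producer.start()
    collected = bytearray()
    while producer.is_alive():
        collected.extend(take_recorded_data())
    producer.join()
    collected.extend(take_recorded_data())  # drain whatever is left
    return bytes(collected)
```

Because both sides take the same lock, the mixing side never observes a half-written chunk, and the data arrives in the order in which it was recorded.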

In the present application, adjusting the number of channels of the acquired original recording data comprises:

The original recording data is mono. If mono is to be adjusted to m channels (m>1), then, frame by frame, each sample in each frame of the original recording data is traversed, and each sample of the original recording data is assigned to m consecutive samples in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with m channels is obtained.

A frame is the transmission unit of audio. The sampling rate is the number of samples the recording device takes of the sound signal per second; the higher the sampling rate, the more faithful and natural the reproduced sound. Commonly used sampling rates are currently 16K, 32K and 44K. The sample size is the number of bytes occupied by one sample; typically two bytes are used to store one sample.

For example, with a sampling rate of 32K and 1280 samples per frame, for two-channel audio 1280/2/32 = 20 ms, that is, the duration of one frame is 20 ms. During playback the audio data must be fed to the device frame by frame.
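The frame-duration arithmetic above can be checked with a short calculation. The function names are hypothetical; the figures (32K sampling rate, 1280 samples per frame, two bytes per sample) are those given in the text.

```python
def frame_duration_ms(samples_per_frame, channels, sample_rate_khz):
    """Duration of one frame in milliseconds."""
    samples_per_channel = samples_per_frame / channels
    # A rate of N kHz means N samples per channel per millisecond.
    return samples_per_channel / sample_rate_khz


def frame_size_bytes(samples_per_frame, bytes_per_sample=2):
    """Size of one frame in bytes, assuming two bytes per sample."""
    return samples_per_frame * bytes_per_sample


if __name__ == "__main__":
    print(frame_duration_ms(1280, 2, 32))  # two-channel frame: 20.0
    print(frame_duration_ms(640, 1, 32))   # mono frame: 20.0
    print(frame_size_bytes(1280))          # 2560
```

A 640-sample mono frame and a 1280-sample two-channel frame therefore cover the same 20 ms of audio.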

The present application is described below using the conversion of mono to two-channel as an example.

When the recorded mono original recording data is adjusted to two-channel, each frame of data must contain equal numbers of left-channel and right-channel samples (640 samples each), arranged alternately.

Suppose the recorded mono sound has 640 samples per frame, so that 640/32/1 = 20 ms. Table 1 below is the mono data model.

Table 1

| 1 | 2 | 3 | 4 | 5 | 6 | 7 | … | 639 | 640 |

The principle of adjusting mono to two-channel is shown in Table 2: each mono sample is assigned to two consecutive samples in the two-channel data. 20 ms of mono data has 640 samples, whereas the two-channel data has 1280 samples. Table 2 is the data model of the 1280 two-channel samples.

Table 2

| 1 | 1 | 2 | 2 | 3 | 3 | 4 | 4 | … | 639 | 639 | 640 | 640 |

In Table 1 and Table 2, each cell represents one sample, and the value in the cell stands for the sampled data. Two-channel data is stored with its samples interleaved: each left-channel sample is followed by a right-channel sample, and so on. For example, in Table 2 the first cell containing 1 is the left-channel sample and the second cell containing 1 is the right-channel sample.

The following is part of the code that adjusts mono original recording data to two-channel.

BuffRecordDC: buffer storing the two-channel data (DC: Double Channel)

BuffRecordSC: buffer storing the mono data (SC: Single Channel)

The code above traverses the mono samples and assigns each sample to two samples of the two-channel buffer, namely the left-channel and right-channel samples.
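The original listing is not reproduced in this text. A minimal sketch of the loop just described might look as follows; it is an illustration only, not the applicant's code. The variable names echo BuffRecordSC and BuffRecordDC, while the function name `mono_to_multichannel` and the use of Python lists of integer samples are assumptions.

```python
def mono_to_multichannel(buff_record_sc, m=2):
    """Copy each mono sample to m consecutive samples (m=2 gives L/R pairs)."""
    buff_record_dc = []
    for sample in buff_record_sc:
        # The same value is written once per output channel.
        buff_record_dc.extend([sample] * m)
    return buff_record_dc


if __name__ == "__main__":
    print(mono_to_multichannel([1, 2, 3]))  # [1, 1, 2, 2, 3, 3]
```

A 640-sample mono frame yields a 1280-sample two-channel frame, matching Tables 1 and 2.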

In the present application, adjusting the number of channels of the acquired original recording data may alternatively be:

The original recording data has n channels (n>1). If the n channels are to be adjusted to q*n channels (q>1), then, frame by frame, each sample group in each frame of the original recording data is traversed, where a sample group contains as many samples as there are channels and each sample corresponds to one channel; each sample group of the original recording data is assigned to q consecutive sample groups in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with q*n channels is obtained.

A sample group in the present application applies to multi-channel audio data. For example, if one frame of two-channel audio data contains 1280 samples, each frame contains 640 sample groups with two samples each; the number of samples in a sample group equals the number of channels, and each sample corresponds to one channel. For example, the first and second cells in Table 2 form one sample group: the first cell is the left-channel sample and the second cell is the right-channel sample.
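The n-channel to q*n-channel case can be sketched in the same style. The function name `expand_channels` and the list-based frame representation are assumptions made for illustration; the application does not provide code for this case.

```python
def expand_channels(frame, n, q):
    """Turn an interleaved n-channel frame into a q*n-channel frame."""
    if len(frame) % n != 0:
        raise ValueError("frame length must be a multiple of n")
    out = []
    for i in range(0, len(frame), n):
        group = frame[i:i + n]   # one sample per source channel
        out.extend(group * q)    # the group is repeated q times in a row
    return out


if __name__ == "__main__":
    # Two stereo sample groups (L, R) expanded to four channels.
    print(expand_channels([1, 2, 3, 4], n=2, q=2))  # [1, 2, 1, 2, 3, 4, 3, 4]
```

Each source sample group thus becomes one output group of q*n samples in which the original channel order repeats q times.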

The original recording data has l channels (l>1). If the l channels are to be adjusted to p channels (p>1), where p and l are not related by an integer multiple, then, frame by frame, each sample group in each frame of the original recording data is traversed, where a sample group contains as many samples as there are channels and each sample corresponds to one channel; the samples in each sample group of the original recording data are averaged, and the resulting mean is assigned to p consecutive samples in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with p channels is obtained.

In the present application, l may be greater than p, or p may be greater than l. When the number of channels of the original recording data and the desired number of channels are not related by an integer multiple, each sample group of the original recording data is averaged and the resulting mean is copied to p consecutive samples in the corresponding frame of the intermediate recording data. To average a sample group, the samples it contains are added together and the sum is divided by the number of channels of the original recording data, giving the mean of the sample group.
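The averaging case can be sketched as follows. The function name `average_to_channels` is hypothetical, and the use of integer floor division to keep the result an integer PCM value is an assumption; the text only says that the sum is divided by the number of source channels.

```python
def average_to_channels(frame, l, p):
    """Average each l-sample group and copy the mean to p output samples."""
    if len(frame) % l != 0:
        raise ValueError("frame length must be a multiple of l")
    out = []
    for i in range(0, len(frame), l):
        group = frame[i:i + l]
        mean = sum(group) // l   # assumption: floor division keeps integers
        out.extend([mean] * p)
    return out


if __name__ == "__main__":
    # Three source channels reduced to two output channels.
    print(average_to_channels([10, 20, 30, 1, 2, 3], l=3, p=2))  # [20, 20, 2, 2]
```

Since every output channel receives the same mean, the resulting p-channel data carries identical content in all channels.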

Step 103: mix the intermediate recording data with the acquired accompaniment data to obtain target audio data.

In the present application, after the mixing thread has adjusted the number of channels of the original recording data and acquired the accompaniment data, it mixes the channel-adjusted intermediate recording data with the accompaniment data to obtain the target audio data.

The mixing thread mixes the intermediate recording data and the accompaniment data into raw PCM audio data, which an encoder then encodes into the required audio format (such as MP3) and stores in a file.
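The application does not state which mixing formula is used. One common approach, shown here purely as an assumption, is to add the samples position by position and clamp the sum to the 16-bit range implied by two-byte samples. The function name `mix_frames` is hypothetical, and encoding to MP3 is outside the scope of this sketch.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_frames(recording, accompaniment):
    """Additive mix of two equally long PCM frames with 16-bit clamping."""
    if len(recording) != len(accompaniment):
        raise ValueError("both frames need the same length and channel count")
    mixed = []
    for voice, music in zip(recording, accompaniment):
        total = voice + music
        # Clamp so that loud passages do not wrap around.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


if __name__ == "__main__":
    print(mix_frames([100, 30000, -30000], [50, 10000, -10000]))
    # [150, 32767, -32768]
```

The length check reflects the requirement that the accompaniment data and the intermediate recording data have the same number of channels.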

In the present application, the accompaniment data has the same number of channels as the intermediate recording data. For example, to obtain two-channel target audio data, two-channel accompaniment data must be cached in musicBuffer in advance; after mono original recording data has been recorded, it is adjusted to obtain two-channel intermediate recording data, and the two-channel intermediate recording data is then mixed with the two-channel accompaniment data.

In addition, after the number of channels of the acquired original recording data has been adjusted to obtain the intermediate recording data, volume adjustment may also be performed on the intermediate recording data; and after the accompaniment data has been acquired, volume adjustment may be performed on the accompaniment data. Volume adjustment comprises multiplying each sample in the data by a volume percentage.

The following is part of the code for volume adjustment.
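The original listing is not reproduced in this text. A minimal sketch of the operation described, multiplying each sample by a volume percentage, is given below. The function name `adjust_volume`, the truncation to integers and the clamping to the 16-bit range are assumptions, not details from the application.

```python
def adjust_volume(frame, volume_percent):
    """Scale every sample by volume_percent (100 leaves the data unchanged)."""
    factor = volume_percent / 100.0
    scaled = []
    for sample in frame:
        value = int(sample * factor)   # truncate back to an integer sample
        # Assumption: clamp to the 16-bit range used for two-byte samples.
        scaled.append(max(-32768, min(32767, value)))
    return scaled


if __name__ == "__main__":
    print(adjust_volume([1000, -1000, 20000], 50))  # [500, -500, 10000]
```

The same function can be applied to both the intermediate recording data and the accompaniment data before they are mixed.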

As shown in Fig. 2, the present application also provides a device for recording audio, comprising a recording unit, a channel adjustment unit, an accompaniment data acquisition unit and a mixing unit, wherein:

the recording unit is configured to perform audio recording to obtain original recording data;

the channel adjustment unit is configured to acquire the original recording data and adjust the number of channels of the acquired original recording data to obtain intermediate recording data;

the accompaniment data acquisition unit is configured to acquire accompaniment data;

the mixing unit is configured to mix the intermediate recording data with the acquired accompaniment data to obtain target audio data.

The channel adjustment unit adjusting the number of channels of the acquired original recording data comprises:

The original recording data is mono. If mono is to be adjusted to m channels (m>1), then, frame by frame, each sample in each frame of the original recording data is traversed, and each sample of the original recording data is assigned to m consecutive samples in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with m channels is obtained; or,

The original recording data has n channels (n>1). If the n channels are to be adjusted to q*n channels (q>1), then, frame by frame, each sample group in each frame of the original recording data is traversed, where a sample group contains as many samples as there are channels and each sample corresponds to one channel; each sample group of the original recording data is assigned to q consecutive sample groups in the corresponding frame of the intermediate recording data; after every frame of the original recording data has been traversed, intermediate recording data with q*n channels is obtained; or,

In the present application, the accompaniment data has the same number of channels as the intermediate recording data.

The recording unit is further configured so that, during audio recording, if the recording device is playing the accompaniment at the same time, the accompaniment played by the recording device is removed from the recorded audio data, and the audio data with the accompaniment removed is used as the original recording data. The recording unit can remove the accompaniment played by the recording device from the recorded audio data by means of echo cancellation.

The device may further comprise a volume adjustment unit, which is configured to perform volume adjustment on the intermediate recording data after the channel adjustment unit has adjusted the number of channels of the acquired original recording data to obtain the intermediate recording data, and to perform volume adjustment on the accompaniment data after the accompaniment data acquisition unit has acquired it; volume adjustment comprises multiplying each sample in the data by a volume percentage.

Those skilled in the art should understand that the modules or steps of the present invention described above may be implemented with general-purpose computing devices. They may be concentrated on a single computing device or distributed over a network formed by multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases the steps shown or described may be performed in an order different from the one given here, or they may each be made into separate integrated circuit modules, or several of the modules or steps may be made into a single integrated circuit module. The present invention is therefore not restricted to any specific combination of hardware and software.

The above describes only the preferred embodiments of the present invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Although the present invention has been described in detail above, it is not limited thereto, and those skilled in the art may make various modifications according to the principles of the present invention. All modifications made according to the principles of the present invention should therefore be understood as falling within its scope of protection.

Claims

1. A method for recording audio, characterized by comprising:

performing audio recording to obtain original recording data;

2. The method of claim 1, characterized in that adjusting the number of channels of the acquired original recording data comprises:

3. The method of claim 1, characterized in that adjusting the number of channels of the acquired original recording data comprises:

4. The method of claim 1, characterized in that adjusting the number of channels of the acquired original recording data comprises:

5. The method of any one of claims 1 to 4, characterized in that the accompaniment data has the same number of channels as the intermediate recording data.

6. The method of any one of claims 1 to 4, characterized in that:

during audio recording, if the recording device is playing the accompaniment at the same time, the accompaniment played by the recording device is removed from the recorded audio data, and the audio data with the accompaniment removed is used as the original recording data.

7. The method of any one of claims 1 to 4, characterized in that the method further comprises:

8. A device for recording audio, characterized by comprising: a recording unit, a channel adjustment unit, an accompaniment data acquisition unit and a mixing unit, wherein:

9. The device of claim 8, characterized in that the channel adjustment unit adjusting the number of channels of the acquired original recording data comprises:

10. The device of claim 9, characterized in that the accompaniment data has the same number of channels as the intermediate recording data.