Summary of the invention
The present invention was developed to address the inconvenience and limited functionality of existing ringtone conversion tools. One object of the invention is to provide a method for editing and converting audio by selecting a portion of the audio waveform. The method is easy to use and supports a range of editing functions.
Another object of the invention is to provide such a method that edits and converts audio data according to the user's needs, accommodates varied personal preferences, and is simple to operate.
The basic idea of the invention is as follows. Audio processing generally comprises audio data acquisition, data processing, and media format conversion. The invention provides an oscilloscope interface in which data acquisition and processing are carried out. The oscilloscope interface is a window-class object, which lets the user acquire and process the audio data according to the content the user needs.
On this basis, the invention is implemented as follows:
A method for editing and converting audio by selecting a portion of the audio waveform, in which a dedicated oscilloscope display interface shows the audio data and the audio data is selected and cut as needed. The specific steps are:
1. Start the oscilloscope and show it in the display interface;
2. Acquire the audio data and display it in the oscilloscope interface as waveform data;
3. Cut the audio data by selecting the target audio data segment as needed;
4. Process and convert the audio data: perform the required operations on the target audio data segment and output it in the required format.
The audio data is acquired by reading audio information from storage or from memory. For a stored file the data comprises the entire file content; supported audio file formats include wav, mp3, wma, and rm. Voice data can also be captured by recording. The oscilloscope is a window-class object that encapsulates drawing the grid lines, the scale, and the waveform data, selecting a segment of the waveform data, recording, playing audio, and creating and saving audio files. Its main role is to display the waveform data of the audio together with a time axis so that the user can select, cut, and play the data.
The waveform data is described by the RIFF and WAVEFORMATEX structures: RIFF holds the "data" marker and the length of the audio data, and WAVEFORMATEX holds the format parameters of the audio. What follows is the audio data area. The first address of this area is saved in a pointer; it is the start address of the audio data, from which the data of the whole wave file can be read.
To cut the audio data, carets are inserted into the waveform to delimit the audio data segment to be edited. Carets are placed in pairs, so there are at least two.
The target audio data segment is selected as follows. The start and end points of the selected waveform are expressed as fractions of the width of the rectangular area (in which the waveform data is displayed). Multiplying each fraction by the length of the waveform data gives the offset of the data; with these offsets and the first address of the waveform data, the selected waveform data can be obtained.
To confirm that the intended waveform data has been selected, the selected waveform can be shown in inverted colors. The inversion is performed in the OnMouseMove function: the x coordinates of the start and end points of the selected waveform are passed in, and the selected part of the waveform is drawn inverted.
Processing the audio data comprises computation and storage. The selected audio data segment is processed first; memory is then reallocated, and the processed audio data is stored in the newly added memory.
To convert the audio data, the stored audio data is first decoded into a wave file; then, following the 3GPP standard, an AMR encoder encodes the wave file into an AMR file, which is saved.
After processing, the audio data can also be played back for audition to check the processed audio data segment. If the result is unsatisfactory, the audio data can be cut and processed again.
The method also supports selecting segments of the audio data, recording, playing audio, and creating and saving audio files. The media format conversion performs AMR encoding according to the 3GPP standard; the resulting file is an AMR file, a format commonly used for mobile phone ringtones.
Mobile phone ringtones come in several formats, of which AMR is one. Its main advantage is a very high compression ratio, which can reach 100 to 1. Since the memory of current mobile phones is fairly small, ringtones in AMR format allow more content to be stored in the same space. In other words, for the same playing time an AMR file occupies less storage than a file in another media format, so AMR ringtones save a great deal of the phone's storage.
The principal feature of the invention is that it is implemented through an oscilloscope object, whose functions include displaying audio files, recording, and audition. Specific requirements of the method are:
(1) Audio data:
The range of audio data sources should be as wide as possible. The object must recognize three kinds of data: file data, memory-block data, and voice data. File data is obtained by passing in the full path of the file and reading the entire file content.
Memory-block data is a block of memory holding audio data that is passed in directly. Voice data is obtained by sampling the sound card in real time during recording.
(2) Reading file data:
The media type is first determined from the file extension. When a file cannot be identified this way, for example when it has no extension or its extension does not match its actual media type, the audio file header is examined instead.
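The header-based recognition described above can be sketched in portable C as a check of the leading bytes of the file. This is only an illustration under stated assumptions: the text does not say which signatures the invention tests, and the names media_kind and sniff_media are hypothetical. The signatures used are the commonly documented ones for each container: "RIFF"..."WAVE" for wav, "ID3" or an MPEG frame sync for mp3, the first bytes of the ASF header GUID for wma, ".RMF" for rm, and "#!AMR\n" for amr.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; not part of the original text. */
typedef enum {
    MEDIA_UNKNOWN, MEDIA_WAV, MEDIA_MP3, MEDIA_WMA, MEDIA_RM, MEDIA_AMR
} media_kind;

/* Guess the media type from the first bytes of the file, ignoring the extension. */
media_kind sniff_media(const unsigned char *buf, size_t len)
{
    /* First four bytes of the ASF header GUID used by wma files. */
    static const unsigned char asf_guid[4] = { 0x30, 0x26, 0xB2, 0x75 };

    assert(buf != NULL || len == 0);
    if (len >= 12 && memcmp(buf, "RIFF", 4) == 0 && memcmp(buf + 8, "WAVE", 4) == 0)
        return MEDIA_WAV;
    if (len >= 6 && memcmp(buf, "#!AMR\n", 6) == 0)
        return MEDIA_AMR;
    if (len >= 4 && memcmp(buf, ".RMF", 4) == 0)
        return MEDIA_RM;
    if (len >= 4 && memcmp(buf, asf_guid, 4) == 0)
        return MEDIA_WMA;
    if (len >= 3 && memcmp(buf, "ID3", 3) == 0)
        return MEDIA_MP3;               /* mp3 with an ID3v2 tag */
    if (len >= 2 && buf[0] == 0xFF && (buf[1] & 0xE0) == 0xE0)
        return MEDIA_MP3;               /* bare MPEG audio frame sync */
    return MEDIA_UNKNOWN;
}
```

A caller would read the first dozen bytes of the file and fall back to this check only when the extension is missing or contradicts the content.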
(3) Selecting audio data:
When a selection is made on the waveform of the audio data, a thin caret is displayed; the caret is created with the CreateCaret function. Two carets delimit one selected segment of audio data.
(4) Audio processing:
The waveInOpen function is called to open the waveform audio device. Once the device is open, a WAVEHDR data structure, which is the header of a waveform audio buffer, is initialized and a waveform buffer is allocated for it. The waveInPrepareHeader function then passes the WAVEHDR structure to the waveform audio device so that the device prepares the audio buffer.
When the message code for incoming audio data is received, the sampled voice data is saved, and waveInAddBuffer is called again to give the waveform audio device another WAVEHDR structure. While saving the voice data, the realloc function enlarges the block that holds the earlier voice data, and the newly arrived voice data is copied into the added memory.
By presenting the audio file accurately as a waveform, the invention makes it easy for the user to select and cut audio. Audio can be cut and edited freely as the user requires, so the audio information is used efficiently, unnecessary information is not stored, and audio files become easier to edit and better suited to the user.
The invention also supports recording a live voice, which can then be edited. Audio file decoding is efficient, and AMR encoding gives a high compression ratio, so the generated ringtone file is small and well suited to uploading to a mobile phone. Because the audio is filtered before encoding, the generated ringtone has good sound quality and low distortion. The audio data can be edited and converted according to the user's needs, and the method is simple to operate.
Embodiment
The specific implementation of the invention is described below with reference to Fig. 1 and Fig. 2:
The invention is implemented mainly through an oscilloscope object, which provides display of audio files, recording, and audition. Specifically:
(1) Data source:
The range of data sources should be as wide as possible. The object must recognize three kinds of data: file data, memory-block data, and voice data. File data is obtained by passing in the full path of the file and reading the entire file content.
Memory-block data is a block of memory holding audio data that is passed in directly. Voice data is obtained by sampling the sound card in real time during recording.
(2) Reading file data:
The media type is first determined from the file extension. When a file cannot be identified this way, for example when it has no extension or its extension does not match its actual media type, the audio file header is examined instead.
Specifically, the wave file header contains two important data structures: RIFF and WAVEFORMATEX.
The RIFF data structure is defined as follows:
// RIFF file head, 8 bytes
typedef struct RIFF {
    char  ID[4];
    DWORD Size;
} riff;
Field descriptions:
ID stores the 4-byte "RIFF" marker, and Size stores the length of the audio file. The WAVEFORMATEX data structure is already defined in the system header mmsystem.h, so it need not be defined again. Its definition in mmsystem.h is as follows:
typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX;
Field descriptions:
wFormatTag: the waveform audio format type; in this invention it is set to WAVE_FORMAT_PCM.
nChannels: the number of channels; 1 is mono, 2 is stereo.
nSamplesPerSec: the sample rate, in samples per second.
nAvgBytesPerSec: the average data rate, in bytes per second.
nBlockAlign: the size of one sample block in bytes; it can be 1, 2, or 4.
wBitsPerSample: the sample precision in bits; it can be 8 or 16.
cbSize: ignored in this invention.
The data in the wave file header, in address order, is:
The 4-byte "RIFF" marker.
The 4-byte length of the file data.
The 4-byte "WAVE" marker.
The 4-byte "fmt " marker.
The 4-byte length of the PCMWAVEFORMAT structure.
The 16 bytes of WAVEFORMATEX structure data.
The 4-byte "data" marker.
The 4-byte length of the data area.
After these come the binary values of the audio data.
The file header is read and verified by comparing the markers above one by one, in order. Each read is the size of the RIFF structure, and the four-byte marker it contains is compared against the expected value; the markers are "RIFF", "WAVE", "fmt ", and "data". The values read are stored in the RIFF and WAVEFORMATEX structures: RIFF holds the "data" marker and the length of the audio data, and WAVEFORMATEX holds the format parameters of the audio. What follows is the audio data area, whose first address is saved as the start address of the audio data. At this point all the data of the wave file has been read, and the next step is to draw the waveform graph.
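The header check described above can be sketched in portable C. The sketch assumes the fixed 44-byte layout listed in the text (no extra chunks between "fmt " and "data", which real files may contain) and reads the fields as little-endian bytes rather than through the Windows types. The names wav_info, rd16, rd32, and parse_wav_header are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder for the parsed header fields. */
typedef struct {
    uint16_t format_tag, channels;
    uint32_t samples_per_sec, avg_bytes_per_sec;
    uint16_t block_align, bits_per_sample;
    uint32_t data_len;     /* length stored after the "data" marker */
    size_t   data_offset;  /* where the audio samples begin */
} wav_info;

static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if any marker does not match the fixed layout. */
int parse_wav_header(const unsigned char *buf, size_t len, wav_info *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 44)
        return -1;
    if (memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0 ||
        memcmp(buf + 12, "fmt ", 4) != 0 || rd32(buf + 16) != 16 ||
        memcmp(buf + 36, "data", 4) != 0)
        return -1;
    out->format_tag        = rd16(buf + 20);
    out->channels          = rd16(buf + 22);
    out->samples_per_sec   = rd32(buf + 24);
    out->avg_bytes_per_sec = rd32(buf + 28);
    out->block_align       = rd16(buf + 32);
    out->bits_per_sample   = rd16(buf + 34);
    out->data_len          = rd32(buf + 40);
    out->data_offset       = 44;   /* first address of the audio data area */
    return 0;
}
```

After a successful call, buf + data_offset plays the role of the saved first address of the audio data.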
(3) Oscilloscope
Here the oscilloscope is not an oscilloscope in the conventional sense but a control shown in the display interface. After the CPU reads the audio data, the control displays it as a waveform so that it is easy to inspect and edit. It serves to draw the audio waveform graph (the waveform graph for short) and to select a segment of the audio waveform data (the waveform data for short). Drawing the waveform graph has three stages: the grid of the rectangular area, the scale, and the waveform of the audio data. A memory DC is used while drawing: all graphics are rendered into the memory DC, and when the view is updated the BitBlt function copies the contents of the memory DC to the display DC, making the graph visible.
1) Drawing the grid
The grid uses dotted lines. Because the line styles provided by Windows do not meet the requirement, the grid lines are drawn by a custom method: the dotted lines are drawn pixel by pixel, setting one pixel every n pixels along the x axis and the y axis. The interval parameter n controls the density of the dots. In the invention the rectangular area is divided into 20 equal parts horizontally and 10 equal parts vertically, and n is set to 2.
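The pixel-by-pixel grid can be illustrated without any Windows drawing calls by writing into a plain byte buffer that stands in for the memory DC. This is a sketch under that assumption; the function name draw_dotted_grid and the one-byte-per-pixel buffer are hypothetical, and a real implementation would set pixels in the device context instead.

```c
#include <assert.h>

/* Mark one pixel every n pixels along each grid line.
   pix is a w*h buffer (one byte per pixel), a stand-in for the memory DC. */
void draw_dotted_grid(unsigned char *pix, int w, int h, int cols, int rows, int n)
{
    assert(pix && w > 1 && h > 1 && cols > 0 && rows > 0 && n > 0);
    for (int i = 0; i <= cols; i++) {            /* vertical grid lines */
        int x = i * (w - 1) / cols;
        for (int y = 0; y < h; y += n)
            pix[y * w + x] = 1;
    }
    for (int j = 0; j <= rows; j++) {            /* horizontal grid lines */
        int y = j * (h - 1) / rows;
        for (int x = 0; x < w; x += n)
            pix[y * w + x] = 1;
    }
}
```

With the values given in the text, the call would be draw_dotted_grid(pix, w, h, 20, 10, 2).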
2) Drawing the scale
Ten scale divisions are marked along the rectangular area, and each division is also given a half-division mark in the same way. The scale marks can be generated with the LineTo and Rectangle functions; put simply, the vertical marks are drawn first, followed by the baseline of the scale. After the marks are drawn, the labels are added. The label format is 00:00:00, meaning hours:minutes:seconds. Because the label area is limited in size, only positions 0, 2, 4, 6, 8, and 10 are labeled. The default labels are 00:00:00, 00:00:02, 00:00:04, 00:00:06, 00:00:08, and 00:00:10. When media data is loaded, the labels depend on the maximum playing time of the media. To obtain the maximum playing time, the values in the file header are used: the Size field of the RIFF structure is the length of the media data, and the nAvgBytesPerSec field of the WAVEFORMATEX structure is the average number of bytes per second, so the playing time of the media is Size divided by nAvgBytesPerSec. The largest value on the scale must be greater than or equal to the maximum playing time. The rule is: if the playing time is divisible by 10, the labels increase in steps of one tenth of the playing time; if it is not, they increase in steps of one tenth of the playing time plus 1.
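The label rule above can be sketched as two small C functions. The names tick_step_seconds and format_tick are hypothetical. One assumption goes beyond the text: the playing time is rounded up to whole seconds before the rule is applied, so that the largest label is never smaller than the playing time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Step between adjacent scale positions, in seconds.
   data_size is the RIFF Size value, avg_bytes_per_sec is nAvgBytesPerSec. */
unsigned long tick_step_seconds(unsigned long data_size, unsigned long avg_bytes_per_sec)
{
    assert(avg_bytes_per_sec > 0);
    /* Assumption: round the playing time up to whole seconds. */
    unsigned long duration = (data_size + avg_bytes_per_sec - 1) / avg_bytes_per_sec;
    unsigned long step = duration / 10;
    if (duration % 10 != 0)
        step += 1;          /* not divisible by 10: one tenth plus 1 */
    return step;
}

/* Format a label as hh:mm:ss. */
void format_tick(unsigned long seconds, char *out, size_t cap)
{
    snprintf(out, cap, "%02lu:%02lu:%02lu",
             seconds / 3600, (seconds / 60) % 60, seconds % 60);
}
```

For example, 275625 bytes at 11025 bytes per second is 25 seconds, which is not divisible by 10, so the step is 3 seconds and the last label is 30 seconds.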
3) Forming the waveform data
When the media data source is read, the first address of the waveform data area is saved in a user-defined pointer. The data length is the value of the Size field in the RIFF structure; for stereo, this value must be divided by 2. The starting index of each channel's data increases by one: the first channel starts at waveform data element 0, the second at element 1, and so on, treating the waveform data as an array. The data interval is set to 40, so one waveform value is produced every 40 data points. The x coordinate is the width of the rectangular area multiplied by the position of the current data point as a fraction of the original waveform data length, rounded to an integer. Several waveform values can therefore land on the same x coordinate, and the waveform graph is formed by the superimposed points. The y coordinate is derived from the binary value of the waveform data at that point. For mono the algorithm is (((BYTE*)m_data - 128) * h) / 128, where h is one half of the rectangle height. For stereo it is (((BYTE*)m_data) * h) / (65535/2), where h is one quarter of the rectangle height.
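The mono mapping above can be sketched in portable C. The sketch assumes 8-bit unsigned mono samples and reads the mono formula as (sample - 128) * h / 128 with h equal to half the rectangle height; the names wave_point and build_mono_points are hypothetical, and the y value is an offset from the vertical center of the rectangle.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y; } wave_point;   /* hypothetical point type */

/* Produce one point for every 40th sample of an 8-bit mono buffer.
   Returns the number of points written to out (at most cap). */
size_t build_mono_points(const unsigned char *data, size_t len,
                         int rect_w, int rect_h,
                         wave_point *out, size_t cap)
{
    assert(data && out && rect_w > 0 && rect_h > 0);
    int h = rect_h / 2;                     /* half the rectangle height */
    size_t n = 0;
    for (size_t i = 0; i < len && n < cap; i += 40) {
        /* x: rectangle width times the fraction of the data already passed */
        out[n].x = (int)((unsigned long long)rect_w * i / len);
        /* y: sample centered on 128, scaled to the half height */
        out[n].y = ((int)data[i] - 128) * h / 128;
        n++;
    }
    return n;
}
```

Because x is truncated to an integer, neighboring samples of a long recording map to the same column, which is the superimposition the text describes.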
4) Selecting a segment of the waveform data
A segment of waveform data is normally selected with the left mouse button, and the selected data is shown inverted on the waveform graph. The implementation is as follows: when the mouse is clicked on the waveform graph, a thin caret is displayed at the click position; the caret is created with the CreateCaret function. The CreateCaret function prototype is:
BOOL CreateCaret(
    HWND    hWnd,
    HBITMAP hBitmap,
    int     nWidth,
    int     nHeight
);
Parameter descriptions:
hWnd: the handle of the window that owns the caret.
hBitmap: the handle of the bitmap that defines the caret shape. If this parameter is NULL, the caret is solid. In this invention it is set to NULL.
nWidth: the width of the caret. In this invention it is set to 0.
nHeight: the height of the caret. In this invention it is set to the height of the rectangular area minus the height of the scale text, or minus the height of the scale. The font used in this invention is the Song typeface at size 10, whose display height is about 12 pixels; the height of the scale is set to 25 pixels.
Inverted display of the selected waveform:
This is done in the OnMouseMove function, which responds to the mouse-move event: whenever the event is triggered, the program calls this function. Specifically, the Rectangle function is called with the x coordinates of the start and end points of the selected waveform to draw the selected part of the waveform graph inverted.
One setting is crucial here: the drawing mode of the display DC. It can be set with dc->SetROP2(R2_NOT), where dc is a pointer to the display DC. R2_NOT means that each resulting pixel is the inverse of the color already on the display.
Obtaining the selected waveform data:
The start and end points of the selected waveform are expressed as fractions of the width of the rectangular area. Multiplying each fraction by the length of the waveform data gives the offset of the data; with these offsets and the first address of the waveform data, the selected waveform data can be obtained.
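The offset calculation above can be sketched as follows. The names byte_range and selection_to_range are hypothetical. Two details are assumptions beyond the text: the caret positions are clamped to the rectangle, and the offsets are rounded down to a multiple of the sample block size so that a stereo or 16-bit sample is never split.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { size_t offset; size_t length; } byte_range;  /* hypothetical */

/* Map two caret x coordinates to a byte range inside the waveform data.
   rect_left and rect_w describe the rectangular area; block_align is the
   sample block size in bytes (1 for 8-bit mono). */
byte_range selection_to_range(int x_start, int x_end, int rect_left, int rect_w,
                              size_t data_len, size_t block_align)
{
    assert(rect_w > 0 && block_align > 0);
    if (x_start > x_end) {                 /* allow right-to-left selection */
        int t = x_start; x_start = x_end; x_end = t;
    }
    int a = x_start - rect_left, b = x_end - rect_left;
    if (a < 0) a = 0;
    if (b < 0) b = 0;
    if (a > rect_w) a = rect_w;
    if (b > rect_w) b = rect_w;
    /* fraction of the rectangle width times the data length */
    size_t first = (size_t)((unsigned long long)data_len * (unsigned)a / (unsigned)rect_w);
    size_t last  = (size_t)((unsigned long long)data_len * (unsigned)b / (unsigned)rect_w);
    first -= first % block_align;
    last  -= last % block_align;
    byte_range r = { first, last - first };
    return r;
}
```

The selected data then starts at the first address of the waveform data plus r.offset and runs for r.length bytes.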
The processing of an audio signal recorded by the user is shown in Fig. 2.
3. Recording
1) Choice of audio parameters:
To save disk space, a sample rate of 11.025 kHz is used, with 8-bit sample precision and a single (mono) channel.
These parameters are set in the WAVEFORMATEX structure as follows: the sample rate nSamplesPerSec is 11025, the sample precision wBitsPerSample is 8, and the channel count nChannels is 1, which selects mono.
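For PCM audio the remaining WAVEFORMATEX fields follow from the three chosen parameters: nBlockAlign is channels times bits divided by 8, and nAvgBytesPerSec is the sample rate times nBlockAlign. A minimal sketch, using a local stand-in structure (pcm_format, hypothetical) instead of the real mmsystem.h type so that it compiles anywhere:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in with the same field names as WAVEFORMATEX; not the real type. */
typedef struct {
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
    uint16_t wBitsPerSample;
    uint16_t cbSize;
} pcm_format;

/* Fill in a PCM format description from rate, precision, and channel count. */
pcm_format make_pcm_format(uint32_t rate, uint16_t bits, uint16_t channels)
{
    assert(rate > 0 && (bits == 8 || bits == 16) && (channels == 1 || channels == 2));
    pcm_format f;
    f.wFormatTag      = 1;                          /* numeric value of WAVE_FORMAT_PCM */
    f.nChannels       = channels;
    f.nSamplesPerSec  = rate;
    f.wBitsPerSample  = bits;
    f.nBlockAlign     = (uint16_t)(channels * bits / 8);
    f.nAvgBytesPerSec = rate * f.nBlockAlign;
    f.cbSize          = 0;                          /* ignored for PCM */
    return f;
}
```

With the recording parameters in the text, make_pcm_format(11025, 8, 1) gives a block size of 1 byte and 11025 bytes per second.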
2) Choice of device
The audio device is the system's waveform audio input device, which is installed automatically when the sound card is installed. The recording device can be a microphone or any other sound input device.
3) Choice of recording functions
The low-level waveform functions are used, which give convenient access to the audio hardware. The key functions used during recording are waveInOpen, waveInPrepareHeader, waveInAddBuffer, waveInStart, waveInUnprepareHeader, waveInReset, and waveInClose. The messages these functions trigger are handled in the WindowProc function of the oscilloscope object, taking full advantage of the Windows window-message mechanism.
4) Audio is selected in the same way as described above, so the description is not repeated here.
5) The complete recording process
First the WAVEFORMATEX parameters are set: sample rate 11025, sample precision 8 bits, mono. The waveform audio device is then opened with the waveInOpen function, which is defined as follows:
MMRESULT waveInOpen(
    LPHWAVEIN      phwi,
    UINT_PTR       uDeviceID,
    LPWAVEFORMATEX pwfx,
    DWORD_PTR      dwCallback,
    DWORD_PTR      dwCallbackInstance,
    DWORD          fdwOpen
);
Parameter descriptions:
phwi: a pointer to memory that receives the handle identifying the waveform audio input device.
uDeviceID: the identifier of the waveform audio input device. It is set to WAVE_MAPPER.
pwfx: a pointer to the WAVEFORMATEX data structure.
dwCallback: the callback. It is set to the function pointer of WindowProc.
dwCallbackInstance: the instance data passed to the callback. It is set to this, and can also be NULL.
fdwOpen: the flag for opening the device. CALLBACK_WINDOW is used in this invention, so that the window callback function responds to the messages triggered by the waveform audio device.
After waveInOpen has successfully opened the waveform audio device, a WAVEHDR data structure is initialized and a waveform buffer is allocated for it. WAVEHDR is the header of a waveform audio buffer. The waveInPrepareHeader function then passes the WAVEHDR structure to the waveform audio device so that the device prepares the audio buffer. Double buffering is used here: two WAVEHDR structures are defined, initialized, and each passed to the device by calling waveInPrepareHeader. After waveInOpen is called, the MM_WIM_OPEN message arrives in the WindowProc function; on receiving it, waveInAddBuffer adds the WAVEHDR structures to the waveform audio device to receive audio data. waveInStart is then called to begin sampling. After waveInStart is called, MM_WIM_DATA messages arrive in WindowProc, each indicating that sampled voice data is available. The sampled voice data is saved, and waveInAddBuffer is called again to return a WAVEHDR structure to the device so that the next batch of sampled voice data can be stored in it. To save the voice data the invention uses dynamic memory growth: the realloc function enlarges the block that holds the earlier voice data, and the newly arrived voice data is copied into the added memory. In this way the sampled voice data is preserved in full. The WAVEHDR data structure is defined as follows:
typedef struct wavehdr_tag {
    LPSTR     lpData;
    DWORD     dwBufferLength;
    DWORD     dwBytesRecorded;
    DWORD_PTR dwUser;
    DWORD     dwFlags;
    DWORD     dwLoops;
    struct wavehdr_tag *lpNext;
    DWORD_PTR reserved;
} WAVEHDR;
In this invention the fields of this structure are set as follows:
m_pWaveHdr1->lpData          = (LPSTR)m_pBuffer1;
m_pWaveHdr1->dwBufferLength  = INP_BUFFER_SIZE;
m_pWaveHdr1->dwBytesRecorded = 0;
m_pWaveHdr1->dwUser          = 0;
m_pWaveHdr1->dwFlags         = 0;
m_pWaveHdr1->dwLoops         = 1;
m_pWaveHdr1->lpNext          = NULL;
m_pWaveHdr1->reserved        = 0;
INP_BUFFER_SIZE is 16K, that is, the integer 16*1024, and m_pBuffer1 is a memory block of 16K bytes. Recording is stopped by calling the waveInReset function. After waveInReset is called, the MM_WIM_CLOSE message arrives in the WindowProc function; on receiving it, waveInUnprepareHeader is called to tell the waveform audio device that no more WAVEHDR structures will be added.
The memory used for voice sampling is then released, and waveInClose is called to close the waveform audio device.
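The dynamic memory growth used while saving the voice data can be sketched independently of the audio device. In the sketch, voice_store and voice_store_append are hypothetical names; each call stands for what the MM_WIM_DATA handler would do with one filled buffer (lpData and dwBytesRecorded of the returned WAVEHDR).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for all voice data sampled so far. */
typedef struct {
    unsigned char *data;
    size_t len;
} voice_store;

/* Grow the block with realloc and copy the new chunk into the added memory.
   Returns 0 on success, -1 if memory could not be allocated. */
int voice_store_append(voice_store *s, const void *chunk, size_t n)
{
    assert(s != NULL && (chunk != NULL || n == 0));
    if (n == 0)
        return 0;
    unsigned char *grown = realloc(s->data, s->len + n);
    if (grown == NULL)
        return -1;                  /* the old block is still valid */
    memcpy(grown + s->len, chunk, n);
    s->data = grown;
    s->len += n;
    return 0;
}
```

Assigning the realloc result to a temporary first keeps the earlier data reachable if the allocation fails. The caller releases s->data with free when recording ends.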
The audio data of the audio file shown in Fig. 1 is processed in the same way as described above.
A dynamic waveform graph of the audio data can also be displayed during recording.
6) Dynamic waveform display during recording
Display method: the waveform graph of all voice data sampled so far is drawn as described above, in a separate thread. This thread is created before voice data arrives, or when recording begins, and waits on a user-defined Windows event object. The event object is signaled in the MM_WIM_DATA handler in the WindowProc function: when new voice data arrives, it is first saved after the earlier voice data, and then the event object is signaled with the SetEvent function. Signaling means putting the object into the signaled state. As soon as the event object is signaled, the thread waiting on it is woken by the operating system. The thread then draws the waveform data into the memory DC and, when the waveform graph is complete, uses the BitBlt function to output the contents of the memory DC to the display. The thread then waits on the event object again. The event object is an auto-reset event: once it has been signaled, it returns to the nonsignaled state automatically. So while the thread waits on the event object it is suspended by the operating system, and it is woken again when the event object is next signaled.
7) Playback
After processing, the audio data can also be played to judge whether the selected audio data meets the requirement. There are two kinds of playback source: media files and recorded voice. They differ mainly in the header parameters, that is, in the values of the RIFF and WAVEFORMATEX data structures. Both structures are available for both kinds of source. For a media file they are obtained by parsing the file. For recorded voice, the WAVEFORMATEX structure is the one set when recording began, and the Size field of the RIFF structure is updated during sampling, either when the voice data parameters are set while the waveform graph is drawn dynamically, or as the audio data buffer grows. Playback also uses the low-level waveform functions: waveOutOpen, waveOutPrepareHeader, waveOutWrite, waveOutUnprepareHeader, and waveOutClose. Playback is implemented as follows:
First the WAVEFORMATEX data structure is initialized and waveOutOpen is called to open the audio output device. Its parameters are set in the same way as those of waveInOpen above, and the messages triggered by the audio device are likewise handled in the WindowProc function. After waveOutOpen is called, the MM_WOM_OPEN message arrives in WindowProc, and the code that actually outputs the audio data to the waveform audio output device is placed in the handler for this message. The WAVEHDR data structure is filled in first, with the first address of the audio data block and the length of the block. waveOutPrepareHeader is called to pass the WAVEHDR structure to the waveform audio output device and prepare it for output. waveOutWrite is then called to output the audio data described by the WAVEHDR structure to the device, and the sound card produces sound. This is the whole playback process. After waveOutWrite is called, a timer can be created to update the playing progress. The caret position represents the current progress, because the scale label below the caret gives the playing time at that moment. Each time the timer fires, waveOutGetPosition is called to obtain the data offset of the audio being played. From this offset the position of the current point is computed as a fraction of the audio data segment to be played; multiplying this fraction by the width of the rectangular area and adding the caret position at the start of playback gives the current playing position. The caret is then moved to the computed position. The caret is thus seen to move continuously, showing the user that the playing progress is updating throughout playback.
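The caret update performed on each timer tick reduces to one proportion. The sketch below is an illustration only; caret_x_for_progress is a hypothetical name, and it assumes that span_bytes is the number of bytes represented by the full width of the rectangular area, which is one reading of the calculation described above. The played byte count is what waveOutGetPosition would report.

```c
#include <assert.h>

/* Caret x coordinate for the current playback position.
   start_x: caret position when playback began
   rect_w: width of the rectangular area in pixels
   played_bytes: bytes played so far
   span_bytes: bytes represented by the full rectangle width (assumption) */
int caret_x_for_progress(int start_x, int rect_w,
                         unsigned long played_bytes, unsigned long span_bytes)
{
    assert(rect_w > 0 && span_bytes > 0);
    if (played_bytes > span_bytes)
        played_bytes = span_bytes;      /* never move past one full width */
    return start_x + (int)((unsigned long long)rect_w * played_bytes / span_bytes);
}
```

The timer handler would pass the result to the caret positioning call so that the caret tracks the playing time shown on the scale.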
5. Media format conversion
(1) Converting mp3 to wave
Mp3 decoding uses the efficient open-source library libmad (MPEG audio decoder library), which is widely used. The decoding status and progress can be reported during decoding. The decoded data is saved as a wave file. Decoding runs in a separate thread. Because libmad is a packaged dynamic link library, it is not described further here.
(2) Converting wave to amr
This is the mobile phone ringtone conversion referred to in this invention. The wave file is encoded into an AMR file according to the 3GPP standard. The encoder is a publicly available component found online, so it is not described further here.