CN103559876B

CN103559876B - Sound effect processing method and system

Info

Publication number: CN103559876B
Application number: CN201310554007.3A
Authority: CN
Inventors: 王影; 江源; 孙见青; 凌震华; 胡国平; 胡郁; 刘庆峰
Original assignee: iFlytek Co Ltd
Current assignee: Anhui Toycloud Technology Co Ltd
Priority date: 2013-11-07
Filing date: 2013-11-07
Publication date: 2016-04-20
Anticipated expiration: 2033-11-07
Also published as: CN103559876A

Abstract

The invention discloses a sound effect processing method and system. The method comprises: collecting an original sound signal input by a user; determining a sound effect optimization target; determining the sound effect type to which the sound effect optimization target belongs; optimizing the original sound signal according to that sound effect type to obtain an optimized voice signal; and normalizing the optimized voice signal to obtain a normalized voice signal. According to the invention, different sound effect processing schemes can be applied to different ambient sound effect requirements. Combined with normalization and mixing, this yields a beautified singing voice and realizes the entertainment function of karaoke, so that the user's actual needs are met and the user is given the best listening experience.

Description

Sound effect processing method and system

Technical field

The invention belongs to the field of signal processing, and in particular relates to a sound effect processing method and system.

Background art

With the continuous improvement in quality of life, entertainment has become an indispensable part of daily life. Everyone feels the need to sing and the urge to belt out a song. Karaoke is a pastime for all ages, and karaoke halls have spread through the streets and lanes of every city, but singing at a karaoke hall is expensive and tied to a venue, so people cannot sing whenever they wish. With the rapid development of the internet, more and more users sing karaoke online, which is far more convenient, and applications on terminal devices in particular have greatly improved the user experience. Many multimedia mobile terminals on the market support a karaoke function, but these products rarely meet users' requirements. Existing karaoke multimedia mobile terminals can only crudely remove the vocals from music in formats such as MP3 or MP4 and cannot apply professional reverberation, which seriously degrades the karaoke effect. Nor can they beautify the user's singing to achieve different listening effects and so improve the user experience.

For this reason, the industry has proposed sound effect processing systems and methods that mainly operate on songs in which voice and accompaniment are already mixed, such as signals in MP3 and similar music formats, reprocessing them to generate music in a new style so that it sounds different.

In general, traditional sound effect processing falls into two types. The first type reasonably and effectively corrects and compensates for the distortion that the sound source and equipment introduce into the audio signal during conversion, transmission, amplification, and playback, so that the music comes closer to the effect the musical work itself intends; it can be called restorative sound effect processing. The second type builds on the original music and applies processing such as spatial surround, sound field widening, and dynamic enhancement to make the sound richer and more varied; it can be called decorative sound effect processing.

Specifically, in the second type of method, the system first obtains the current environment information, such as hall, square, concert, or valley, and then uses methods such as equalization, compression, and reverberation to apply special effects such as tempo change, male/female voice conversion, background music mixing, and volume adjustment, giving the music a brand-new style. When producing music for different environments, this second type of method works mainly by adjusting the system parameters of the equalization, compression, and reverberation signal processing modules according to the parameters of the application environment. However, music styles in different application environments often differ considerably, and simple parameter adjustment alone cannot meet application requirements.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art by providing a sound effect processing method and system that produce a beautified singing voice, realize the entertainment function of karaoke, and meet the user's actual needs.

To achieve the above object, the technical solution of the invention is:

A sound effect processing method, comprising:

Collecting an original sound signal input by a user;

Determining a sound effect optimization target;

Determining the sound effect type to which the sound effect optimization target belongs;

Optimizing the original sound signal according to the sound effect type to obtain an optimized voice signal;

Normalizing the optimized voice signal to obtain a normalized voice signal.

Preferably, determining the sound effect optimization target comprises:

Providing sound effect scene options to the user and determining the sound effect optimization target according to the user's selection; or

Determining the sound effect optimization target according to the background noise of the current environment.

Preferably, the sound effect type to which the sound effect optimization target belongs is the muffled-voice sound effect subclass, and optimizing the original sound signal according to the sound effect type comprises:

Applying targeted equalization, compression, and reverberation to the original sound signal in sequence.

Preferably, the sound effect type to which the sound effect optimization target belongs is the clear-voice sound effect subclass, and optimizing the original sound signal according to the sound effect type comprises:

On a first track, applying targeted equalization and then compression to the original sound signal;

On a second track, applying reverberation to the original sound signal;

Superimposing the voice signals produced by the first track and the second track.

Preferably, normalizing the optimized voice signal comprises:

Calculating a global normalization factor from the mean energy of the optimized voice signal;

Obtaining the energy of the sampling points in each signal unit of the optimized voice signal;

Calculating a local normalization factor for each signal unit from the global normalization factor and the energy of the sampling points in that signal unit;

For each signal unit, locally normalizing each sampling point in the signal unit according to the local normalization factor of that signal unit.

Alternatively, for the first signal unit, locally normalizing each sampling point in the first signal unit according to the local normalization factor of the first signal unit;

For each subsequent signal unit, locally normalizing each sampling point in the signal unit according to the local normalization factor of that signal unit and the local normalization factor of the preceding signal unit.

Preferably, the method further comprises mixing the normalized voice signal with an accompaniment signal, the mixing comprising:

Calculating an actual accompaniment-to-voice energy ratio from the energy of the accompaniment signal and the energy of the normalized voice signal;

Judging whether the actual accompaniment-to-voice energy ratio equals a preset accompaniment-to-voice energy ratio;

If so, superimposing the normalized voice signal on the accompaniment signal;

If not, adjusting the normalized voice signal or the accompaniment signal according to the preset accompaniment-to-voice energy ratio until the actual accompaniment-to-voice energy ratio equals the preset accompaniment-to-voice energy ratio;

Superimposing the adjusted normalized voice signal on the accompaniment signal, or superimposing the normalized voice signal on the adjusted accompaniment signal.

A sound effect processing system, comprising:

An original sound signal collection unit, for collecting an original sound signal input by a user;

A sound effect optimization target determining unit, for determining a sound effect optimization target;

A sound effect type determining unit, for determining the sound effect type to which the sound effect optimization target belongs;

An optimization processing unit, for optimizing the original sound signal according to the sound effect type to obtain an optimized voice signal;

A normalization unit, for normalizing the optimized voice signal to obtain a normalized voice signal.

Preferably, the sound effect optimization target determining unit comprises:

A first sound effect optimization target determining unit, for providing sound effect scene options to the user and determining the sound effect optimization target according to the user's selection; or

A second sound effect optimization target determining unit, for determining the sound effect optimization target according to the background noise of the current environment.

Preferably, the sound effect type to which the sound effect optimization target belongs is the muffled-voice sound effect subclass, and the optimization processing unit comprises:

A targeted equalization unit, for applying targeted equalization to the original sound signal to obtain an equalized voice signal;

A compression unit, for compressing the equalized voice signal to obtain a compressed voice signal;

A reverberation unit, for applying reverberation to the compressed voice signal.

Preferably, the sound effect type to which the sound effect optimization target belongs is the clear-voice sound effect subclass, and the optimization processing unit comprises:

A first track optimization unit, for applying targeted equalization and then compression to the original sound signal on a first track;

A second track optimization unit, for applying reverberation to the original sound signal on a second track;

A superposition unit, for superimposing the voice signals produced by the first track and the second track.

Preferably, the normalization unit comprises:

A global normalization factor calculating unit, for calculating a global normalization factor from the mean energy of the optimized voice signal;

An energy obtaining unit, for obtaining the energy of the sampling points in each signal unit of the optimized voice signal;

A local normalization factor calculating unit, for calculating a local normalization factor for each signal unit from the global normalization factor and the energy of the sampling points in that signal unit;

A first normalization unit, for locally normalizing, for each signal unit, each sampling point in the signal unit according to the local normalization factor of that signal unit.

Preferably, the normalization unit comprises:

A local normalization factor calculating unit, for calculating a local normalization factor for each signal unit from the global normalization factor and the energy of the sampling points in that signal unit;

A second normalization unit, for locally normalizing each sampling point in the first signal unit according to the local normalization factor of the first signal unit and, for each subsequent signal unit, locally normalizing each sampling point in the signal unit according to the local normalization factor of that signal unit and the local normalization factor of the preceding signal unit.

Preferably, the system further comprises a mixing unit, for mixing the normalized voice signal with an accompaniment signal; the mixing unit comprises:

An actual accompaniment-to-voice energy ratio calculating unit, for calculating an actual accompaniment-to-voice energy ratio from the energy of the accompaniment signal and the energy of the normalized voice signal;

A judging unit, for judging whether the actual accompaniment-to-voice energy ratio equals a preset accompaniment-to-voice energy ratio;

A first superposition unit, for superimposing the normalized voice signal on the accompaniment signal when the actual accompaniment-to-voice energy ratio equals the preset accompaniment-to-voice energy ratio;

A second superposition unit, for, when the actual accompaniment-to-voice energy ratio does not equal the preset accompaniment-to-voice energy ratio, adjusting the normalized voice signal or the accompaniment signal according to the preset accompaniment-to-voice energy ratio until the actual ratio equals the preset ratio, and then superimposing the adjusted normalized voice signal on the accompaniment signal, or superimposing the normalized voice signal on the adjusted accompaniment signal.

The beneficial effects of applying the sound effect processing method and system of the invention are:

(1) During sound effect optimization, different schemes can be applied to different ambient sound effect requirements: the serial scheme suits ambient sound effect scenes in which the voice sounds muffled, while the parallel scheme suits scenes in which the voice sounds clearer;

(2) Because several sound effects are superimposed during sound effect optimization, the processed voice tends to clip in many places, which is heard as crackling noises and seriously harms how the song sounds; the invention normalizes the voice after sound effect optimization and so effectively prevents clipping;

(3) When the normalized voice signal is mixed with the accompaniment signal, the mixing ratio of accompaniment to voice can be adjusted dynamically according to the type of song to which the accompaniment signal belongs, and the superimposed signal is normalized again to prevent clipping, so that the user is given the best listening experience.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a sound effect processing method according to an embodiment of the invention;

Fig. 2 is a flowchart of optimizing the original sound signal when the sound effect type to which the sound effect optimization target belongs is the muffled-voice sound effect subclass, according to an embodiment of the invention;

Fig. 3 is a flowchart of optimizing the original sound signal when the sound effect type to which the sound effect optimization target belongs is the clear-voice sound effect subclass, according to an embodiment of the invention;

Fig. 4 is a flowchart of one way of normalizing the optimized voice signal according to an embodiment of the invention;

Fig. 5 is a flowchart of another way of normalizing the optimized voice signal according to an embodiment of the invention;

Fig. 6 is another flowchart of the sound effect processing method according to an embodiment of the invention;

Fig. 7 is a flowchart of mixing the normalized voice signal with the accompaniment signal according to an embodiment of the invention;

Fig. 8 is a structural diagram of a sound effect processing system according to an embodiment of the invention;

Fig. 9 is a structural diagram of one form of the optimization processing unit according to an embodiment of the invention;

Fig. 10 is a structural diagram of another form of the optimization processing unit according to an embodiment of the invention;

Fig. 11 is a structural diagram of one form of the normalization unit according to an embodiment of the invention;

Fig. 12 is a structural diagram of another form of the normalization unit according to an embodiment of the invention;

Fig. 13 is another structural diagram of the sound effect processing system according to an embodiment of the invention;

Fig. 14 is a structural diagram of the mixing unit according to an embodiment of the invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.

The sound effect processing method and system of the embodiments address the defect of the second type of method described in the background: when producing music for different environments, that method realizes the different ambient music merely by adjusting system parameters and therefore cannot meet application requirements. Specifically, the system receives the user's original sound input, then determines the sound effect type to be optimized and applies the corresponding optimization in a targeted manner to obtain different ambient sound effects. Optionally, the system further normalizes and mixes the optimized signal to obtain a beautified singing voice, thereby realizing the entertainment function of karaoke.

As shown in Fig. 1, the sound effect processing method of an embodiment of the invention comprises the following steps:

Step 101, collecting the original sound signal input by the user.

Specifically, an audio capture device such as a microphone can be used to collect and store the original sound signal (the vocal signal) input by the user. On a PC, an external microphone is commonly used for this; on a terminal device, the built-in microphone is more commonly used. The original sound signal contains the user's voice and any background noise that may be present.

Step 102, determining the sound effect optimization target.

Specifically, in an embodiment of the invention, the user may on the one hand actively specify the sound effect optimization target: sound effect scene options are offered to the user, and the target is determined from the user's selection. On the other hand, the system can adaptively select the best-matching sound effect optimization target for the current environment. In this adaptive analysis, the system first obtains the background noise of the current environment, determines the likely music environment from the signal-to-noise ratio and energy of that background noise (common sound effect scenes such as a hall, a valley, or a live concert), and then takes the selected sound effect scene as the sound effect optimization target.

Step 103, determining the sound effect type to which the sound effect optimization target belongs.

Specifically, different environments have different music styles; the differences between styles vary in degree, and some styles share certain similarities. For example, in a square or a valley the voice tends to sound distant and muffled, whereas in a hall or a concert hall the voice tends to sound clear. Similar music environments can be handled with similar processing flows, so the invention determines, from the sound effect optimization target, the corresponding sound effect type; the types include the muffled-voice sound effect subclass and the clear-voice sound effect subclass.
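The mapping from a sound effect scene to its subclass can be sketched as a simple lookup. This is a minimal illustration only: the scene keys, the `EffectSubclass` enum, and `classify_target` are hypothetical names, and the four entries merely restate the examples given in the text (square and valley sound muffled, hall and concert hall sound clear).

```python
from enum import Enum


class EffectSubclass(Enum):
    MUFFLED_VOICE = "muffled"  # later handled by the serial scheme
    CLEAR_VOICE = "clear"      # later handled by the parallel scheme


# Hypothetical scene keys; the grouping follows the examples in the text.
SCENE_TO_SUBCLASS = {
    "square": EffectSubclass.MUFFLED_VOICE,
    "valley": EffectSubclass.MUFFLED_VOICE,
    "hall": EffectSubclass.CLEAR_VOICE,
    "concert_hall": EffectSubclass.CLEAR_VOICE,
}


def classify_target(scene: str) -> EffectSubclass:
    """Return the sound effect subclass for a chosen or detected scene."""
    try:
        return SCENE_TO_SUBCLASS[scene]
    except KeyError:
        raise ValueError(f"unknown sound effect scene: {scene!r}") from None
```

A table keeps the scene list easy to extend; how a real system would assign additional scenes to a subclass is not specified in the text.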

Step 104, optimizing the original sound signal according to the sound effect type to obtain the optimized voice signal.

Specifically, with reference to Fig. 2, when the sound effect type to which the sound effect optimization target belongs is the muffled-voice sound effect subclass, optimizing the original sound signal according to the sound effect type comprises applying the following to the original sound signal in sequence:

Step 201, targeted equalization. Specifically, to make the voice sound more muffled, targeted equalization is first applied to the original sound signal, adjusting the low and middle frequencies of the voice: for example, the signal below 150 Hz is boosted slightly to reduce the brightness of the voice, and the signal between 500 Hz and 2 kHz is cut moderately to add to the muffled quality.

Step 202, compression, so that the voice sounds softer yet has more strength and impact.

Step 203, reverberation. Specifically, the compressed signal is beautified to increase the richness and sense of space of the sound while further increasing its muffled quality; the beautification here is not limited to reverberation.

Sound processed by this serial optimization method has a muffled quality and an increased sense of space, which makes the voice sound more distant.
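The serial ordering of steps 201 to 203 can be sketched as a chain in which each stage consumes the previous stage's output. The three stages below are deliberately crude stand-ins and are assumptions, not the patent's filters: `low_boost` is a one-pole low-frequency boost rather than a real band equalizer, `compress` is a static hard-knee compressor, and `comb_reverb` is a single feedback comb filter. All parameter values are illustrative.

```python
from typing import Callable, List, Sequence

Signal = List[float]
Stage = Callable[[Signal], Signal]


def low_boost(signal: Signal, smoothing: float = 0.1, boost: float = 0.5) -> Signal:
    """Stand-in for targeted equalization: add back a low-passed copy."""
    out, low = [], 0.0
    for x in signal:
        low += smoothing * (x - low)  # one-pole low-pass follows the low frequencies
        out.append(x + boost * low)   # boosting the lows reduces perceived brightness
    return out


def compress(signal: Signal, threshold: float = 0.5, ratio: float = 4.0) -> Signal:
    """Static hard-knee compressor: level above the threshold is divided by ratio."""
    out = []
    for x in signal:
        mag = abs(x)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if x >= 0 else -mag)
    return out


def comb_reverb(signal: Signal, delay: int = 3, feedback: float = 0.5) -> Signal:
    """Single feedback comb filter, a very simple reverb-like building block."""
    out: Signal = []
    for n, x in enumerate(signal):
        echo = feedback * out[n - delay] if n >= delay else 0.0
        out.append(x + echo)
    return out


def serial_chain(signal: Signal, stages: Sequence[Stage]) -> Signal:
    """Feed each stage's output into the next (step 201, then 202, then 203)."""
    for stage in stages:
        signal = stage(signal)
    return signal


muffled = serial_chain([1.0, 0.0, 0.0, 0.0], [low_boost, compress, comb_reverb])
```

Because the reverberation stage sees the already equalized and compressed voice, every stage colors the final output, which is consistent with the more distant, muffled result described above.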

With reference to Fig. 3, when the sound effect type to which the sound effect optimization target belongs is the clear-voice sound effect subclass, optimizing the original sound signal according to the sound effect type follows a parallel processing pattern and comprises:

Step 301, on the first track, applying targeted equalization and then compression to the original sound signal, and saving the processed signal. Specifically, targeted equalization is first applied to the original sound signal to make the sound clearer: the midrange (500 Hz to 2 kHz) is boosted moderately to make the sound bright and transparent, and the upper midrange is boosted slightly to increase the penetration and layering of the sound. Compression is then applied to the equalized sound so that the voice sounds softer yet has more strength and impact. Finally, the compressed signal is saved.

Step 302, on the second track, applying reverberation to the original sound signal, and saving the processed signal. Specifically, the original sound signal is beautified to increase the richness and sense of space of the sound; the beautification here includes, but is not limited to, reverberation. The processed signal is then saved.

Step 303, superimposing the voice signals produced by the first track and the second track, which further increases the clarity of the music.

This parallel optimization method effectively compensates for the signal lost when the voice is processed, so the processed voice sounds clearer.
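The parallel topology of steps 301 to 303 can be sketched as two tracks that both start from the same original signal and are summed sample by sample. The stage functions `halve` and `one_sample_echo` are trivial hypothetical stand-ins for the equalization/compression track and the reverberation track; only the routing is the point of the example.

```python
from typing import Callable, List, Sequence

Signal = List[float]
Stage = Callable[[Signal], Signal]


def run_track(signal: Signal, stages: Sequence[Stage]) -> Signal:
    """Apply a track's stages in order to a private copy of the input."""
    out = list(signal)
    for stage in stages:
        out = stage(out)
    return out


def parallel_chain(signal: Signal,
                   first_track: Sequence[Stage],
                   second_track: Sequence[Stage]) -> Signal:
    """Process the same original signal on two tracks, then superimpose them."""
    track1 = run_track(signal, first_track)   # step 301: equalization, compression
    track2 = run_track(signal, second_track)  # step 302: reverberation
    if len(track1) != len(track2):
        raise ValueError("both tracks must preserve the signal length")
    return [a + b for a, b in zip(track1, track2)]  # step 303: superposition


def halve(signal: Signal) -> Signal:
    """Stand-in for the first track: a plain gain of 0.5."""
    return [0.5 * x for x in signal]


def one_sample_echo(signal: Signal) -> Signal:
    """Stand-in for the second track: the signal delayed by one sample."""
    return ([0.0] + signal)[:len(signal)]


mixed = parallel_chain([1.0, 2.0], [halve], [one_sample_echo])
```

Unlike the serial scheme, the reverberation track here never sees the equalized and compressed signal, so the first track's voice reaches the output without reverberation applied on top of it.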

It can be seen that, when performing sound effect optimization, the sound effect processing method of the embodiments can apply different schemes to different ambient sound effect requirements: the serial scheme suits ambient sound effect scenes in which the voice sounds muffled, while the parallel scheme suits scenes in which the voice sounds clearer.

Step 105, normalizing the optimized voice signal to obtain the normalized voice signal. The purpose of normalizing the optimized voice signal is to avoid the clipping that might otherwise occur.

Specifically, Fig. 4 is a flowchart of one way of normalizing the optimized voice signal in an embodiment of the invention. Normalizing the optimized voice signal comprises:

Step 401, calculating the global normalization factor from the mean energy of the optimized voice signal.

Specifically, the global normalization factor scale_global of the optimized voice signal is calculated as scale_global = α / mean, where α is any decimal between 0 and 1, preferably 0.12 in a preferred embodiment of the invention, and mean is the mean energy of the optimized voice signal.

Step 402, obtaining the energy of the sampling points in each signal unit of the optimized voice signal.

Step 403, calculating the local normalization factor of each signal unit from the global normalization factor and the energy of the sampling points in that signal unit.

Specifically, n signal units are taken from the voice signal, with m sampling points in each signal unit, where n and m are arbitrary natural numbers greater than 0.

The local normalization factor of each signal unit is calculated as scale_j = min(scale_global, β / max_j), where scale_j is the local normalization factor of the j-th signal unit; β is any decimal between 0 and 1, preferably 0.45 in a preferred embodiment of the invention; and max_j is the maximum energy among the m sampling points of the j-th signal unit, j = 1, 2, ..., n.

Step 404, for each signal unit, locally normalizing each sampling point in the signal unit according to the local normalization factor of that signal unit.

Specifically, each sampling point in each signal unit is locally normalized as data_norm(j, i) = data(j, i) * scale_j, where data_norm(j, i) is the normalized data of the i-th sampling point in the j-th signal unit and data(j, i) is the raw data of that sampling point, i = 1, 2, ..., m.
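Steps 401 to 404 can be sketched as follows. The text does not define "energy" precisely, so this sketch assumes the energy of a sampling point is its absolute amplitude and that mean is the average of those values; `normalize_blocks` and `unit_len` (the m of the text) are hypothetical names, and the defaults for α and β are the preferred values 0.12 and 0.45.

```python
from typing import List


def normalize_blocks(signal: List[float], unit_len: int,
                     alpha: float = 0.12, beta: float = 0.45) -> List[float]:
    """Per-unit normalization: data_norm(j, i) = data(j, i) * scale_j."""
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    if not signal:
        return []
    # Assumption: per-sample "energy" is the absolute amplitude.
    mean = sum(abs(x) for x in signal) / len(signal)
    if mean == 0.0:
        return list(signal)  # a silent signal needs no scaling
    scale_global = alpha / mean  # step 401

    out: List[float] = []
    for start in range(0, len(signal), unit_len):
        unit = signal[start:start + unit_len]
        max_j = max(abs(x) for x in unit)  # step 402
        # Step 403: a loud unit gets a smaller factor than the global one.
        scale_j = scale_global if max_j == 0.0 else min(scale_global, beta / max_j)
        out.extend(x * scale_j for x in unit)  # step 404
    return out


quiet_then_loud = normalize_blocks([0.0, 0.0, 0.0, 1.0], unit_len=2)
```

Under this reading, scale_j never exceeds β / max_j, so no normalized sample exceeds β in magnitude, which is how the local factor guards against clipping. The gain can still jump at unit boundaries, which the second method addresses.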

Fig. 5 is a flowchart of another way of normalizing the optimized voice signal in an embodiment of the invention. This refined normalization method improves both the normalization result and the smoothness of the normalized data.

Specifically, normalizing the optimized voice signal comprises:

Step 501, calculating the global normalization factor from the mean energy of the optimized voice signal.

Specifically, the global normalization factor scale_global of the voice signal is calculated as scale_global = α / mean, where α is any decimal between 0 and 1, preferably 0.12 in a preferred embodiment of the invention, and mean is the mean energy of the voice signal.

Step 502, obtaining the energy of the sampling points in each signal unit of the optimized voice signal.

Step 503, calculating the local normalization factor of each signal unit from the global normalization factor and the energy of the sampling points in that signal unit.

The local normalization factor of each signal unit is calculated as scale_j = min(scale_global, β / max_j), where scale_j is the local normalization factor of the j-th signal unit; β is any decimal between 0 and 1, preferably 0.45 in a preferred embodiment of the invention; and max_j is the maximum energy among the m sampling points of the j-th signal unit, j = 1, 2, ..., n.

Step 504, for the first signal unit, locally normalizing each sampling point in the first signal unit according to the local normalization factor of the first signal unit.

Specifically, the m sampling points of the 1st signal unit are locally normalized as data_norm(1, i) = data(1, i) * scale_1, where data_norm(1, i) is the normalized data of the i-th sampling point in the 1st signal unit, data(1, i) is the raw data of that sampling point, and scale_1 is the local normalization factor of the 1st signal unit, i = 1, 2, ..., m.

Step 505, for each subsequent signal unit, locally normalizing each sampling point in the signal unit according to the local normalization factor of that signal unit and the local normalization factor of the preceding signal unit.

Specifically, each sampling point in the 2nd through n-th signal units is locally normalized as data_norm(j, i) = data(j, i) * (scale_(j-1) + (i / m) * (scale_j - scale_(j-1))), where data_norm(j, i) is the normalized data of the i-th sampling point in the j-th signal unit, data(j, i) is the raw data of that sampling point, scale_j is the local normalization factor of the j-th signal unit, and scale_(j-1) is the local normalization factor of the (j-1)-th signal unit, i = 1, 2, ..., m. The gain therefore moves linearly from scale_(j-1) toward scale_j across the unit and reaches scale_j at i = m.
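The interpolated variant of steps 501 to 505 can be sketched as below, under the same assumption that the energy of a sampling point is its absolute amplitude. `local_factors` and `normalize_smoothed` are hypothetical names; for a shorter final unit the sketch uses that unit's own length in place of m, which the text does not address.

```python
from typing import List


def local_factors(signal: List[float], unit_len: int,
                  alpha: float, beta: float) -> List[float]:
    """Steps 501 to 503: one factor scale_j per signal unit."""
    mean = sum(abs(x) for x in signal) / len(signal)
    scale_global = alpha / mean if mean > 0.0 else 1.0
    factors = []
    for start in range(0, len(signal), unit_len):
        max_j = max(abs(x) for x in signal[start:start + unit_len])
        factors.append(scale_global if max_j == 0.0
                       else min(scale_global, beta / max_j))
    return factors


def normalize_smoothed(signal: List[float], unit_len: int,
                       alpha: float = 0.12, beta: float = 0.45) -> List[float]:
    """Steps 504 and 505: ramp the gain linearly from scale_(j-1) to scale_j."""
    if unit_len <= 0:
        raise ValueError("unit_len must be positive")
    if not signal:
        return []
    scales = local_factors(signal, unit_len, alpha, beta)
    out: List[float] = []
    for j, scale_j in enumerate(scales):
        unit = signal[j * unit_len:(j + 1) * unit_len]
        # The first unit has no predecessor, so its gain is constant (step 504).
        prev = scales[j - 1] if j > 0 else scale_j
        m = len(unit)
        for i, x in enumerate(unit, start=1):
            gain = prev + (i / m) * (scale_j - prev)  # step 505
            out.append(x * gain)
    return out


ramped = normalize_smoothed([0.0] * 6 + [1.0, 1.0], unit_len=2)
```

In this example the last unit's factor is 0.45 and its predecessor's is 0.48, so its two samples receive gains of 0.465 and 0.45. The ramp removes the gain jump at the unit boundary, at the cost that early samples of a loud unit can briefly exceed β.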

It can be seen that, in the sound effect processing method of the embodiments, because several sound effects are superimposed during sound effect optimization, the processed voice tends to clip in many places, which is heard as crackling noises and seriously harms how the song sounds. The invention normalizes the voice after sound effect optimization and so effectively prevents clipping.

Further, Fig. 6 is another flowchart of the sound effect processing method of an embodiment of the invention. On the basis of step 105 above, the sound effect processing method further comprises:

Step 106, mixing the normalized voice signal with an accompaniment signal.

Specifically, in this mixing the system accepts the accompaniment specified by the user and obtains the corresponding accompaniment signal, then mixes the normalized voice signal with that accompaniment signal to obtain the mixed voice signal.

In detail, the accompaniment signal is obtained as follows: the musical accompaniment specified by the user is received; the database is searched to determine whether it contains an accompaniment signal matching the specified accompaniment; if it does, the normalized voice signal and that accompaniment signal are mixed directly; if it does not, a matching accompaniment signal is searched for over the internet on partner music sites, and the normalized voice signal and that accompaniment signal are then mixed.
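The lookup order described above (built-in database first, partner music sites second) can be sketched as a simple fallback. Everything here is hypothetical: the database is modeled as a dictionary keyed by title, and `fetch_online` stands for whatever network search a real system would perform; the text does not describe either interface.

```python
from typing import Callable, Dict, List, Optional

Signal = List[float]


def get_accompaniment(title: str,
                      local_db: Dict[str, Signal],
                      fetch_online: Callable[[str], Optional[Signal]]) -> Signal:
    """Return the accompaniment for title, preferring the built-in database."""
    signal = local_db.get(title)
    if signal is not None:
        return signal              # found locally: mix directly
    signal = fetch_online(title)   # otherwise search the partner music sites
    if signal is None:
        raise LookupError(f"no accompaniment found for {title!r}")
    return signal
```

The text does not say what happens when neither source has a match; raising an error is one possible choice for this sketch.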

As shown in Fig. 7, mixing the normalized voice signal with the accompaniment signal comprises:

Step 601: calculate the actual accompaniment-to-voice energy ratio from the energy of the accompaniment signal and the energy of the normalized voice signal. Specifically, the actual accompaniment-to-voice energy ratio is the energy of the accompaniment signal divided by the energy of the normalized voice signal.

Step 602: determine whether the actual accompaniment-to-voice energy ratio equals the preset accompaniment-to-voice energy ratio; if so, perform step 603; if not, perform step 604.

Step 603: superimpose the normalized voice signal and the accompaniment signal.

Step 604: adjust the normalized voice signal or the accompaniment signal according to the preset accompaniment-to-voice energy ratio until the actual accompaniment-to-voice energy ratio equals the preset ratio, and then superimpose the adjusted normalized voice signal and the accompaniment signal, or superimpose the normalized voice signal and the adjusted accompaniment signal.
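Steps 601 to 604 can be sketched as follows. This is a sketch under stated assumptions, not the patent's implementation: energy is taken as the sum of squared samples, equality is tested within a small tolerance `tol`, and only the voice side is adjusted, using a single uniform gain as the simplest conventional form of normalization. All function names are hypothetical.

```python
import math


def energy(signal):
    # Assumption: signal energy is the sum of squared samples.
    return sum(x * x for x in signal)


def adjust_vocal(vocal, accomp, preset_ratio):
    """Scale the vocal so that energy(accomp) / energy(vocal) == preset_ratio."""
    current = energy(vocal)
    if current == 0:
        raise ValueError("vocal signal has zero energy")
    target = energy(accomp) / preset_ratio
    gain = math.sqrt(target / current)  # a uniform gain g scales energy by g**2
    return [x * gain for x in vocal]


def mix(vocal, accomp, preset_ratio, tol=1e-9):
    actual_ratio = energy(accomp) / energy(vocal)      # step 601
    if abs(actual_ratio - preset_ratio) > tol:         # step 602
        vocal = adjust_vocal(vocal, accomp, preset_ratio)  # step 604
    return [v + a for v, a in zip(vocal, accomp)]      # superposition (step 603)
```

Adjusting the accompaniment instead would follow the same pattern with the roles of the two signals swapped.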

Specifically, the preset accompaniment-to-voice energy ratio is determined as follows: determine the song type to which the accompaniment signal belongs; when the accompaniment signal belongs to a gentle song, the preset ratio is typically 1.1, so the accompaniment is slightly louder; when the accompaniment signal belongs to a rock song, the preset ratio is typically 0.9, so the accompaniment is slightly quieter.

The adjustment is performed as follows. First, the energy of the adjusted normalized voice signal is obtained according to formula 1, or the energy of the adjusted accompaniment signal is obtained according to formula 2. Formula 1: energy of the adjusted normalized voice signal = energy of the accompaniment signal / preset accompaniment-to-voice energy ratio. Formula 2: energy of the adjusted accompaniment signal = energy of the normalized voice signal × preset accompaniment-to-voice energy ratio. Then, the original normalized voice signal is normalized according to the energy of the adjusted normalized voice signal to obtain the adjusted normalized voice signal, or the original accompaniment signal is normalized according to the energy of the adjusted accompaniment signal to obtain the adjusted accompaniment signal. Here, the normalization may use a conventional signal normalization method of the prior art, the signal normalization method described in steps 401 to 404 of the present invention, or the signal normalization method described in steps 501 to 505 of the present invention; the details are not repeated here. For example, for the song "Today" by Liu Dehua, suppose the preset accompaniment-to-voice energy ratio stored on the server is 0.9 and the actual ratio currently obtained is 1.2. This indicates that the energy of the accompaniment signal is too large, so the normalized voice signal needs to be boosted. Therefore, the energy of the adjusted normalized voice signal is first obtained as the energy of the accompaniment signal / 0.9, and the original normalized voice signal is then normalized according to that energy to obtain the adjusted normalized voice signal.
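As a worked check of the example, write $E_a$ for the energy of the accompaniment signal, $E_v$ for the energy of the normalized voice signal before adjustment, and $E_v'$ for its energy after adjustment. These symbols are introduced here for illustration only. Assuming energy is the sum of squared samples, so that a uniform amplitude gain $g$ scales energy by $g^2$:

$$E_v = \frac{E_a}{1.2}, \qquad E_v' = \frac{E_a}{0.9} \quad\Longrightarrow\quad \frac{E_v'}{E_v} = \frac{1.2}{0.9} = \frac{4}{3}, \qquad g = \sqrt{\tfrac{4}{3}} \approx 1.155 .$$

Under that assumption, the voice energy is raised by one third (about 1.25 dB), which corresponds to an amplitude gain of roughly 1.155.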

Further, the mixing also comprises performing signal normalization on the superimposed voice signal to prevent clipping in it. Here, the superimposed voice signal may be normalized by a conventional signal normalization method of the prior art, by the signal normalization method described in steps 401 to 404 of the present invention, or by the signal normalization method described in steps 501 to 505 of the present invention; the details are not repeated here.
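One conventional option for this final stage is simple peak normalization, sketched below. It is a generic technique used here for illustration and is not the method of steps 401 to 404 or steps 501 to 505. The function name and the full-scale limit of 1.0 are assumptions.

```python
def peak_normalize(signal, limit=1.0):
    """Scale the whole signal down so that no sample exceeds `limit` in magnitude."""
    peak = max((abs(x) for x in signal), default=0.0)
    if peak <= limit:
        # Already within range: return an unchanged copy.
        return list(signal)
    gain = limit / peak
    return [x * gain for x in signal]
```

A single global gain keeps the balance between voice and accompaniment unchanged, but it lowers the level of the whole mix when only a few samples overshoot. The unit-by-unit local factors described in the earlier steps address the signal locally rather than with one gain.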

It can be seen that, in the sound effect processing method of the embodiment of the present invention, when the normalized voice signal is mixed with the accompaniment signal, the mixing ratio of accompaniment to voice is adjusted dynamically according to the song type to which the accompaniment signal belongs, and the superimposed voice signal is then normalized again to prevent clipping, so that the user is provided with the best listening effect.

Correspondingly, the embodiment of the present invention also provides a sound effect processing system; Fig. 8 is a schematic structural diagram of this system.

In this embodiment, the sound effect processing system comprises:

Original sound signal acquisition unit 701, for acquiring the original sound signal input by the user;

Sound effect optimization target determining unit 702, for determining the sound effect optimization target;

Sound effect type determining unit 703, for determining the sound effect type to which the sound effect optimization target belongs;

Optimization processing unit 704, for optimizing the original sound signal according to the sound effect type to obtain an optimized voice signal;

Normalization unit 705, for performing signal normalization on the optimized voice signal to obtain a normalized voice signal.

In the embodiment of the present invention, one specific structure of the sound effect optimization target determining unit 702 may comprise: a first sound effect optimization target determining unit, for providing sound effect scene options to the user and determining the sound effect optimization target according to the user's selection; or a second sound effect optimization target determining unit, for determining the sound effect optimization target according to the background noise of the current environment.

In the embodiment of the present invention, as shown in Fig. 9, when the sound effect type to which the sound effect optimization target belongs is the dim voice sound effect category, one specific structure of the optimization processing unit 704 may comprise: directional equalization unit 801, for performing directional equalization on the original sound signal to obtain an equalized voice signal; compression unit 802, for compressing the equalized voice signal to obtain a compressed voice signal; and reverberation unit 803, for applying reverberation to the compressed voice signal.

In the embodiment of the present invention, as shown in Fig. 10, when the sound effect type to which the sound effect optimization target belongs is the clear voice sound effect category, one specific structure of the optimization processing unit 704 may comprise: first track optimization unit 901, for performing directional equalization and then compression on the original sound signal on a first track; second track optimization unit 902, for applying reverberation to the original sound signal on a second track; and superposition unit 903, for superimposing the voice signals processed on the first track and the second track.
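The difference between the two structures is only in how the stages are routed, which the following sketch makes explicit. The stage functions `eq`, `compress` and `reverb` are passed in as placeholders. Their internals are not specified here, and the function names `dim_voice_chain` and `clear_voice_chain` are hypothetical.

```python
def dim_voice_chain(signal, eq, compress, reverb):
    """Serial routing (units 801 to 803): equalize, then compress, then reverberate."""
    return reverb(compress(eq(signal)))


def clear_voice_chain(signal, eq, compress, reverb):
    """Parallel routing (units 901 to 903): two tracks fed by the same original signal."""
    track_1 = compress(eq(signal))   # first track: equalization, then compression
    track_2 = reverb(signal)         # second track: reverberation of the original
    # Superimpose the two processed tracks sample by sample.
    return [a + b for a, b in zip(track_1, track_2)]
```

In the parallel form the reverberation is computed from the unprocessed signal and added alongside the equalized and compressed track, rather than being applied on top of it.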

In the embodiment of the present invention, as shown in Fig. 11, one specific structure of the normalization unit 705 may comprise: global normalization factor calculating unit 1001, for calculating a global normalization factor according to the average energy of the optimized voice signal; energy acquisition unit 1002, for obtaining the energy of the sampling points in each signal unit of the optimized voice signal; local normalization factor calculating unit 1003, for calculating the local normalization factor of each signal unit according to the global normalization factor and the energy of the sampling points in that signal unit; and first normalization unit 1004, for performing, for each signal unit, local normalization on each sampling point in the signal unit according to the local normalization factor of that signal unit.

In the embodiment of the present invention, as shown in Fig. 12, another specific structure of the normalization unit 705 may comprise: global normalization factor calculating unit 1001, for calculating a global normalization factor according to the average energy of the optimized voice signal; energy acquisition unit 1002, for obtaining the energy of the sampling points in each signal unit of the optimized voice signal; local normalization factor calculating unit 1003, for calculating the local normalization factor of each signal unit according to the global normalization factor and the energy of the sampling points in that signal unit; and second normalization unit 1101, for performing, for the first signal unit, local normalization on each sampling point in the first signal unit according to the local normalization factor of the first signal unit, and for performing, for each subsequent signal unit, local normalization on each sampling point in that signal unit according to the local normalization factor of that signal unit and the local normalization factor of its preceding signal unit.

In the embodiment of the present invention, as shown in Fig. 13, the sound effect processing system further comprises a mixing unit 706 connected to the normalization unit 705, for mixing the normalized voice signal.

In the embodiment of the present invention, as shown in Fig. 14, one specific structure of the mixing unit 706 may comprise: actual accompaniment-to-voice energy ratio calculating unit 1201, for calculating the actual accompaniment-to-voice energy ratio from the energy of the accompaniment signal and the energy of the normalized voice signal; judging unit 1202, for determining whether the actual accompaniment-to-voice energy ratio equals the preset accompaniment-to-voice energy ratio; first superposition unit 1203, for superimposing the normalized voice signal and the accompaniment signal when the actual ratio equals the preset ratio; and second superposition unit 1204, for adjusting, when the actual ratio does not equal the preset ratio, the normalized voice signal or the accompaniment signal according to the preset ratio until the actual ratio equals the preset ratio, and then superimposing the adjusted normalized voice signal and the accompaniment signal, or superimposing the normalized voice signal and the adjusted accompaniment signal.

The embodiments in this specification are described progressively; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly; for the relevant parts, see the description of the method embodiment. The system embodiment described above is merely illustrative: the units and modules described as separate components may or may not be physically separate. In addition, some or all of the units and modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

The structure, features and effects of the present invention have been described in detail above with reference to the embodiments shown in the drawings. The foregoing is only a preferred embodiment of the present invention, and the scope of implementation is not limited to what the drawings show. Any change made according to the conception of the present invention, or any modification into an equivalent embodiment, that does not depart from the spirit covered by the specification and drawings shall fall within the protection scope of the present invention.

Claims

1. A sound effect processing method, characterized by comprising:

acquiring an original sound signal input by a user;

determining a sound effect optimization target;

determining the sound effect type to which the sound effect optimization target belongs;

performing signal normalization on the optimized voice signal to obtain a normalized voice signal;

wherein performing signal normalization on the optimized voice signal comprises:

for each signal unit, performing local normalization on each sampling point in the signal unit according to the local normalization factor of that signal unit; or, for the first signal unit, performing local normalization on each sampling point in the first signal unit according to the local normalization factor of the first signal unit, and, for each subsequent signal unit, performing local normalization on each sampling point in that signal unit according to the local normalization factor of that signal unit and the local normalization factor of its preceding signal unit.

2. The sound effect processing method according to claim 1, characterized in that determining the sound effect optimization target comprises:

3. The sound effect processing method according to claim 1, characterized in that

the sound effect type to which the sound effect optimization target belongs is the dim voice sound effect category; and optimizing the original sound signal according to the sound effect type comprises:

4. The sound effect processing method according to claim 1, characterized in that the sound effect type to which the sound effect optimization target belongs is the clear voice sound effect category; and optimizing the original sound signal according to the sound effect type comprises:

superimposing the voice signals processed on the first track and the second track.

5. The sound effect processing method according to any one of claims 1 to 4, characterized in that the method further comprises: mixing the normalized voice signal with an accompaniment signal, the mixing comprising:

6. A sound effect processing system, characterized by comprising:

a normalization unit, for performing signal normalization on the optimized voice signal to obtain a normalized voice signal;

the normalization unit comprising:

the normalization unit further comprising a first normalization unit or a second normalization unit, wherein:

the first normalization unit is for performing, for each signal unit, local normalization on each sampling point in the signal unit according to the local normalization factor of that signal unit;

the second normalization unit is for performing, for the first signal unit, local normalization on each sampling point in the first signal unit according to the local normalization factor of the first signal unit, and, for each subsequent signal unit, local normalization on each sampling point in that signal unit according to the local normalization factor of that signal unit and the local normalization factor of its preceding signal unit.

7. The sound effect processing system according to claim 6, characterized in that the sound effect optimization target determining unit comprises:

8. The sound effect processing system according to claim 6, characterized in that the sound effect type to which the sound effect optimization target belongs is the dim voice sound effect category; and the optimization processing unit comprises:

9. The sound effect processing system according to claim 6, characterized in that the sound effect type to which the sound effect optimization target belongs is the clear voice sound effect category; and the optimization processing unit comprises:

10. The sound effect processing system according to any one of claims 6 to 9, characterized in that the system further comprises: a mixing unit, for mixing the normalized voice signal; the mixing unit comprising: