CN106792253B - sound effect processing method and system - Google Patents

sound effect processing method and system

Info

Publication number
CN106792253B
CN106792253B CN201611092855.7A
Authority
CN
China
Prior art keywords
sound
signal
feature
reference voice
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611092855.7A
Other languages
Chinese (zh)
Other versions
CN106792253A (en)
Inventor
陈蕴洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd filed Critical Guangzhou Shiyuan Electronics Thecnology Co Ltd
Priority to CN201611092855.7A priority Critical patent/CN106792253B/en
Publication of CN106792253A publication Critical patent/CN106792253A/en
Application granted granted Critical
Publication of CN106792253B publication Critical patent/CN106792253B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/485End-user interface for client configuration
    • H04N21/4852End-user interface for client configuration for modifying audio parameters, e.g. switching between mono and stereo
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to a sound effect processing method and system. The method comprises the following steps: collecting an environmental sound signal within a preset time range; performing feature extraction on the environmental sound signal to obtain an environmental sound feature; selecting, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature; looking up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode; and performing sound effect processing on the sound signal to be played according to the matching sound effect mode. In this way, the most suitable matching sound effect mode can be selected automatically according to the environmental sound feature of the environmental sound signal, and the sound effect processing result is good; at the same time, no user operation is needed, which improves convenience for the user.

Description

Sound effect processing method and system
Technical field
The present invention relates to the field of signal processing technology, and more particularly to a sound effect processing method and system.
Background technique
When an audio/video display device plays audio, video or a program, different sound effect modes can be set as needed. Taking a television set as an example, the traditional way of changing the sound effect mode is for the user to make a selection in an on-screen menu with the remote control; the television's processing system then processes the sound signal according to the mode selected by the user and outputs it.
However, there is usually noise in the environment around the television, and the user does not know which sound effect mode best suits the current environment. As a result, the television's sound effect processing may not match the current environment, and the processing result is poor.
Summary of the invention
In view of the above problem, it is necessary to provide a sound effect processing method and system with a good processing effect.
A sound effect processing method, comprising:
collecting an environmental sound signal within a preset time range;
performing feature extraction on the environmental sound signal to obtain an environmental sound feature;
selecting, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature;
looking up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode; and
performing sound effect processing on a sound signal to be played according to the matching sound effect mode.
A sound effect processing system, comprising:
an environmental sound signal collection module, configured to collect an environmental sound signal within a preset time range;
an environmental sound feature obtaining module, configured to perform feature extraction on the environmental sound signal to obtain an environmental sound feature;
a reference sound feature selection module, configured to select, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature;
a matching sound effect mode lookup module, configured to look up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode; and
a sound effect processing module, configured to perform sound effect processing on a sound signal to be played according to the matching sound effect mode.
In the above sound effect processing method and system, an environmental sound signal within a preset time range is collected and feature extraction is performed on it to obtain an environmental sound feature; the reference sound feature with the greatest similarity to the environmental sound feature is then selected from a plurality of preset reference sound features, the preset sound effect mode corresponding to the selected reference sound feature is looked up to obtain a matching sound effect mode, and sound effect processing is performed on the sound signal to be played according to the matching sound effect mode. In this way, the most suitable matching sound effect mode is chosen automatically according to the environmental sound feature of the environmental sound signal, and the sound effect processing result is good; at the same time, no user operation is needed, which improves convenience for the user.
Brief description of the drawings
Fig. 1 is a flowchart of the sound effect processing method in an embodiment;
Fig. 2 is a detailed flowchart of performing feature extraction on the environmental sound signal to obtain the environmental sound feature in an embodiment;
Fig. 3 is a detailed flowchart of selecting, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature in an embodiment;
Fig. 4 is a block diagram of the sound effect processing system in an embodiment.
Detailed description of the embodiments
With reference to Fig. 1, the sound effect processing method in an embodiment includes the following steps.
S110: collect an environmental sound signal within a preset time range.
The preset time range is a time range whose period or duration is set in advance. Specifically, the sound signal can be obtained by collecting the sound of the surrounding environment with a microphone.
In one embodiment, the preset time range is the time range that starts at the moment a play instruction is received and lasts for a preset value. A play instruction is an instruction indicating that audio/video playback or a TV program is to start, for example the instruction that wakes up the processing system when the television is switched on. The preset value can be set according to actual needs. In this embodiment, the preset value is 5 seconds; when the moment the play instruction is received corresponds to the moment the television is switched on, the preset time range is the first 5 seconds after the television is switched on.
Usually, after a play instruction is received, the processing system starts playing the audio/video after only a short response time. By collecting the sound signal over the time range that starts at the moment the play instruction is received and lasts for the preset value, the collected environmental sound signal is the sound before playback starts, which avoids the influence of the played sound on the collection of the environmental sound signal and improves the accuracy of the collection. It can be understood that, in other embodiments, the preset time range can also be another time range, for example a range that starts at the current moment and lasts for the preset value, where the current moment can be set in real time, so that the environmental sound signal is collected while the audio/video is playing.
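As a rough, non-authoritative sketch of this capture window (the sounddevice package, the 48 kHz sampling rate and mono capture are assumptions not stated in the patent), the first 5 seconds of ambient sound could be recorded like this:

    import numpy as np
    import sounddevice as sd  # assumed third-party audio capture library

    SAMPLE_RATE = 48000       # assumed sampling rate
    CAPTURE_SECONDS = 5       # the preset value used in this embodiment

    def capture_ambient_sound() -> np.ndarray:
        """Record ambient sound for the preset time range, starting at the
        moment the play instruction is received (i.e. when this is called)."""
        frames = SAMPLE_RATE * CAPTURE_SECONDS
        recording = sd.rec(frames, samplerate=SAMPLE_RATE, channels=1, dtype="float32")
        sd.wait()                # block until the 5-second window has elapsed
        return recording[:, 0]   # return the mono signal as a 1-D array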
S130: perform feature extraction on the environmental sound signal to obtain an environmental sound feature.
The environmental sound feature obtained by feature extraction from the environmental sound signal can be a numerical value or an image.
In one embodiment, the environmental sound feature is a numerical value. With reference to Fig. 2, step S130 includes steps S131 to S134.
S131: convert the environmental sound signal into a digital signal.
The collected environmental sound is an analog signal and can be converted into a digital signal by analog-to-digital conversion.
S132: perform spectrum analysis on the digital signal to obtain frequency information including a plurality of frequency points.
The spectrum analysis of the digital signal can specifically be carried out with a Fourier transform, which yields the frequency points contained in the digital signal.
S133: calculate, according to the frequency information, the average value of the frequency points in each preset frequency band, as the characteristic value of that preset frequency band.
There are a plurality of preset frequency bands, which can be set according to the frequency range audible to the human ear. Frequency points in the frequency information that do not belong to any preset frequency band are discarded, and the remaining frequency points are assigned to the bands according to their values; averaging the frequency points that belong to one preset frequency band gives the characteristic value of that band. Specifically, if the frequency points within a preset frequency band are continuous, the average can be obtained by dividing their sum over the band by the spectrum length; if the frequency points within a band are discrete, the average can be obtained directly by summing the frequency points and dividing by their number. For example, if 30 Hz, 50 Hz, 100 Hz and 110 Hz in the frequency information belong to the same preset frequency band, the average of 30, 50, 100 and 110 is the characteristic value of that band.
In one embodiment, the preset frequency bands are 20 Hz-200 Hz, 200 Hz-700 Hz, 700 Hz-2000 Hz, 2000 Hz-7000 Hz and 7000 Hz-15000 Hz. Dividing the frequency range normally audible to the human ear in this way allows feature extraction to be carried out in a targeted manner and improves data processing efficiency. It can be understood that, in other embodiments, the preset frequency bands may be set to other values.
S134: calculate the product of the characteristic value of each preset frequency band and its corresponding preset coefficient, and sum the products to obtain the environmental sound feature.
Each preset frequency band corresponds in advance to a preset coefficient, and through step S133 each preset frequency band has a characteristic value. Multiplying the characteristic value of each preset frequency band by the coefficient corresponding to that band gives one product per band, and summing the products yields the environmental sound feature. In this embodiment, the preset coefficients corresponding to the bands 20 Hz-200 Hz, 200 Hz-700 Hz, 700 Hz-2000 Hz, 2000 Hz-7000 Hz and 7000 Hz-15000 Hz are -100, -10, 0, 10 and 100, respectively.
By performing frequency analysis on the environmental sound signal and using the numerical value calculated from the resulting frequency information as the environmental sound feature, the feature is expressed in quantized form, which is convenient for data analysis and processing. It can be understood that, in other embodiments, the sound feature can also be extracted in other ways, for example by performing spectrum analysis on the digital signal obtained from the analog-to-digital conversion of the environmental sound signal and using the resulting spectrogram directly as the environmental sound feature.
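The following sketch is one possible reading of steps S131-S134 for the numerical case; treating the "frequency points contained in the signal" as the FFT bin frequencies whose magnitude exceeds a small threshold is an assumption, since the patent does not fix that criterion.

    import numpy as np

    PRESET_BANDS = [(20, 200), (200, 700), (700, 2000), (2000, 7000), (7000, 15000)]
    PRESET_COEFFS = [-100, -10, 0, 10, 100]   # coefficients from this embodiment

    def ambient_sound_feature(signal: np.ndarray, sample_rate: int,
                              magnitude_threshold: float = 0.01) -> float:
        """S132-S134: spectrum analysis, per-band averages, weighted sum."""
        spectrum = np.fft.rfft(signal)                           # S132: Fourier transform
        freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
        magnitudes = np.abs(spectrum) / len(signal)
        points = freqs[magnitudes > magnitude_threshold]         # detected frequency points

        feature = 0.0
        for (low, high), coeff in zip(PRESET_BANDS, PRESET_COEFFS):
            in_band = points[(points >= low) & (points < high)]  # out-of-band points are discarded
            if in_band.size:
                feature += coeff * in_band.mean()                # S133 average, S134 weighting
        return feature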
S150: select, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature.
A reference sound feature is a sound feature set in advance for comparison. If the environmental sound feature is a numerical value, the reference sound features are also numerical values, and the similarity between a reference sound feature and the environmental sound feature is obtained from the difference between the two: the smaller the difference, the greater the similarity. If the environmental sound feature is an image, the reference sound features are also images, and the similarity is obtained by comparing the images: the smaller the difference between the images, the greater the similarity.
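For the numerical case, the selection in S150 amounts to finding the reference value closest to the environmental value; a minimal sketch (the scene names and dictionary layout are illustrative, not from the patent):

    def select_reference_feature(ambient_feature: float, reference_features: dict) -> str:
        """Return the model scene whose reference sound feature differs least from the
        environmental sound feature (smallest difference = greatest similarity)."""
        return min(reference_features,
                   key=lambda scene: abs(reference_features[scene] - ambient_feature))

    # Example (values are made up for illustration):
    # select_reference_feature(950.0, {"daytime living room": 3200.0, "hotel": -40.0, "restaurant": 800.0})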
The reference sound features can be obtained by collection and analysis in advance. In one embodiment, with reference to Fig. 3, steps S101 to S103 are further performed before step S110.
S101: collect, for each of a plurality of model scenes, the sound signal within a preset duration, to obtain a plurality of reference sound signals.
A model scene is a real-life scene. The number of model scenes equals the number of reference sound features, and the sound signal of a model scene can be collected with a microphone. In this embodiment there are five model scenes: a living room in the daytime, a living room at night, a hotel, a shopping mall and a restaurant, corresponding respectively to five situations: noisy with voices, quiet, silent, mixed noise, and open space. It can be understood that, in other embodiments, the model scenes can be other scenes.
The preset duration is longer than the duration corresponding to the preset time range. By collecting the sound signal of a model scene over a long time, the obtained sound signal accurately represents the acoustic environment of that model scene, so that the resulting reference sound signal is more accurate. In this embodiment, the preset duration is 10000 seconds.
S102: divide each reference sound signal into a plurality of signal segments according to the ratio of the preset duration to the duration corresponding to the preset time range.
The ratio of the preset duration to the duration corresponding to the preset time range may be an integer greater than 1 or a non-integer greater than 1. A reference sound signal is the sound signal collected continuously over the preset duration. It can be divided into signal segments by cutting it in chronological order into intervals whose length equals the duration of the preset time range, with the final remainder shorter than one interval kept as its own segment. In this way, the reference sound signal of each model scene is divided into a plurality of signal segments.
S103: extract the sound feature of each signal segment, and obtain the sound feature of the corresponding reference sound signal from the sound features of the signal segments, thereby obtaining the reference sound feature.
The method for extracting the sound feature of each signal segment is the same as the method for extracting the sound feature of the environmental sound signal and is not repeated here. If the extracted sound features are numerical values, the sound feature of the corresponding reference sound signal can be obtained by taking the average of the values of the signal segments as the reference sound feature; if the extracted sound features are images, the images of the signal segments can be processed and analyzed to obtain an image that represents the whole reference sound signal, which is used as the reference sound feature.
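A sketch of S101-S103 for the numerical case, assuming 5-second segments and reusing the ambient_sound_feature sketch above; the per-segment features of one model-scene recording are simply averaged:

    def build_reference_feature(reference_signal: np.ndarray, sample_rate: int,
                                segment_seconds: int = 5) -> float:
        """Split a long model-scene recording into segments the length of the
        preset time range, extract a sound feature per segment (S103), and
        average the segment features into one reference sound feature."""
        segment_len = segment_seconds * sample_rate
        segments = [reference_signal[i:i + segment_len]
                    for i in range(0, len(reference_signal), segment_len)]
        features = [ambient_sound_feature(seg, sample_rate)
                    for seg in segments if seg.size]   # the final, shorter remainder is kept too
        return float(np.mean(features))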
By collecting the reference sound signals of a plurality of model scenes in advance in the manner of steps S101 to S103 and performing feature extraction to obtain the reference sound feature of each model scene, the obtained reference sound features are highly representative and accurate.
S170: look up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode.
Each reference sound feature corresponds in advance to a preset sound effect mode; specifically, the reference sound features can be stored together with their corresponding preset sound effect modes, so that the corresponding preset sound effect mode can be found from a reference sound feature.
In this embodiment, the preset sound effect modes are five modes: news, night, movie theatre, standard and music, corresponding respectively to the reference sound features of the five model scenes: living room in the daytime, living room at night, hotel, shopping mall and restaurant. Three standard techniques are typically used in configuring a sound effect mode: total-sonic, total volume and total surround. Total-sonic has two states, on/off; total volume has three states, normal/night/off; and total surround has two states, on/off. The standard-technique states corresponding to each preset sound effect mode are shown in Table 1 below.
Table 1

Mode            total-sonic    total volume    total surround
News            on             normal          off
Night           on             night           off
Movie theatre   on             off             on
Standard        off            off             off
Music           on             off             off
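Table 1 and the scene-to-mode correspondence map directly onto lookup tables; a sketch (the dictionary layout and key names are illustrative):

    SOUND_EFFECT_MODES = {                  # model scene -> preset sound effect mode
        "daytime living room": "news",
        "nighttime living room": "night",
        "hotel": "movie theatre",
        "shopping mall": "standard",
        "restaurant": "music",
    }

    MODE_SETTINGS = {                       # mode -> (total-sonic, total volume, total surround)
        "news":          ("on",  "normal", "off"),
        "night":         ("on",  "night",  "off"),
        "movie theatre": ("on",  "off",    "on"),
        "standard":      ("off", "off",    "off"),
        "music":         ("on",  "off",    "off"),
    }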
S190: perform sound effect processing on the sound signal to be played according to the matching sound effect mode.
After the matching sound effect mode is obtained, sound effect processing is performed according to it: specifically, the standard techniques corresponding to the matching sound effect mode are applied to the sound signal to be played automatically, so that the output sound signal suits the current environment. For example, if the matching sound effect mode is music, the states of the three standard techniques total-sonic, total volume and total surround are set to on, off and off, respectively. The sound signal to be played can be the television sound signal of a television set.
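Tying the steps together for the numerical case, an end-to-end flow might look like the sketch below; set_audio_processing stands in for whatever device-specific hook actually applies the three standard-technique states and is purely hypothetical.

    def set_audio_processing(total_sonic: str, total_volume: str, total_surround: str) -> None:
        """Hypothetical device hook; a real system would configure its audio
        pipeline here instead of printing the states."""
        print(f"total-sonic={total_sonic}, total volume={total_volume}, total surround={total_surround}")

    def apply_matching_sound_effect(reference_features: dict, sample_rate: int = 48000) -> None:
        signal = capture_ambient_sound()                               # S110
        feature = ambient_sound_feature(signal, sample_rate)           # S130
        scene = select_reference_feature(feature, reference_features)  # S150
        mode = SOUND_EFFECT_MODES[scene]                               # S170
        set_audio_processing(*MODE_SETTINGS[mode])                     # S190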
In the above sound effect processing method, an environmental sound signal within a preset time range is collected and feature extraction is performed on it to obtain an environmental sound feature; the reference sound feature with the greatest similarity to the environmental sound feature is then selected from a plurality of preset reference sound features, the preset sound effect mode corresponding to the selected reference sound feature is looked up to obtain a matching sound effect mode, and sound effect processing is performed on the sound signal to be played according to the matching sound effect mode. In this way, the most suitable matching sound effect mode is chosen automatically according to the environmental sound feature of the environmental sound signal, and the sound effect processing result is good; at the same time, no user operation is needed, which improves convenience for the user.
The above sound effect processing method can be applied to the processing system of a television set, so that the television automatically selects a matching sound effect mode according to the environment and performs sound effect processing. It can also be applied to other audio/video display devices, such as mobile phones and tablets, so that the device automatically selects a matching sound effect mode according to the environment and performs sound effect processing when the player is opened.
With reference to Fig. 4, the sound effect processing system in an embodiment includes an environmental sound signal collection module 110, an environmental sound feature obtaining module 130, a reference sound feature selection module 150, a matching sound effect mode lookup module 170 and a sound effect processing module 190.
The environmental sound signal collection module 110 is configured to collect an environmental sound signal within a preset time range.
The preset time range is a time range whose period or duration is set in advance. Specifically, the sound signal can be obtained by collecting the sound of the surrounding environment with a microphone.
In one embodiment, the preset time range is the time range that starts at the moment a play instruction is received and lasts for a preset value. A play instruction is an instruction indicating that audio/video playback or a TV program is to start, for example the instruction that wakes up the processing system when the television is switched on. The preset value can be set according to actual needs. In this embodiment, the preset value is 5 seconds; when the moment the play instruction is received corresponds to the moment the television is switched on, the preset time range is the first 5 seconds after the television is switched on.
Usually, after a play instruction is received, the processing system starts playing the audio/video after only a short response time. By collecting the sound signal over the time range that starts at the moment the play instruction is received and lasts for the preset value, the collected environmental sound signal is the sound before playback starts, which avoids the influence of the played sound on the collection of the environmental sound signal and improves the accuracy of the collection.
The environmental sound feature obtaining module 130 is configured to perform feature extraction on the environmental sound signal to obtain an environmental sound feature.
The environmental sound feature obtained by feature extraction from the environmental sound signal can be a numerical value or an image.
In one embodiment, the environmental sound feature is a numerical value. The environmental sound feature obtaining module 130 includes an analog-to-digital conversion unit (not shown), a spectrum analysis unit (not shown), a characteristic value calculation unit (not shown) and an environmental sound feature calculation unit (not shown).
The analog-to-digital conversion unit is configured to convert the environmental sound signal into a digital signal.
The spectrum analysis unit is configured to perform spectrum analysis on the digital signal to obtain frequency information including a plurality of frequency points. The spectrum analysis can specifically be carried out with a Fourier transform, which yields the frequency points contained in the digital signal.
The characteristic value calculation unit is configured to calculate, according to the frequency information, the average value of the frequency points in each preset frequency band, as the characteristic value of that preset frequency band.
If the frequency points within a preset frequency band are continuous, the average can be obtained by dividing their sum over the band by the spectrum length; if the frequency points within a band are discrete, the average can be obtained directly by summing the frequency points and dividing by their number.
In one embodiment, the preset frequency bands are 20 Hz-200 Hz, 200 Hz-700 Hz, 700 Hz-2000 Hz, 2000 Hz-7000 Hz and 7000 Hz-15000 Hz. Dividing the frequency range normally audible to the human ear in this way allows feature extraction to be carried out in a targeted manner and improves data processing efficiency.
The environmental sound feature calculation unit is configured to calculate the product of the characteristic value of each preset frequency band and its corresponding preset coefficient, and to sum the products to obtain the environmental sound feature. In this embodiment, the preset coefficients corresponding to the bands 20 Hz-200 Hz, 200 Hz-700 Hz, 700 Hz-2000 Hz, 2000 Hz-7000 Hz and 7000 Hz-15000 Hz are -100, -10, 0, 10 and 100, respectively.
By using the analog-to-digital conversion unit, the spectrum analysis unit, the characteristic value calculation unit and the environmental sound feature calculation unit to perform frequency analysis on the environmental sound signal and using the numerical value calculated from the resulting frequency information as the environmental sound feature, the feature is expressed in quantized form, which is convenient for data analysis and processing.
The reference sound feature selection module 150 is configured to select, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature.
If the environmental sound feature is a numerical value, the reference sound features are also numerical values, and the similarity between a reference sound feature and the environmental sound feature is obtained from the difference between the two: the smaller the difference, the greater the similarity. If the environmental sound feature is an image, the reference sound features are also images, and the similarity is obtained by comparing the images: the smaller the difference between the images, the greater the similarity.
The reference sound features can be obtained by collection and analysis in advance. In one embodiment, the sound effect processing system further includes a reference sound signal collection module (not shown), a reference sound signal segmentation module (not shown) and a reference sound feature obtaining module (not shown).
The reference sound signal collection module is configured to collect, for each of a plurality of model scenes, the sound signal within a preset duration, to obtain a plurality of reference sound signals. The number of model scenes equals the number of reference sound features, and the preset duration is longer than the duration corresponding to the preset time range. In this embodiment there are five model scenes: a living room in the daytime, a living room at night, a hotel, a shopping mall and a restaurant, corresponding respectively to five situations: noisy with voices, quiet, silent, mixed noise, and open space. It can be understood that, in other embodiments, the model scenes can be other scenes. In this embodiment, the preset duration is 10000 seconds.
The reference sound signal segmentation module is configured to divide each reference sound signal into a plurality of signal segments according to the ratio of the preset duration to the duration corresponding to the preset time range.
A reference sound signal is the sound signal collected continuously over the preset duration. It can be divided into signal segments by cutting it in chronological order into intervals whose length equals the duration of the preset time range, with the final remainder shorter than one interval kept as its own segment. In this way, the reference sound signal of each model scene is divided into a plurality of signal segments.
The reference sound feature obtaining module is configured to extract the sound feature of each signal segment and to obtain the sound feature of the corresponding reference sound signal from the sound features of the signal segments, thereby obtaining the reference sound feature.
By using the reference sound signal collection module, the reference sound signal segmentation module and the reference sound feature obtaining module to collect the reference sound signals of a plurality of model scenes in advance and performing feature extraction to obtain the reference sound feature of each model scene, the obtained reference sound features are highly representative and accurate.
The matching sound effect mode lookup module 170 is configured to look up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode.
In this embodiment, the preset sound effect modes are five modes: news, night, movie theatre, standard and music, corresponding respectively to the reference sound features of the five model scenes: living room in the daytime, living room at night, hotel, shopping mall and restaurant. Three standard techniques are typically used in configuring a sound effect mode: total-sonic, total volume and total surround. Total-sonic has two states, on/off; total volume has three states, normal/night/off; and total surround has two states, on/off.
The sound effect processing module 190 is configured to perform sound effect processing on the sound signal to be played according to the matching sound effect mode.
After the matching sound effect mode is obtained, sound effect processing is performed according to it: specifically, the standard techniques corresponding to the matching sound effect mode are applied to the sound signal to be played automatically, so that the output sound signal suits the current environment. The sound signal to be played can be the television sound signal of a television set.
In the above sound effect processing system, the environmental sound signal collection module 110 collects an environmental sound signal within a preset time range, and the environmental sound feature obtaining module 130 performs feature extraction on it to obtain an environmental sound feature; the reference sound feature selection module 150 then selects, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature, the matching sound effect mode lookup module 170 looks up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode, and the sound effect processing module 190 performs sound effect processing on the sound signal to be played according to the matching sound effect mode. In this way, the most suitable matching sound effect mode is chosen automatically according to the environmental sound feature of the environmental sound signal, and the sound effect processing result is good; at the same time, no user operation is needed, which improves convenience for the user.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of the technical features in the above embodiments is described; however, as long as a combination of these technical features contains no contradiction, it should be regarded as falling within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the concept of the invention, and these all fall within the scope of protection of the invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (8)

1. A sound effect processing method, characterized by comprising:
collecting an environmental sound signal within a preset time range;
performing feature extraction on the environmental sound signal to obtain an environmental sound feature;
selecting, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature;
looking up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode;
performing sound effect processing on a sound signal to be played according to the matching sound effect mode;
and, before the collecting of the environmental sound signal within the preset time range, further comprising: collecting, for each of a plurality of model scenes, the sound signal within a preset duration, to obtain a plurality of reference sound signals, wherein the number of the model scenes is equal to the number of the reference sound features and the preset duration is longer than the duration corresponding to the preset time range; dividing each reference sound signal into a plurality of signal segments according to the ratio of the preset duration to the duration corresponding to the preset time range; and extracting the sound feature of each signal segment and obtaining the sound feature of the corresponding reference sound signal from the sound features of the signal segments, thereby obtaining the reference sound features.
2. The sound effect processing method according to claim 1, characterized in that the preset time range is the time range that starts at the moment a play instruction is received and lasts for a preset value.
3. The sound effect processing method according to claim 1, characterized in that the performing feature extraction on the environmental sound signal to obtain an environmental sound feature comprises:
converting the environmental sound signal into a digital signal;
performing spectrum analysis on the digital signal to obtain frequency information including a plurality of frequency points;
calculating, according to the frequency information, the average value of the frequency points in each preset frequency band as the characteristic value of that preset frequency band; and
calculating the product of the characteristic value of each preset frequency band and its corresponding preset coefficient, and summing the products to obtain the environmental sound feature.
4. The sound effect processing method according to claim 3, characterized in that the preset frequency bands comprise 20 Hz-200 Hz, 200 Hz-700 Hz, 700 Hz-2000 Hz, 2000 Hz-7000 Hz and 7000 Hz-15000 Hz.
5. A sound effect processing system, characterized by comprising:
an environmental sound signal collection module, configured to collect an environmental sound signal within a preset time range;
an environmental sound feature obtaining module, configured to perform feature extraction on the environmental sound signal to obtain an environmental sound feature;
a reference sound feature selection module, configured to select, from a plurality of preset reference sound features, the reference sound feature with the greatest similarity to the environmental sound feature;
a matching sound effect mode lookup module, configured to look up the preset sound effect mode corresponding to the selected reference sound feature to obtain a matching sound effect mode;
a sound effect processing module, configured to perform sound effect processing on a sound signal to be played according to the matching sound effect mode;
and further comprising a reference sound signal collection module, a reference sound signal segmentation module and a reference sound feature obtaining module; the reference sound signal collection module is configured to collect, for each of a plurality of model scenes, the sound signal within a preset duration, to obtain a plurality of reference sound signals, wherein the number of the model scenes is equal to the number of the reference sound features and the preset duration is longer than the duration corresponding to the preset time range; the reference sound signal segmentation module is configured to divide each reference sound signal into a plurality of signal segments according to the ratio of the preset duration to the duration corresponding to the preset time range; and the reference sound feature obtaining module is configured to extract the sound feature of each signal segment and to obtain the sound feature of the corresponding reference sound signal from the sound features of the signal segments, thereby obtaining the reference sound features.
6. The sound effect processing system according to claim 5, characterized in that the preset time range is the time range that starts at the moment a play instruction is received and lasts for a preset value.
7. The sound effect processing system according to claim 5, characterized in that the environmental sound feature obtaining module comprises:
an analog-to-digital conversion unit, configured to convert the environmental sound signal into a digital signal;
a spectrum analysis unit, configured to perform spectrum analysis on the digital signal to obtain frequency information including a plurality of frequency points;
a characteristic value calculation unit, configured to calculate, according to the frequency information, the average value of the frequency points in each preset frequency band as the characteristic value of that preset frequency band; and
an environmental sound feature calculation unit, configured to calculate the product of the characteristic value of each preset frequency band and its corresponding preset coefficient, and to sum the products to obtain the environmental sound feature.
8. The sound effect processing system according to claim 7, characterized in that the preset frequency bands comprise 20 Hz-200 Hz, 200 Hz-700 Hz, 700 Hz-2000 Hz, 2000 Hz-7000 Hz and 7000 Hz-15000 Hz.
CN201611092855.7A 2016-11-30 2016-11-30 sound effect processing method and system Active CN106792253B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611092855.7A CN106792253B (en) 2016-11-30 2016-11-30 sound effect processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611092855.7A CN106792253B (en) 2016-11-30 2016-11-30 sound effect processing method and system

Publications (2)

Publication Number Publication Date
CN106792253A CN106792253A (en) 2017-05-31
CN106792253B true CN106792253B (en) 2019-07-09

Family

ID=58915750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611092855.7A Active CN106792253B (en) 2016-11-30 2016-11-30 sound effect processing method and system

Country Status (1)

Country Link
CN (1) CN106792253B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107910018A (en) * 2017-10-30 2018-04-13 广州视源电子科技股份有限公司 Sound effect processing method and system, computer storage medium and equipment
CN109002275B (en) * 2018-07-03 2021-12-07 百度在线网络技术(北京)有限公司 AR background audio processing method and device, AR equipment and readable storage medium
CN109121068A (en) * 2018-07-04 2019-01-01 广州视源电子科技股份有限公司 Sound effect control method and device and electronic equipment
WO2021237650A1 (en) * 2020-05-29 2021-12-02 Nokia Technologies Oy Noise control
CN116453492A (en) * 2023-06-16 2023-07-18 成都小唱科技有限公司 Method and device for switching jukebox airport scenes, computer equipment and storage medium
CN117593949B (en) * 2024-01-19 2024-03-29 成都金都超星天文设备有限公司 Control method, equipment and medium for astronomical phenomena demonstration of astronomical phenomena operation

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103051794A (en) * 2012-12-18 2013-04-17 广东欧珀移动通信有限公司 Method and device for dynamically setting sound effect of mobile terminal
CN103219011A (en) * 2012-01-18 2013-07-24 联想移动通信科技有限公司 Noise reduction method, noise reduction device and communication terminal
CN104811864A (en) * 2015-04-20 2015-07-29 深圳市冠旭电子有限公司 Method and system for self-adaptive adjustment of audio effect
WO2015142659A1 (en) * 2014-03-17 2015-09-24 Adaptive Sound Technologies, Inc. Systems and methods for automatic signal attenuation
CN104966522A (en) * 2015-06-30 2015-10-07 广州酷狗计算机科技有限公司 Sound effect regulation method, cloud server, stereo device and system
CN106060643A (en) * 2016-06-28 2016-10-26 乐视控股(北京)有限公司 Method and device for playing multimedia file and earphones

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201414274A (en) * 2012-09-21 2014-04-01 Hon Hai Prec Ind Co Ltd Sound effect managing system and sound effect managing method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103219011A (en) * 2012-01-18 2013-07-24 联想移动通信科技有限公司 Noise reduction method, noise reduction device and communication terminal
CN103051794A (en) * 2012-12-18 2013-04-17 广东欧珀移动通信有限公司 Method and device for dynamically setting sound effect of mobile terminal
WO2015142659A1 (en) * 2014-03-17 2015-09-24 Adaptive Sound Technologies, Inc. Systems and methods for automatic signal attenuation
CN104811864A (en) * 2015-04-20 2015-07-29 深圳市冠旭电子有限公司 Method and system for self-adaptive adjustment of audio effect
CN104966522A (en) * 2015-06-30 2015-10-07 广州酷狗计算机科技有限公司 Sound effect regulation method, cloud server, stereo device and system
CN106060643A (en) * 2016-06-28 2016-10-26 乐视控股(北京)有限公司 Method and device for playing multimedia file and earphones

Also Published As

Publication number Publication date
CN106792253A (en) 2017-05-31

Similar Documents

Publication Publication Date Title
CN106792253B (en) sound effect processing method and system
JP6855527B2 (en) Methods and devices for outputting information
US7593618B2 (en) Image processing for analyzing video content
CN107708100B (en) Advertisement playing method based on user position information
CN105074697B (en) For inferring the accumulation of the real-time crowdsourcing data of the metadata about entity
US11445242B2 (en) Media content identification on mobile devices
US8447761B2 (en) Lifestyle collecting apparatus, user interface device, and lifestyle collecting method
US10166472B2 (en) Methods and systems for determining a reaction time for a response and synchronizing user interface(s) with content being rendered
CN103688531A (en) Control device, control method and program
CN1282470A (en) Voice recognition unit for audience measurement system
CN110519539A (en) For refunding the method, system and medium of media content based on audio event detected
CN110047497B (en) Background audio signal filtering method and device and storage medium
CN103873919B (en) A kind of information processing method and electronic equipment
CN103945140B (en) The generation method and system of video caption
CN108429955A (en) Release ambient sound enters the intelligent apparatus and method of earphone
CN103347070A (en) Method, terminal, server and system for voice data pushing
CN104900237B (en) A kind of methods, devices and systems for audio-frequency information progress noise reduction process
CN103945074B (en) A kind of CRBT method for customizing and system
CN112738705A (en) Sound effect adjusting method and device, electronic equipment and storage medium
CN109040778B (en) Video cover determining method, user equipment, storage medium and device
CN111552836A (en) Lyric display method, device and storage medium
CN111477244A (en) User-defined sports event description enhancing method
CN110708600A (en) Method and apparatus for identifying valid viewers of a television
CN105551504A (en) Method and device for triggering function application of intelligent mobile terminal based on crying sound
CN110460800B (en) Remote rehearsal system and rehearsal method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant