CN100375996C

CN100375996C - Method for judging low-frequency audio signal in sound signals and apparatus concerned

Info

Publication number: CN100375996C
Application number: CNB031545823A
Authority: CN
Inventors: 吴俊德
Original assignee: MediaTek Inc
Current assignee: MediaTek Inc
Priority date: 2003-08-19
Filing date: 2003-08-19
Publication date: 2008-03-19
Anticipated expiration: 2023-08-19
Also published as: CN1584974A

Abstract

The present invention provides a method, and a related apparatus, for judging whether a human voice signal is mixed into a sound signal. In a multichannel system, the method counts, for the sound signal in each channel, the number of times per unit time that the amplitude of the sound signal crosses zero. If the number of zero crossings of the sound signal in a first channel is smaller than the number of zero crossings of the sound signal in a second channel by more than a threshold value, the sound signal in the first channel is judged to be mixed with the human voice signal.

Description

Method and related apparatus for judging whether a low-frequency sound signal is mixed into a sound signal

Technical field

The invention provides a method and a related apparatus for judging whether a low-frequency sound (human voice) signal is mixed into a sound signal, and in particular a low-cost, low-computation method and related apparatus that discriminate the human voice signal by counting how frequently zero crossings occur in the sound signal.

Background technology

With the progress and spread of information and electronic technology, forms of entertainment in modern society have become increasingly varied. For instance, a sing-along system, known as karaoke, plays the background music of a song so that the user can sing along with it without a live band and enjoy a professional entertainment environment. To meet the needs of sing-along systems, when a modern entertainment company releases a song performed by a professional singer, it can also release the background music of that song without the singer's vocal. After listening to the song performed by the professional singer, the user can then play the background music on a sing-along system and enjoy singing it personally.

Thanks to rapid progress in information storage and communications, present electronic technology can store the song that contains the vocal and the background music that does not contain the vocal synchronously on the same medium as different sound channels, so that the user can select one of them for playback. Please refer to Fig. 1. Fig. 1 is a functional block diagram of a conventional playback device 10. Playback device 10 can be an optical disc player or an optical disc drive that works with a computer (not shown); it reads the audio-visual song data stored on an optical disc 24C and plays it. Playback device 10 performs its function with a playback circuit 12, which is provided with a receiving circuit 14, a processing module 16, a conversion circuit 18, an interface circuit 20 and a speaker 22. Receiving circuit 14 is provided with a motor 24A and a read head 24B, so as to read and resolve the signal 25 carried by optical disc 24C, which serves as the information storage medium. Processing module 16 is the master control of playback device 10 and is provided with a processing unit 26A and a selection circuit 26B. Processing unit 26A performs further signal processing (such as demodulation and decoding) on signal 25 produced by receiving circuit 14. As mentioned above, existing data processing technology can store the song containing the vocal and the background music without the vocal on the same medium (such as optical disc 24C) as different sound channels, so processing unit 26A can resolve the sound signals 27A and 27B of the different channels from signal 25. In addition, interface circuit 20 can be a control panel that accepts the user's control operations, converts them into electronic signals and transfers them to processing module 16, so that processing module 16 can control the operation of playback device 10 according to the user's control operations. Selection circuit 26B, for example, can accept the user's control through interface circuit 20 and select one of sound signals 27A and 27B as signal 29A, which is transferred to conversion circuit 18. Conversion circuit 18 can be a digital-to-analog conversion circuit that converts the digital signal 29A transmitted by selection circuit 26B into an analog signal 29B; analog signal 29B drives speaker 22, which plays the sound wave corresponding to signal 29A so that the user can hear it.

In other words, in conventional playback device 10, processing unit 26A can resolve the sound signals 27A and 27B of the different channels that are stored together on optical disc 24C, and the user's control operation through interface circuit 20 selects whether sound signal 27A or 27B is played. In general, existing audio-visual information specifications (such as the DVD specification, Digital Versatile Disc) normally define a left channel and a right channel, which can store different sound signals. Using the left and right channels, optical disc 24C can store the song containing the vocal and the background music without the vocal at the same time, and the user can choose, by a switching operation on playback device 10, to play either the song with the vocal or the background music without the vocal.

Although the above configuration lets the user play the sound signals of different channels and enjoy each of them, existing audio-visual information specifications do not specify which of the left and right channels should store the song with the vocal and which should store the background music. Among the music media on the market, some therefore store the vocal-free background music in the left channel and others store it in the right channel, with no agreed convention. As a result, the user has to operate playback device 10 by trial and error before the desired sound is played. For instance, a user who wants to play the vocal-free background music for singing along, but cannot determine which channel holds it, can only select one channel first; if what plays is the song with the vocal, the user must operate playback device 10 again to switch to the other channel before the vocal-free background music finally plays. This is naturally quite inconvenient for the user, and the control operation is tedious.

Summary of the invention

Therefore, the main purpose of the present invention is to propose a method and a related apparatus that can automatically detect the channel in which the human voice signal is located, so as to overcome the shortcomings of the conventional technology.

In the conventional technology, the music medium may store the vocal-free background music in either the left or the right channel, with no fixed standard, and the conventional playback device cannot automatically detect the channel in which the human voice signal is located. The user can only guess and test by trial and error what kind of music the left and right channels each store, which is inconvenient for the user.

The present invention instead uses the principle that the frequency of the human voice is lower than that of the background music. It counts and compares how frequently zero crossings (the level of the sound signal crossing the zero level) occur in the sound signals of the two channels; if zero crossings occur much less frequently in one channel than in the other, the human voice signal can be judged to be mixed into that channel. After detecting the channel in which the human voice signal is located, the playback device of the present invention can automatically select the left or right channel for playback according to whether the user wants to play the background music. The user therefore no longer has to operate the playback device blindly by trial and error.

The human voice channel detection method disclosed by the present invention requires little computation, so it can be implemented simply, quickly and cheaply in software, hardware or firmware. Besides detecting the channel in which the human voice is located, the disclosed method can also be extended to the automatic detection of the channel in which a low-frequency signal is located, achieving fast and effective low-frequency signal detection with an extremely low amount of computation.

Description of drawings

Fig. 1 is a functional block diagram of a conventional playback device.

Fig. 2 is a schematic diagram of typical waveforms of various sound signals.

Fig. 3 schematically shows the algorithm with which the present invention judges the human voice channel.

Fig. 4 is a functional block diagram of the playback device of the present invention that implements the algorithm of Fig. 3.

Fig. 5 is a table of the accumulated zero-crossing counts in the different channels in an actual implementation of the present invention.

Explanation of reference numerals

10, 30 playback device
12, 32 playback circuit
14, 34 receiving circuit
16, 36 processing module
18, 38 conversion circuit
20, 40 interface circuit
22, 42 speaker
24A, 43A motor
24B, 43B read head
24C, 43C optical disc
25, 29A-29B, 45, 49A-49B signal
26A, 46A processing unit
26B, 46B selection circuit
27A-27B, 47A-47B sound signal
50 judgment circuit
52A-52B detection module
54 comparison module
56A-56B zero-crossing count result
58 comparison result
100 algorithm
200 table
CL1, CL2 column
RW1-RW14 row
C1 comparing unit
C2 computing unit
D delay unit
Vn, Mn, Sn waveform
t1-t5 time point
T1-T2 time period
L0 reference level
L1-L3 level

Embodiment

To further explain the principle of the technique of the present invention, please refer to Fig. 2. Fig. 2 is a schematic diagram of the waveforms corresponding to various sound signals; the horizontal axis of each waveform represents time, and the vertical axis represents the amplitude of the waveform. As is known to those skilled in the art, a digital sound signal represents the amplitude of a sound wave at successive sampling time points with sequentially arranged groups of data. Assembling the groups of data in the sound signal reconstructs the amplitude of the sound wave to which the sound signal corresponds. For instance, in Fig. 2, the amplitudes L1, L2, L3 and so on, recorded in the groups of data of a sound signal for the sampling time points t1, t2, t3 and so on, together form waveform Sn. In Fig. 2, waveform Vn represents the typical waveform of a sound signal that contains only the human voice, waveform Mn represents the typical waveform of a sound signal that contains only background music, and waveform Sn is the typical waveform of a mixture of human voice and background music, that is, the result of mixing waveforms Vn and Mn (for example by additive mixing). The reference level L0 marked in each of waveforms Mn, Vn and Sn represents the zero level, at which the amplitude is zero.

Basically, the human voice part of a song is usually of comparatively low frequency, and its waveform varies comparatively gently, as shown by waveform Vn in Fig. 2. By contrast, the music played by instruments in the background music usually has higher frequencies, and the various instruments start and stop playing at different moments, so waveform Mn of the background music usually varies more violently and its amplitude oscillates frequently between positive and negative, as shown in Fig. 2. When the lower-frequency human voice waveform Vn and the higher-frequency background music waveform Mn are mixed to form a song, the resulting waveform Sn shows a high-frequency signal riding on a low-frequency signal, as shown in Fig. 2. Comparing waveform Mn, which contains only background music, with song waveform Sn, which is mixed with the human voice, shows that although waveform Sn still contains the violently varying high-frequency part, the added lower-frequency human voice part keeps its amplitude from oscillating between positive and negative as frequently. In other words, the number of times per unit time that the amplitude of waveform Sn, which is mixed with the human voice, passes through the zero level (that is, the number of zero crossings) is much smaller than that of waveform Mn, which contains only background music. For instance, as shown in Fig. 2, within time period T1 the violently oscillating high-frequency waveform Mn has nine zero crossings (such as between time points t4a and t4b, between t5a and t5b, and so on), whereas waveform Sn, into which the low-frequency human voice is mixed, has only three zero crossings (such as between time points t6a and t6b, and so on). Likewise, in the following time period T2 and so on, the number of zero crossings per unit time of waveform Sn, which is mixed with the low-frequency human voice, is much smaller than that of waveform Mn, which contains only background music. Based on this characteristic of sound signals, the present invention uses the number of zero crossings per unit time (that is, how frequently zero crossings occur) to compare the sound signals and judge which channel's sound signal is mixed with the human voice and which channel's sound signal contains only background music.
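The relation between waveforms Mn and Sn described for Fig. 2 can be reproduced with synthetic waveforms. The following is a minimal sketch and is not taken from the patent: the 200 Hz and 2000 Hz sine tones, their amplitudes, the 0.1 second duration and the helper name `zero_crossings` are all illustrative assumptions standing in for waveforms Vn, Mn and Sn.

```python
import math

SAMPLE_RATE = 44100          # samples per second (assumed for illustration)
N = SAMPLE_RATE // 10        # 0.1 second of audio

def zero_crossings(samples):
    """Count adjacent sample pairs whose product is negative."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)

t = [i / SAMPLE_RATE for i in range(N)]
# Hypothetical stand-ins for the waveforms of Fig. 2:
vn = [1.0 * math.sin(2 * math.pi * 200 * x + 0.1) for x in t]    # low-frequency "voice"
mn = [0.3 * math.sin(2 * math.pi * 2000 * x + 0.1) for x in t]   # high-frequency "music"
sn = [v + m for v, m in zip(vn, mn)]                             # additive mix

print("Mn:", zero_crossings(mn), "Sn:", zero_crossings(sn))
```

In this sketch, adding the strong low-frequency tone to the high-frequency tone lowers the zero-crossing count, which mirrors the relation between waveforms Mn and Sn. Real music and vocals are far less regular than sine tones, so the sketch only illustrates the trend.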

To sum up, it suffices to count the number of zero crossings of the sound signals of the two channels within a certain period of time; if the zero-crossing count of one sound signal A is much smaller than that of the other sound signal B, sound signal A can be judged to be mixed with the low-frequency human voice signal. Please refer to Fig. 3. Algorithm 100 in Fig. 3 presents the above technical concept of the present invention in a programming language. In algorithm 100, parameters LnZCR and RnZCR record the zero-crossing counts of the sound signals in the left and right channels respectively, and parameters Ln and Rn represent the sound signals in the left and right channels respectively. As mentioned above, a sound signal records the amplitudes at the different sampling time points in its groups of data, so parameters Ln and Rn can be regarded as array parameters whose different indexes refer to the individual groups of data in the sound signal. As shown in Fig. 3, part A1 of algorithm 100 accumulates the number of zero crossings in sound signal Ln. For each index I, it checks the sign of the product of two adjacent groups of data Ln(I) and Ln(I+1) (that is, the amplitudes at two adjacent sampling time points of the sound signal); a negative product means that the sound signal corresponding to parameter Ln crossed zero between the sampling time points of these two groups of data, so parameter LnZCR is incremented by 1 to record one more zero crossing in the sound signal corresponding to parameter Ln. In an actual implementation of part A1, a parameter SampleLength can set the upper limit of index I. In other words, parameter SampleLength corresponds to a preset time period; part A1 of algorithm 100 counts the accumulated zero crossings of the sound signal corresponding to parameter Ln within this preset time period and stores the result in parameter LnZCR. Likewise, part A2 of the algorithm counts the zero crossings of the sound signal corresponding to parameter Rn (that is, the sound signal of the other channel) within the same preset time period (likewise controlled by parameter SampleLength) and stores the accumulated count in parameter RnZCR.
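Fig. 3 itself is not reproduced in this text, so the following is only a sketch of how parts A1 and A2 could look, assuming the sound signal is available as a list of signed amplitudes. The names `ln`, `rn`, `sample_length`, `ln_zcr` and `rn_zcr` mirror parameters Ln, Rn, SampleLength, LnZCR and RnZCR; the function name `count_zero_crossings` and the sample values are hypothetical.

```python
def count_zero_crossings(signal, sample_length):
    """Count zero crossings among the first `sample_length` samples.

    A crossing is recorded when the product of two adjacent samples,
    signal[i] and signal[i + 1], is negative.
    """
    zcr = 0
    last = min(sample_length, len(signal)) - 1
    for i in range(last):
        if signal[i] * signal[i + 1] < 0:
            zcr += 1
    return zcr

# Hypothetical sample data for the left and right channels.
ln = [3, -2, 4, -1, 2, -5, 1, -3]
rn = [3, 5, 4, 1, -2, -5, -1, 2]
sample_length = 8

ln_zcr = count_zero_crossings(ln, sample_length)   # part A1
rn_zcr = count_zero_crossings(rn, sample_length)   # part A2
print(ln_zcr, rn_zcr)
```

Note that a sample equal to zero never produces a negative product, so this sketch does not count a crossing that passes exactly through a zero-valued sample.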

Part A3 of algorithm 100 of the present invention compares the zero-crossing counts of the two sound signals to judge which channel's sound signal is mixed with the low-frequency human voice. As shown in algorithm 100 of Fig. 3, if the zero-crossing count LnZCR of the sound signal corresponding to parameter Ln is much larger than the zero-crossing count RnZCR of the other sound signal (the two differ by more than a preset threshold), the sound signal corresponding to parameter Rn can be judged to be mixed with the lower-frequency human voice. Conversely, if the zero-crossing count LnZCR of the sound signal corresponding to parameter Ln within the preset time period is much smaller than the zero-crossing count RnZCR of the other sound signal within the same preset time period (the difference exceeds the threshold), the sound signal corresponding to parameter Ln can be judged to be mixed with the human voice. If the relation between the zero-crossing counts LnZCR and RnZCR of the two channels meets neither condition (that is, the difference between the two counts is less than the threshold), the sound signals of both channels may be mixed with the human voice, or neither may be. In this situation the present invention can take other steps. For instance, if the sound signals of both channels are mixed with the human voice, a reduction step can pass the sound signal through a specific filter or other signal processing to reduce or filter out the low-frequency human voice in the sound signal, for example by using a band-stop filter to remove the human voice frequency band from the sound signal.
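The three outcomes of part A3 can be sketched as a small function. This is an illustration under assumptions rather than the code of Fig. 3: the function name, the string return values and the example counts are hypothetical, and a difference exactly equal to the threshold is treated here as undecided, a case the text does not spell out.

```python
def judge_vocal_channel(ln_zcr, rn_zcr, threshold):
    """Return 'left' or 'right' for the channel judged to carry the voice,
    or None when the counts differ by no more than the threshold."""
    if ln_zcr - rn_zcr > threshold:
        # Left crosses zero far more often, so the voice is in the right channel.
        return "right"
    if rn_zcr - ln_zcr > threshold:
        # Right crosses zero far more often, so the voice is in the left channel.
        return "left"
    # Undecided: both channels, or neither, may contain the voice.
    return None

# Hypothetical counts with a hypothetical threshold of 200.
print(judge_vocal_channel(900, 300, 200))   # right
print(judge_vocal_channel(300, 900, 200))   # left
print(judge_vocal_channel(500, 450, 200))   # None
```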

In other words, by using algorithm 100 disclosed in Fig. 3 to compare the number of zero crossings per unit time (the preset time period) of the sound signals of the different channels, the present invention can judge which channel's sound signal is mixed with the human voice. Note that algorithm 100 of the present invention requires extremely little computation: it only needs to compare the signs of two adjacent groups of data in the sound signal to judge whether a zero crossing has occurred, and to accumulate the number of zero crossings. Algorithm 100 can therefore be implemented simply, quickly, cheaply and efficiently in various forms such as software, hardware circuits or firmware, and needs none of the tedious, computation-heavy data and signal processing such as filtering or spectrum calculation. In fact, in a typical digital sound signal, each group of data representing an amplitude has one bit that represents the sign of the amplitude (the sign bit), so judging whether a zero crossing occurs between two adjacent groups of data only requires an exclusive-OR (XOR) operation on the sign bits of the two groups of data; if the sign bits differ, a zero crossing has occurred. Using the exclusive-OR of the sign bits to judge zero crossings lets algorithm 100 of the present invention run even faster and with even less computation.
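The sign-bit test can be sketched as follows, assuming 16-bit two's-complement samples (the bit width, the function names and the sample values are assumptions of this sketch, not details given in the patent).

```python
def sign_bit(sample, bits=16):
    """Return the sign bit of a two's-complement sample (1 means negative)."""
    return (sample >> (bits - 1)) & 1

def crossed_zero(prev, curr, bits=16):
    """True when the sign bits of two adjacent samples differ."""
    return (sign_bit(prev, bits) ^ sign_bit(curr, bits)) == 1

# Hypothetical 16-bit sample values.
samples = [1200, 300, -150, -900, 40, 0, -7]
count = sum(crossed_zero(a, b) for a, b in zip(samples, samples[1:]))
print(count)
```

One design consequence is worth noting: a sample of exactly zero has sign bit 0 and is therefore treated as positive, so the pair (0, -7) counts as a crossing here, whereas a test based on a negative product would not count it.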

Please refer to Fig. 4. Fig. 4 is a functional block diagram of the present invention implemented in a playback device 30. Playback device 30 performs its function with a playback circuit 32, which is provided with a receiving circuit 34, a processing module 36, an interface circuit 40, a conversion circuit 38 and a speaker 42. Playback device 30 can be an optical disc player or an optical disc drive, and can be provided with a motor 43A and a laser read head 43B to read a signal 45 (such as an audio-visual signal) from an optical disc 43C. Processing module 36 can be provided with a processing unit 46A, a judgment circuit 50 and a selection circuit 46B. Interface circuit 40 can be a control panel that accepts the user's control, and processing module 36 can control the operation of playback device 30 according to the user control received by interface circuit 40. Processing unit 46A in processing module 36 can perform further signal processing (such as decoding and demodulation) on signal 45 to resolve the sound signals 47A and 47B of the left and right channels from signal 45; under the control of selection circuit 46B, one of sound signals 47A and 47B is selected as signal 49A. Conversion circuit 38 can convert the digital signal 49A into an analog signal 49B and transfer it to speaker 42, which converts signal 49B into sound waves and plays them.

In playback device 30 of the present invention, selection circuit 46B can, like that of conventional playback device 10, play the sound signal of the channel that the user selects manually through interface circuit 40. In addition, judgment circuit 50 can implement the algorithm of the present invention shown in Fig. 3 to automatically identify, among sound signals 47A and 47B of the left and right channels, the sound signal that is mixed with the human voice, and to control selection circuit 46B to select the appropriate sound signal as signal 49A. In other words, besides letting the user manually switch between the sound signals of the left and right channels, the user interface of playback device 30 of the present invention can also provide an operating mode such as a "karaoke mode" (which could also be called a "no-vocal mode"). Once the user enters this mode, judgment circuit 50 of the present invention starts working, automatically selects from sound signals 47A and 47B the sound signal that is not mixed with the human voice as signal 49A, and has it played through conversion circuit 38 and speaker 42. The user therefore does not need tedious trial and error to find the vocal-free background music among the left and right channels. Equivalently, playback device 30 of the present invention can also have a "song mode"; once the user puts playback device 30 into this mode, judgment circuit 50 selects from sound signals 47A and 47B the song sound signal that is mixed with the human voice and has it played.
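The two operating modes amount to a simple selection rule once the channel carrying the human voice is known. The sketch below is illustrative only: the function name, the mode strings and the fallback to a manual choice when no judgment is available are assumptions, not behavior specified in the patent.

```python
def select_channel(mode, vocal_channel, manual_choice="left"):
    """Pick the channel to play.

    mode:          'karaoke' (play the channel without voice) or
                   'song' (play the channel with voice)
    vocal_channel: 'left', 'right', or None when no judgment is available
    manual_choice: fallback chosen by the user (an assumption of this sketch)
    """
    if vocal_channel is None:
        return manual_choice
    other = "right" if vocal_channel == "left" else "left"
    if mode == "karaoke":
        return other
    if mode == "song":
        return vocal_channel
    raise ValueError(f"unknown mode: {mode!r}")

print(select_channel("karaoke", "right"))   # left
print(select_channel("song", "right"))      # right
```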

To implement algorithm 100 of the present invention shown in Fig. 3, judgment circuit 50 can contain two detection modules 52A and 52B and a comparison module 54. Detection modules 52A and 52B count the zero crossings in sound signals 47A and 47B of the left and right channels respectively and produce zero-crossing count results 56A and 56B respectively; that is, detection modules 52A and 52B implement parts A1 and A2 of algorithm 100 in Fig. 3 respectively. Comparison module 54 implements part A3 of algorithm 100: according to the relative sizes of the zero-crossing counts of sound signals 47A and 47B within the preset time period, it automatically judges which sound signal is the background music not mixed with the human voice and produces a corresponding comparison result 58. According to comparison result 58, selection circuit 46B selects the appropriate one of sound signals 47A and 47B and transfers it to conversion circuit 38 as signal 49A. Detection modules 52A and 52B are implemented in essentially the same way. Taking detection module 52A as an example, it can contain a delay unit D and a comparing unit C1, which compares two successive groups of data in sound signal 47A to check whether one is positive and the other negative. As mentioned above, comparing unit C1 can be an exclusive-OR logic unit that checks whether the sign bits of two adjacent groups of data in sound signal 47A are the same. If one of the two groups of data is positive and the other negative, a zero crossing has occurred, and comparing unit C1 triggers a computing unit C2 to increment the zero-crossing count by 1. Otherwise, if the two groups of data have the same sign (both positive or both negative), comparing unit C1 does not trigger computing unit C2. After a certain preset time period (as defined by parameter SampleLength in Fig. 3), the accumulated zero-crossing count is transferred to comparison module 54 as zero-crossing count result 56A. In the present invention, the overall function of judgment circuit 50 can be realized with simple logic circuits or in the form of firmware. In other words, algorithm 100 in Fig. 3 can be compiled into program code and stored in a nonvolatile memory associated with processing module 36 (such as a flash memory, not shown in Fig. 4). By executing this program code, processing module 36 can realize the function of judgment circuit 50 and automatically judge which of sound signals 47A and 47B is mixed with the human voice.
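A software analogue of one detection module can make the roles of delay unit D, comparing unit C1 and computing unit C2 concrete. The class below is a hedged sketch: its name, its `push` method and the decision to clear the delayed sample at each window boundary are assumptions of this illustration rather than details of Fig. 4.

```python
class ZeroCrossingDetector:
    """Streaming sketch of a detection module fed one sample at a time."""

    def __init__(self, sample_length):
        self.sample_length = sample_length
        self._prev = None    # plays the role of delay unit D
        self._count = 0      # plays the role of computing unit C2
        self._seen = 0

    def push(self, sample):
        """Feed one sample; return the count when a window completes, else None."""
        # Role of comparing unit C1: do the signs of the two samples differ?
        if self._prev is not None and (self._prev < 0) != (sample < 0):
            self._count += 1
        self._prev = sample
        self._seen += 1
        if self._seen == self.sample_length:
            result = self._count
            # Start the next window from 0 (assumption: crossings that
            # straddle two windows are not counted).
            self._prev, self._count, self._seen = None, 0, 0
            return result
        return None

detector = ZeroCrossingDetector(sample_length=4)
results = [detector.push(s) for s in [1, -1, 1, 1, -2, -3, 4, 5]]
print(results)
```

Two such detectors, one per channel, would deliver their counts to the comparison step at the end of every window.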

To illustrate the results of an actual implementation of the present invention, please refer to Fig. 5 (together with Fig. 3). Table 200 in Fig. 5 lists the zero-crossing counts actually accumulated when algorithm 100 of the present invention (see Fig. 3) was applied to the sound signals of the left and right channels of a typical music medium. Columns CL1 and CL2 of table 200 record the zero-crossing counts of the left and right channels respectively, and column CL3 shows the result that algorithm 100 judges in part A3; each row (rows RW1, RW2 to RW14 as marked in Fig. 5) gives the zero-crossing counts accumulated by the two channels in a different time period. For table 200 of Fig. 5, the sound signals of the two channels have a sampling frequency of 44100 Hz, that is, each sound signal has 44100 groups of data per second; the preset time period over which zero crossings are accumulated is 1 second (that is, parameter SampleLength in algorithm 100 is set to 44100, because there are 44100 groups of data in one second); and the threshold in algorithm 100 is set to 200 for the comparison. Algorithm 100 is executed once more for every preset time period that elapses. For instance, row RW1 in Fig. 5 shows that from second N to second (N+1) the left and right channels have 4527 and 1308 zero crossings respectively; after part A3 of algorithm 100 is executed, the sound signal of the left channel can be judged not to be mixed with the human voice (because the zero-crossing count of the left channel is larger than that of the right channel, and the difference exceeds the threshold). From second (N+1) to second (N+2), algorithm 100 is executed again and counts the zero crossings of the two channels again from 0; as row RW2 shows, the left and right channels have 2569 and 1673 zero crossings respectively, and the human voice can likewise be judged to be mixed into the right channel. Row RW3 gives the zero-crossing counts accumulated by algorithm 100 from second (N+2) to second (N+3) and the comparison result. Finally, row RW14 gives the zero-crossing counts and comparison result of the two channels from second (N+13) to second (N+14). Actually listening to the left and right channels confirms that the human voice is indeed mixed into the right channel and that the left channel is the vocal-free background music. In summary, as Fig. 5 shows, algorithm 100 disclosed in Fig. 3 does correctly judge which channel's sound signal is mixed with the human voice.
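The per-second repetition described for Fig. 5 can be sketched as a generator that restarts both counts for every window. The function name and the tiny demonstration data are hypothetical; for the settings reported with Fig. 5 one would pass `sample_length=44100` and `threshold=200`.

```python
def judge_per_window(left, right, sample_length, threshold):
    """Yield (left_count, right_count, vocal_channel) for each full window."""
    def count(window):
        return sum(1 for a, b in zip(window, window[1:]) if a * b < 0)

    n = min(len(left), len(right))
    for start in range(0, n - sample_length + 1, sample_length):
        # Each window is counted independently, so both counts restart from 0.
        l_zcr = count(left[start:start + sample_length])
        r_zcr = count(right[start:start + sample_length])
        if l_zcr - r_zcr > threshold:
            vocal = "right"
        elif r_zcr - l_zcr > threshold:
            vocal = "left"
        else:
            vocal = None
        yield l_zcr, r_zcr, vocal

# Tiny hypothetical signals: two windows of four samples each.
left = [1, -1, 1, -1, 1, 1, 1, 1]
right = [1, 1, 1, 1, 1, -1, 1, -1]
decisions = list(judge_per_window(left, right, sample_length=4, threshold=1))
print(decisions)
```

Applied to the counts quoted from Fig. 5, the differences are 4527 - 1308 = 3219 for row RW1 and 2569 - 1673 = 896 for row RW2; both exceed the threshold of 200, so both windows point to the right channel as the one mixed with the human voice.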

As mentioned above, playback device 30 of the present invention (see Fig. 4) can provide a "karaoke mode" or a "song mode", in which judgment circuit 50 implements algorithm 100 to automatically judge the channel in which the human voice is located. In actual operation, judgment circuit 50 can follow the scheme of Fig. 5: for every preset time period it accumulates the zero-crossing count of each sound signal again from 0 and repeats the comparison and judgment between the two channels, and it can keep selecting the appropriate channel according to the comparison result of each time period. The threshold in algorithm 100 is set to prevent misjudgment. Because the number of zero crossings in each channel is a random value, in some rather special situations the channel mixed with the human voice may, in certain time periods, actually have more zero crossings than the vocal-free channel, but the two counts will differ only by a limited amount; setting a suitable threshold in algorithm 100 therefore prevents misjudgment. That is, judging the human voice channel from the zero-crossing counts is meaningful only when the counts of the two channels differ by more than the threshold; if they differ by too little (less than the threshold), the difference may be caused merely by a few randomly occurring zero crossings and carries little meaning. The example of Fig. 5 shows, however, that such special cases do not occur often.

Besides disc players and optical disc drives, the design of the present invention can be widely applied to other playback devices, and can even become part of a software player program in a computer. For instance, receiving circuit 34 in Fig. 4 need not be an optical disc servo mechanism as in Fig. 4; it can also be a wired or wireless network interface circuit that obtains audio-video signals over a wired or wireless network. Also, as discussed for part A3 of algorithm 100 in Fig. 3, an attenuation filter module (not shown in Fig. 4) can additionally be implemented in processing module 36; when the difference in zero-crossing counts between the two sound signals 47A and 47B exceeds the threshold, this module can attenuate or filter out the human voice in the sound signal. In addition, in a computer, audio-video files of certain special formats (such as music files in MP3 format) are usually decoded and played by player software, and the algorithm of the present invention can also be implemented in such player software, so that the player software itself can automatically judge the channel containing the human voice. Furthermore, as the discussion of the principle of the present invention in Fig. 2 shows, besides finding the channel containing the human voice, the present invention can also be widely applied in multichannel systems to find the channel mixed with a low-frequency signal, using a simple, fast, and efficient method with a low computational load and low cost.

Playback devices of the conventional art lack an effective method with a low computational load for judging which channel of a multichannel system contains the human voice, so the user can only switch channels manually by trial and error to find out which channel's signal is mixed with human voice. By comparison, the present invention discloses a method and related apparatus with low cost and a low computational load, which count the zero crossings of the sound signal of each channel within a preset time period and judge from the difference in zero-crossing counts which channel is mixed with human voice. The playback device of the present invention can thus judge the channel containing the human voice automatically, sparing the user the trouble of trial and error and providing a more convenient audio-video playback service.

The above is only a preferred embodiment of the present invention; all equivalent changes and modifications made according to the claims of the present invention shall fall within the scope of the present invention.

Claims

1. A method for judging whether a low-frequency sound signal is mixed in a sound signal; the sound signal includes multiple sets of data, each set of data representing the amplitude of a sound wave at a different time; the method includes:

Setting a reference level and a preset time period;

Carrying out a calculation step to calculate, according to the multiple sets of data, the number of times the amplitude of the sound wave crosses the reference level within the preset time period, and to produce a corresponding count result; and

Carrying out a judging step to judge, according to the count result, whether the low-frequency sound signal is mixed in the sound signal.

2. The method of claim 1, wherein, when judging according to the count result, if the count result is less than a preset value, it is judged that the low-frequency sound signal is mixed in the sound signal.

3. The method of claim 1, wherein, when judging according to the count result, if the count result is greater than a preset value, it is judged that the low-frequency sound signal is not mixed in the sound signal.

4. The method of claim 1, wherein the frequency band of the low-frequency sound signal is the frequency band of the human voice.

5. The method of claim 1, wherein, when carrying out the calculation step, among the multiple sets of data corresponding to the preset time period, one set of data and the next set of data are compared to determine whether one of them is greater than the reference level and the other is less than the reference level; if one of the set of data and the next set of data is greater than the reference level and the other is less than the reference level, it is judged that the sound wave crosses the reference level between the set of data and the next set of data.

6. The method of claim 1, wherein the reference level is a zero level.

7. The method of claim 1, further including: if it is judged that the low-frequency sound signal is mixed in the sound signal, carrying out a reduction step to reduce the magnitude of the low-frequency sound signal in the sound signal.

8. The method of claim 1, further including:

Obtaining a second sound signal, the second sound signal including multiple sets of data, each set of data representing the amplitude of a second sound wave at a different time;

Calculating, according to the multiple sets of data in the second sound signal, the number of times the amplitude of the second sound wave crosses the reference level within the preset time period, and producing a corresponding second count result; and

When carrying out the judging step, judging whether the low-frequency sound signal is mixed in the sound signal according to whether the count result of the sound signal is greater than the second count result.

9. The method of claim 8, wherein, when carrying out the judging step, if the count result is smaller than the second count result by a threshold value, it is judged that the low-frequency sound signal is mixed in the sound signal.

10. A playback circuit, including:

A decision circuit for judging whether a low-frequency sound signal is mixed in a sound signal;

The sound signal includes multiple sets of data, each set of data representing the amplitude of a sound wave at a different time; the decision circuit includes:

A detection module for calculating, according to the multiple sets of data, the number of times the amplitude of the sound wave crosses the reference level within the preset time period, and producing a corresponding count result;

A comparison module for judging, according to the count result, whether the low-frequency sound signal is mixed in the sound signal.

11. The playback circuit of claim 10, wherein, if the count result is less than a preset value, the comparison module judges that the low-frequency sound signal is mixed in the sound signal.

12. The playback circuit of claim 10, wherein, if the count result is greater than a preset value, the comparison module judges that the low-frequency sound signal is not mixed in the sound signal.

13. The playback circuit of claim 10, wherein the frequency band of the low-frequency sound signal is the frequency band of the human voice.

14. The playback circuit of claim 10, wherein the detection module compares, among the multiple sets of data corresponding to the preset time period, one set of data and the next set of data to determine whether one of them is greater than the reference level and the other is less than the reference level; if one of the set of data and the next set of data is greater than the reference level and the other is less than the reference level, the detection module judges that the sound wave crosses the reference level between the set of data and the next set of data.

15. The playback circuit of claim 10, wherein the reference level is a zero level.

16. The playback circuit of claim 10, which can further receive a second sound signal, the second sound signal including multiple sets of data, each set of data representing the amplitude of a second sound wave at a different time; and the decision circuit further includes:

A second detection module for calculating, according to the multiple sets of data in the second sound signal, the number of times the amplitude of the second sound wave crosses the reference level within the preset time period, and producing a corresponding second count result;

And the comparison module judges whether the low-frequency sound signal is mixed in the sound signal according to whether the count result of the sound signal is greater than the second count result.

17. The playback circuit of claim 16, wherein, if the count result is smaller than the second count result by a threshold value, the comparison module judges that the low-frequency sound signal is mixed in the sound signal.

18. The playback circuit of claim 16, further including a loudspeaker for converting the sound signal or the second sound signal into sound waves and playing them according to the judgment result of the comparison module.

19. The playback circuit of claim 10, further including a receiving circuit for producing the sound signal.

20. The playback circuit of claim 19, wherein the receiving circuit can read the sound signal from an optical disc.