CN106601269A - Terminal state determining method and apparatus - Google Patents
Terminal state determining method and apparatus
- Publication number
- CN106601269A (application CN201611233992.8A)
- Authority
- CN
- China
- Prior art keywords
- voice signal
- signal
- terminal
- descending
- threshold value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M9/00—Arrangements for interconnection not involving centralised switching
- H04M9/08—Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Telephone Function (AREA)
Abstract
The invention provides a terminal state determining method and apparatus, belonging to the field of signal processing. The method comprises: playing a downlink sound signal so that an echo signal is formed after the downlink sound signal propagates along a propagation path outside the terminal, wherein the signal frequency of the echo signal is higher than a preset frequency threshold; picking up sound signals within a preset range around the terminal so as to obtain an uplink sound signal; determining whether the downlink sound signal is a noise signal; determining whether the uplink sound signal contains a sound signal lower than the preset frequency threshold; and, if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, determining that the terminal is in a double-talk state. The method determines whether the terminal is in the double-talk state with little influence from the environment outside the terminal, so it can be applied where that environment varies. In addition, the calculation is simple, so the method is widely applicable.
Description
Technical field
The present disclosure relates to the field of signal processing, and more particularly to a terminal state determining method and apparatus.
Background
In hands-free calls such as instant messaging, video conferencing, and IP (Internet Protocol) telephony, the two parties need to exchange voice in real time.
During real-time voice exchange, the sound-producing device of the terminal, such as a loudspeaker, plays the sound signal sent by the communication peer (the far end), that is, the voice of the far-end user. The played sound signal propagates along a propagation path outside the terminal and forms an echo signal, which can be picked up by the sound pickup device of the terminal, such as a microphone; the sound pickup device also picks up the voice of the near-end user. The terminal then sends both the near-end user's sound signal and the echo signal to the communication peer, so the far-end user hears his or her own voice in addition to the voice of the near-end user, which seriously degrades call quality. The phenomenon in which the sound signal played by the sound-producing device propagates along a propagation path outside the terminal, is picked up by the sound pickup device, and is transmitted back to the communication peer is referred to as double-talk. To ensure call quality, the terminal usually needs to perform echo cancellation, that is, to filter the echo signal out of the sound signal picked up by the sound pickup device, and a prerequisite for echo cancellation is correctly determining whether the terminal is in the double-talk state.
In the related art, methods for determining whether a terminal is in the double-talk state include energy comparison methods, such as the Geigel algorithm, and cross-correlation comparison methods, such as the Benesty algorithm. However, energy comparison rests on the assumption that the propagation path outside the terminal is essentially constant, so it is suited only to a fixed environment outside the terminal and its scope of application is narrow. Cross-correlation comparison judges more accurately, but its calculation is complex and computationally heavy, and some terminals cannot bear that load. The double-talk state determining methods in the related art are therefore not widely applicable.
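To illustrate why energy comparison depends on a stable propagation path, the following is a minimal sketch of a Geigel-style detector in its commonly described textbook form; it is not taken from the present disclosure. The function name, the window length, and the threshold of 0.5 are assumptions for illustration. The threshold encodes an assumed minimum attenuation of the echo path, which is exactly the quantity that changes when the environment outside the terminal changes.

```python
def geigel_double_talk(far_end, mic, window=4, threshold=0.5):
    """Flag double-talk per sample (illustrative sketch, not the patented method).

    far_end: samples played by the loudspeaker (downlink).
    mic:     samples picked up by the microphone (uplink), same length.
    A sample is flagged when the microphone level exceeds `threshold`
    times the peak far-end level over the last `window` samples.
    """
    flags = []
    for n, d in enumerate(mic):
        start = max(0, n - window + 1)
        recent_peak = max(abs(x) for x in far_end[start:n + 1])
        flags.append(abs(d) > threshold * recent_peak)
    return flags
```

If the real echo path attenuates less than the threshold assumes, echo alone is flagged as double-talk; if it attenuates much more, weak near-end speech is missed.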
Summary
To overcome the problem in the related art that methods for determining the double-talk state are not widely applicable, the present disclosure provides a terminal state determining method and apparatus.
According to a first aspect of the embodiments of the present disclosure, a terminal state determining method is provided, including:
playing a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
picking up sound signals within a preset range around the terminal to obtain an uplink sound signal;
determining whether the downlink sound signal is a noise signal;
determining whether the uplink sound signal contains a signal lower than the preset frequency threshold; and
if the downlink sound signal is not a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold, determining that the terminal is in a double-talk state.
Optionally, determining whether the downlink sound signal is a noise signal includes:
obtaining the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
comparing the energy of the downlink sound signal with the energy of a noise signal;
comparing the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal; and
determining, according to the comparison results, whether the downlink sound signal is a noise signal.
Optionally, playing the downlink sound signal includes:
playing the downlink sound signal through a sound-producing device of the terminal, the sound-producing device being a device that plays sound signals higher than the preset frequency threshold.
Optionally, the method further includes:
if the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal lower than the preset frequency threshold, determining that the terminal is in a far-end talk state.
Optionally, the method further includes:
if the downlink sound signal is a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold, determining that the terminal is in a near-end talk state.
According to a second aspect of the embodiments of the present disclosure, a terminal state determining apparatus is provided, including:
a playing module, configured to play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
a pickup module, configured to pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
a judging module, configured to determine whether the downlink sound signal is a noise signal;
the judging module being further configured to determine whether the uplink sound signal picked up by the pickup module contains a signal lower than the preset frequency threshold; and
a determining module, configured to determine that the terminal is in a double-talk state when the downlink sound signal is not a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold.
Optionally, the judging module is configured to:
obtain the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
compare the energy of the downlink sound signal with the energy of a noise signal;
compare the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal; and
determine, according to the comparison results, whether the downlink sound signal is a noise signal.
Optionally, the playing module is configured to:
play the downlink sound signal through a sound-producing component of the terminal, the sound-producing component being a component that plays sound signals higher than the preset frequency threshold.
Optionally, the determining module is further configured to:
determine that the terminal is in a far-end talk state when the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal lower than the preset frequency threshold.
Optionally, the determining module is further configured to:
determine that the terminal is in a near-end talk state when the downlink sound signal is a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold.
According to a third aspect of the embodiments of the present disclosure, a terminal state determining apparatus is provided, including:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
determine whether the downlink sound signal is a noise signal;
determine whether the uplink sound signal contains a signal lower than the preset frequency threshold; and
if the downlink sound signal is not a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold, determine that the terminal is in a double-talk state.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
Because the performance of the sound-producing device in the terminal is limited, the signal frequency of the echo signal is higher than the preset frequency threshold. The terminal determines whether the far end is talking by determining whether the downlink sound signal is a noise signal, and determines whether the near end (the local end of the communication) is talking by determining whether the uplink sound signal contains a sound signal lower than the preset frequency threshold. When both the far end and the near end are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal can be determined to be in the double-talk state. The method provided by the present disclosure for determining whether a terminal is in the double-talk state is little affected by the environment outside the terminal and is therefore suited to a variable environment; its calculation is also relatively simple, so it is widely applicable.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the present disclosure.
Fig. 1A is a schematic diagram of a terminal performing voice communication according to an exemplary embodiment.
Fig. 1B is a flowchart of a terminal state determining method according to an exemplary embodiment.
Fig. 2A is a flowchart of a terminal state determining method according to an exemplary embodiment.
Fig. 2B is a schematic diagram of a data link over which a terminal obtains a downlink sound signal according to an exemplary embodiment.
Fig. 2C is a schematic diagram of a propagation path of a downlink sound signal after it is played according to an exemplary embodiment.
Fig. 3 is a block diagram of a terminal state determining apparatus according to an exemplary embodiment.
Fig. 4 is a block diagram of a terminal state determining apparatus according to an exemplary embodiment.
Detailed description
To make the purposes, technical solutions, and advantages of the present disclosure clearer, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The embodiments of the present disclosure provide a terminal state determining method, which is mainly used to determine whether a terminal engaged in voice communication is in the double-talk state. The process by which terminals carry out voice communication is briefly described below.
As shown in Fig. 1A, terminal A and terminal B can carry out voice communication. The sound pickup device in terminal A picks up sound signal 11 within a preset range around terminal A, and terminal A sends the picked-up sound signal 11 to terminal B over a communication network. Terminal B receives sound signal 11 over the communication network and then plays it through its sound-producing device. After being played, sound signal 11 propagates along a propagation path outside terminal B into the preset range around terminal B, forming echo signal 12. At the same time, the user of terminal B may speak; the sound pickup device of terminal B picks up both sound signal 13 uttered by the user of terminal B and echo signal 12, yielding uplink sound signal 14, which is sent to terminal A over the communication network. Because uplink sound signal 14 includes both echo signal 12 and sound signal 13 uttered by the user of terminal B, the user of terminal A hears his or her own voice and the voice of the user of terminal B at the same time. This phenomenon is the double-talk phenomenon, and the state that terminal B is in during it is the double-talk state.
Fig. 1B is a flowchart of a terminal state determining method according to an exemplary embodiment. As shown in Fig. 1B, the terminal state determining method is used in a terminal and includes the following steps:
Step 101: The terminal plays a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold.
Step 102: The terminal picks up sound signals within a preset range around itself to obtain an uplink sound signal.
Step 103: The terminal determines whether the downlink sound signal is a noise signal.
Step 104: The terminal determines whether the uplink sound signal contains a sound signal lower than the preset frequency threshold.
Step 105: If the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal determines that it is in the double-talk state.
In summary, in the terminal state determining method provided by the embodiments of the present disclosure, the limited performance of the sound-producing device in the terminal causes the signal frequency of the echo signal to be higher than the preset frequency threshold. The terminal can therefore determine whether the far end is talking by determining whether the downlink sound signal is a noise signal, and determine whether the near end is talking by determining whether the uplink sound signal contains a sound signal lower than the preset frequency threshold. When both the near end and the far end are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal can be determined to be in the double-talk state. The method provided by this embodiment for determining whether a terminal is in the double-talk state is little affected by the environment outside the terminal and is therefore suited to a variable environment; its calculation is also relatively simple, so it is widely applicable.
Fig. 2A is a flowchart of a terminal state determining method according to an exemplary embodiment. As shown in Fig. 2A, the terminal state determining method is used in a terminal and includes the following steps:
Step 201: The terminal obtains a downlink sound signal.
As shown in Fig. 2B, in the downlink direction, antenna 210 of the terminal receives, over a communication network, the sound signal sent to the terminal. Baseband chip 220 then demodulates the received sound signal, and the demodulated sound signal is the downlink sound signal referred to in the present disclosure.
Step 202: The terminal plays the downlink sound signal through a sound-producing device, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, where the sound-producing device is a device that can play only sound signals higher than a preset frequency threshold, and the signal frequency of the echo signal is higher than the preset frequency threshold.
The terminal may also include an audio codec (coder-decoder) chip. The audio codec chip performs digital-to-analog conversion on the downlink sound signal, converting it into an analog sound signal, which is then passed to the sound-producing device of the terminal to produce sound. The sound-producing device may be an electro-acoustic element such as a loudspeaker or an earphone, which the present disclosure does not specifically limit.
After the downlink sound signal is played through the sound-producing device, it propagates along at least one propagation path outside the terminal and finally reaches the preset range around the terminal, forming an echo signal. In mathematical terms: if the downlink sound signal is x(n) and the propagation path is h, then the echo signal is y(n) = h x(n). As shown in the top view of Fig. 2C, terminal 10 is located indoors. After being played, the downlink sound signal propagates along the path shown by the dotted line in Fig. 2C and finally reaches the preset range around terminal 10, forming the echo signal. In Fig. 2C, the propagation path is one in which the downlink sound signal is reflected by wall B, then reflected by wall C, and finally reaches the preset range around terminal 10.
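The relation y(n) = h x(n) can be made concrete under one assumption that the source does not state: that h is the impulse response of the propagation path and that it acts on x(n) by discrete convolution, as is usual for acoustic echo paths. The following sketch uses hypothetical names and a made-up three-tap path purely for illustration.

```python
def simulate_echo(x, h):
    """Echo y(n) = sum_k h[k] * x[n - k], truncated to len(x).

    Assumption: h is read as the echo-path impulse response.
    x: downlink sound signal samples; h: path taps (delay and attenuation).
    """
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, tap in enumerate(h):
            if n - k >= 0:
                acc += tap * x[n - k]
        y.append(acc)
    return y


# A single click played by the loudspeaker returns delayed and attenuated,
# e.g. once per wall reflection in this made-up path.
echo = simulate_echo([1.0, 0.0, 0.0, 0.0], [0.0, 0.5, 0.25])
```

Under this reading, a silent downlink signal yields a silent echo, and any frequency absent from x(n) is also absent from y(n), which is the property the method relies on.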
It should be noted that when the far end is talking, that is, when the downlink sound signal includes the sound signal of the far-end user, the echo signal also includes the sound signal of the far-end user. When the far end is silent, that is, when the downlink sound signal does not include the sound signal of the far-end user, the downlink sound signal is a noise signal and, likewise, the echo signal is also a noise signal.
It should also be noted that, in practice, hardware cost constraints often limit the performance of the sound-producing device in a terminal, so that it cannot play sound signals lower than the preset frequency threshold; that is, it can play only sound signals higher than the preset frequency threshold. In one embodiment of the present disclosure, the preset frequency threshold may be 400 Hz. Because the sound-producing device cannot play sound signals lower than the preset frequency threshold, the signal frequency of the echo signal is higher than the preset frequency threshold.
Step 203: The terminal picks up, through a sound pickup device, sound signals within the preset range around itself to obtain an uplink sound signal.
The sound pickup device in the terminal picks up sound signals within the preset range around the terminal. The picked-up sound signal is an analog sound signal; the audio codec chip in the terminal performs analog-to-digital conversion on it, converting it into a digital sound signal, which is the uplink sound signal mentioned above. The sound pickup device may be an electro-acoustic element such as a microphone.
In practice, after the downlink sound signal is played by the sound-producing device, it propagates along a propagation path outside the terminal and forms an echo signal within the preset range around the terminal. The preset range around the terminal can therefore contain a near-end sound signal and/or the echo signal; that is, the uplink sound signal can include a near-end sound signal and/or the echo signal. Here, the near-end sound signal refers to the sound signals in the environment around the terminal other than the echo signal. When the uplink sound signal includes a near-end sound signal and the echo signal is a noise signal, the far end is not talking and the near end is talking, and the terminal is in the near-end talk state. When the uplink sound signal does not include a near-end sound signal and the echo signal is not a noise signal, the far end is talking and the near end is not, and the terminal is in the far-end talk state. When the uplink sound signal includes a near-end sound signal and the echo signal is not a noise signal, both the far end and the near end are talking, and the terminal is in the double-talk state.
Step 204: The terminal determines whether the downlink sound signal is a noise signal.
As described above, to determine whether it is in the double-talk state, the terminal needs to determine whether the echo signal is a noise signal. Because the echo signal is correlated with the downlink sound signal, the terminal can determine whether the echo signal is a noise signal by determining whether the downlink sound signal is a noise signal. In practice, the terminal can determine whether the downlink sound signal is a noise signal as follows:
The terminal obtains the energy and the zero-crossing rate of the downlink sound signal. It then compares the energy of the downlink sound signal with the energy of a noise signal, and compares the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal. Finally, based on the comparison results, the terminal determines whether the downlink sound signal is a noise signal.
It should be noted that the energy is a measure of the strength of a sound signal, and that the zero-crossing rate, also called the short-time zero-crossing rate, is the number of times per second that the signal value crosses zero. In practice, the energy of a noise signal is generally low, that is, lower than a preset energy threshold, and the zero-crossing rate of a noise signal is also generally low, that is, lower than a preset zero-crossing rate threshold. Therefore, if the energy of the downlink sound signal is lower than the preset energy threshold and the zero-crossing rate of the downlink sound signal is lower than the preset zero-crossing rate threshold, the downlink sound signal is a noise signal.
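The noise test described above can be sketched as follows. This is a minimal illustration that follows the rule exactly as stated (energy and zero-crossing rate both below their thresholds); the function names, the per-frame processing, the use of mean squared amplitude as the energy measure, and the threshold values in the usage lines are assumptions, since the source does not specify them.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame (assumed energy measure)."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame, sample_rate):
    """Zero crossings per second: sign changes between adjacent samples."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings * sample_rate / len(frame)


def is_noise(frame, sample_rate, energy_threshold, zcr_threshold):
    """Noise if both energy and zero-crossing rate are below their thresholds."""
    return (
        short_time_energy(frame) < energy_threshold
        and zero_crossing_rate(frame, sample_rate) < zcr_threshold
    )


# Hypothetical thresholds, chosen only for this example.
quiet = [0.001, 0.002, 0.001, 0.002, 0.001, 0.002, 0.001, 0.002]
loud = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
quiet_is_noise = is_noise(quiet, 8000, energy_threshold=1e-3, zcr_threshold=1000.0)
loud_is_noise = is_noise(loud, 8000, energy_threshold=1e-3, zcr_threshold=1000.0)
```

In a real terminal the thresholds would need to be calibrated against the actual noise floor of the downlink path.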
Of course, other methods for determining whether the downlink sound signal is a noise signal exist in practice, and the present disclosure does not enumerate them here.
Step 205: The terminal determines whether the uplink sound signal contains a sound signal lower than the preset frequency threshold.
As described above, to determine whether it is in the double-talk state, the terminal needs to determine not only whether the echo signal is a noise signal but also whether the uplink sound signal includes a near-end sound signal. As described above, the echo signal is a sound signal whose signal frequency is higher than the preset frequency threshold, so if the uplink sound signal contains a sound signal lower than the preset frequency threshold, the uplink sound signal includes a near-end sound signal.
In practice, the terminal can apply a Fourier transform to the uplink sound signal to obtain its spectrum information. From the spectrum information, the terminal can determine the amplitude of the uplink sound signal in the frequency band below the preset frequency threshold. If this amplitude is lower than a preset amplitude threshold, the uplink sound signal does not contain a sound signal below the preset frequency threshold; if this amplitude is higher than the preset amplitude threshold, the uplink sound signal contains a sound signal below the preset frequency threshold.
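The spectrum check above can be sketched as follows. This is an illustrative sketch under stated assumptions: it uses a naive discrete Fourier transform in plain Python to stay self-contained (a real terminal would more likely use an FFT), it takes the peak normalized bin magnitude as "the amplitude" in the low band, it skips the DC bin, and the function names and parameter values are hypothetical.

```python
import cmath
import math


def low_band_amplitude(samples, sample_rate, freq_threshold):
    """Peak spectral magnitude among DFT bins below freq_threshold (Hz)."""
    n = len(samples)
    peak = 0.0
    # Start at bin 1 to skip the DC offset (an assumption of this sketch);
    # bins above n // 2 mirror the lower ones for real-valued input.
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        if freq >= freq_threshold:
            break
        # Naive DFT of bin k: O(n) per bin, acceptable for a short frame.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        # Scale so a full-scale, bin-aligned sinusoid has magnitude near 1.0.
        peak = max(peak, 2.0 * abs(coeff) / n)
    return peak


def contains_low_frequency(samples, sample_rate, freq_threshold, amp_threshold):
    """True if the band below freq_threshold exceeds amp_threshold."""
    return low_band_amplitude(samples, sample_rate, freq_threshold) > amp_threshold
```

Taking the peak rather than, say, the band average is a design choice of this sketch; the text only requires some amplitude measure for the band below the preset frequency threshold.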
Step 206: if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, the terminal determines that it is in the double-talk state.
Based on the above steps, the terminal can determine whether the downlink sound signal is a noise signal and whether the uplink sound signal contains a sound signal below the preset frequency threshold. When the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, the terminal determines that it is in the double-talk state. Likewise, when the downlink sound signal is not a noise signal and the uplink sound signal does not contain a sound signal below the preset frequency threshold, the terminal can determine that it is in the far-end talk state; when the downlink sound signal is a noise signal and the uplink sound signal contains a signal below the preset frequency threshold, the terminal can determine that it is in the near-end talk state.
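The three cases above amount to a small decision table, which can be sketched as follows. The enum and function names are hypothetical, and the fourth case (downlink is noise and the uplink has no low-frequency content) is not named in the text, so the `UNDETERMINED` label is an assumption of this sketch.

```python
from enum import Enum


class TerminalState(Enum):
    DOUBLE_TALK = "double talk"
    FAR_END_TALK = "far-end talk"
    NEAR_END_TALK = "near-end talk"
    UNDETERMINED = "undetermined"  # assumption: case not covered by the text


def determine_state(downlink_is_noise, uplink_has_low_frequency):
    """Map the two judgments from the preceding steps to a terminal state."""
    if not downlink_is_noise and uplink_has_low_frequency:
        return TerminalState.DOUBLE_TALK
    if not downlink_is_noise:
        return TerminalState.FAR_END_TALK
    if uplink_has_low_frequency:
        return TerminalState.NEAR_END_TALK
    return TerminalState.UNDETERMINED
```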
It should be noted that the order of the steps of the terminal state determining method provided by the embodiments of the present disclosure can be adjusted appropriately, and steps can be added or removed as circumstances require. Any variation of the method that a person skilled in the art can readily conceive within the technical scope disclosed herein shall fall within the protection scope of the present disclosure, and is therefore not described further.
In summary, in the terminal state determining method provided by the embodiments of the present disclosure, the performance limits of the terminal's sound-producing device make the signal frequency of the echo signal higher than the preset frequency threshold. The terminal can therefore determine whether the far end is talking by judging whether the downlink sound signal is a noise signal, and determine whether the near end is talking by judging whether the uplink sound signal contains a sound signal below the preset frequency threshold. When both the near end and the far end are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, the terminal can be determined to be in the double-talk state. The method provided by this embodiment for determining whether the terminal is in the double-talk state is little affected by the environment outside the terminal and is therefore suitable for changing external environments; its computation is also relatively simple, so it is highly versatile.
Fig. 3 is a block diagram of a terminal state determining device 300 according to an exemplary embodiment. Referring to Fig. 3, the device includes a playing module 301, a pickup module 302, a judging module 303, and a determining module 304.
The playing module 301 is configured to play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold.
The pickup module 302 is configured to pick up sound signals within a preset range around the terminal to obtain an uplink sound signal.
The judging module 303 is configured to judge whether the downlink sound signal is a noise signal.
The judging module 303 is further configured to judge whether the uplink sound signal picked up by the pickup module 302 contains a signal below the preset frequency threshold.
The determining module 304 is configured to determine that the terminal is in the double-talk state when the downlink sound signal is not a noise signal and the uplink sound signal picked up by the pickup module 302 contains a signal below the preset frequency threshold.
Optionally, the judging module 303 is configured to: obtain the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal; compare the energy of the downlink sound signal with the energy of a noise signal; compare the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal; and judge, according to the comparison results, whether the downlink sound signal is a noise signal.
Optionally, the playing module 301 is configured to play the downlink sound signal through a sound-producing component of the terminal, the sound-producing component being a component that can only play sound signals above the preset frequency threshold.
Optionally, the determining module 304 is further configured to determine that the terminal is in the far-end talk state when the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal below the preset frequency threshold.
Optionally, the determining module 304 is further configured to determine that the terminal is in the near-end talk state when the downlink sound signal is a noise signal and the uplink sound signal contains a signal below the preset frequency threshold.
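How the judging and determining modules might be wired together can be sketched as follows. This is a hypothetical composition, not the disclosed device: the class names are invented for illustration, the two judgments are injected as callables, playback and pickup are hardware operations and are represented only by the frames passed in, and the "undetermined" result is an assumption for the case the text leaves unnamed.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Frame = Sequence[float]


@dataclass
class JudgingModule:
    """Stand-in for judging module 303; both checks are injected callables."""
    is_noise: Callable[[Frame], bool]
    has_low_frequency: Callable[[Frame], bool]


class DeterminingModule:
    """Stand-in for determining module 304."""

    def determine(self, downlink_is_noise: bool, uplink_has_low: bool) -> str:
        if not downlink_is_noise and uplink_has_low:
            return "double talk"
        if not downlink_is_noise:
            return "far-end talk"
        if uplink_has_low:
            return "near-end talk"
        # Assumption: the text does not name this remaining case.
        return "undetermined"


@dataclass
class TerminalStateDevice:
    judging: JudgingModule
    determining: DeterminingModule

    def process(self, downlink: Frame, uplink: Frame) -> str:
        """downlink: frame handed to playing module 301 for playback;
        uplink: frame returned by pickup module 302."""
        downlink_is_noise = self.judging.is_noise(downlink)
        uplink_has_low = self.judging.has_low_frequency(uplink)
        return self.determining.determine(downlink_is_noise, uplink_has_low)
```

Injecting the two checks keeps the wiring independent of how noise detection and the low-frequency test are actually implemented.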
In summary, in the terminal state determining device provided by the embodiments of the present disclosure, the performance limits of the sound-producing device in the playing module make the signal frequency of the echo signal higher than the preset frequency threshold. The judging module can therefore determine whether the far end is talking by judging whether the downlink sound signal is a noise signal, and determine whether the near end is talking by judging whether the uplink sound signal contains a sound signal below the preset frequency threshold. When both the near end and the far end are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, the determining module can determine that the terminal is in the double-talk state. The determination made by the device provided by this embodiment is little affected by the environment outside the terminal and is therefore suitable for changing external environments; its computation is also relatively simple, so it is highly versatile.
With regard to the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the related method and will not be elaborated here.
Fig. 4 is a block diagram of a terminal state determining device 400 according to an exemplary embodiment. For example, the device 400 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, or the like.
Referring to Fig. 4, the device 400 may include one or more of the following components: a processing component 402, a memory 404, a power component 406, a multimedia component 408, an audio component 410, an input/output (I/O) interface 412, a sensor component 414, and a communication component 416.
The processing component 402 generally controls the overall operation of the device 400, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 402 may include one or more processors 420 to execute instructions so as to complete all or part of the steps of the above method. In addition, the processing component 402 may include one or more modules that facilitate interaction between the processing component 402 and other components. For example, the processing component 402 may include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.
The memory 404 is configured to store various types of data to support operation of the device 400. Examples of such data include instructions for any application or method operated on the device 400, contact data, phone book data, messages, pictures, videos, and so on. The memory 404 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power component 406 provides power to the various components of the device 400. The power component 406 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 400.
The multimedia component 408 includes a screen that provides an output interface between the device 400 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 408 includes a front camera and/or a rear camera. When the device 400 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a microphone (MIC), which is configured to receive external audio signals when the device 400 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signals may be further stored in the memory 404 or sent via the communication component 416. In some embodiments, the audio component 410 also includes a loudspeaker for outputting audio signals.
The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 414 includes one or more sensors for providing state assessments of various aspects of the device 400. For example, the sensor component 414 can detect the open/closed state of the device 400 and the relative positioning of components, such as the display and keypad of the device 400. The sensor component 414 can also detect a change in position of the device 400 or of one of its components, the presence or absence of user contact with the device 400, the orientation or acceleration/deceleration of the device 400, and a change in temperature of the device 400. The sensor component 414 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 414 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 414 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 416 is configured to facilitate wired or wireless communication between the device 400 and other equipment. The device 400 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 416 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 416 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 404 including instructions, which can be executed by the processor 420 of the device 400 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided. When the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform the following method: playing a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold; picking up sound signals within a preset range around the terminal to obtain an uplink sound signal; judging whether the downlink sound signal is a noise signal; judging whether the uplink sound signal contains a sound signal below the preset frequency threshold; and, if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, determining that the terminal is in the double-talk state.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (11)
1. A terminal state determining method, characterized in that the method comprises:
playing a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside a terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
picking up sound signals within a preset range around the terminal to obtain an uplink sound signal;
judging whether the downlink sound signal is a noise signal;
judging whether the uplink sound signal contains a sound signal below the preset frequency threshold; and
if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, determining that the terminal is in a double-talk state.
2. The method according to claim 1, characterized in that judging whether the downlink sound signal is a noise signal comprises:
obtaining the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
comparing the energy of the downlink sound signal with the energy of a noise signal;
comparing the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal; and
judging, according to the comparison results, whether the downlink sound signal is a noise signal.
3. The method according to claim 1, characterized in that playing the downlink sound signal comprises:
playing the downlink sound signal through a sound-producing device of the terminal, the sound-producing device being a device that can only play sound signals above the preset frequency threshold.
4. The method according to claim 1, characterized in that the method further comprises:
if the downlink sound signal is not a noise signal and the uplink sound signal does not contain a sound signal below the preset frequency threshold, determining that the terminal is in a far-end talk state.
5. The method according to claim 1, characterized in that the method further comprises:
if the downlink sound signal is a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, determining that the terminal is in a near-end talk state.
6. A terminal state determining device, characterized in that the device comprises:
a playing module, configured to play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside a terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
a pickup module, configured to pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
a judging module, configured to judge whether the downlink sound signal is a noise signal;
the judging module being further configured to judge whether the uplink sound signal picked up by the pickup module contains a signal below the preset frequency threshold; and
a determining module, configured to determine that the terminal is in a double-talk state when the downlink sound signal is not a noise signal and the uplink sound signal picked up by the pickup module contains a signal below the preset frequency threshold.
7. The device according to claim 6, characterized in that the judging module is configured to:
obtain the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
compare the energy of the downlink sound signal with the energy of a noise signal;
compare the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal; and
judge, according to the comparison results, whether the downlink sound signal is a noise signal.
8. The device according to claim 6, characterized in that the playing module is configured to:
play the downlink sound signal through a sound-producing component of the terminal, the sound-producing component being a component that can only play sound signals above the preset frequency threshold.
9. The device according to claim 6, characterized in that the determining module is further configured to:
determine that the terminal is in a far-end talk state when the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal below the preset frequency threshold.
10. The device according to claim 6, characterized in that the determining module is further configured to:
determine that the terminal is in a near-end talk state when the downlink sound signal is a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold.
11. A terminal state determining device, characterized in that the device comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside a terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
judge whether the downlink sound signal is a noise signal;
judge whether the uplink sound signal contains a signal below the preset frequency threshold; and
if the downlink sound signal is not a noise signal and the uplink sound signal contains a signal below the preset frequency threshold, determine that the terminal is in a double-talk state.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611233992.8A CN106601269A (en) | 2016-12-28 | 2016-12-28 | Terminal state determining method and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611233992.8A CN106601269A (en) | 2016-12-28 | 2016-12-28 | Terminal state determining method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106601269A true CN106601269A (en) | 2017-04-26 |
Family
ID=58602843
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611233992.8A Pending CN106601269A (en) | 2016-12-28 | 2016-12-28 | Terminal state determining method and apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106601269A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114571473A (en) * | 2020-12-01 | 2022-06-03 | 北京小米移动软件有限公司 | Control method and device for foot type robot and foot type robot |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1486566A (en) * | 2000-09-15 | 2004-03-31 | 英特尔公司 | Residual echo estimation for echo cancellation |
CN101067927A (en) * | 2007-04-19 | 2007-11-07 | 北京中星微电子有限公司 | Sound volume adjusting method and device |
WO2010083641A1 (en) * | 2009-01-20 | 2010-07-29 | 华为技术有限公司 | Method and apparatus for detecting double talk |
CN104157290A (en) * | 2014-08-19 | 2014-11-19 | 大连理工大学 | Speaker recognition method based on depth learning |
CN105427868A (en) * | 2015-10-30 | 2016-03-23 | 杭州乐哈思智能科技有限公司 | Method for eliminating noise of VOIP system bidirectional duplex hand-free voice |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108509232A (en) | Screen recording method, device and computer readable storage medium | |
CN106161781A (en) | Method for regulation of sound volume and device | |
CN104991754A (en) | Recording method and apparatus | |
CN104104771A (en) | Conversation processing method and device | |
CN106157952B (en) | Sound identification method and device | |
CN106791245A (en) | Determine the method and device of filter coefficient | |
CN104935729B (en) | Audio-frequency inputting method and device | |
CN109087650A (en) | voice awakening method and device | |
CN107833579A (en) | Noise cancellation method, device and computer-readable recording medium | |
CN108076199A (en) | The air-tightness detection method and device of microphone | |
CN108200267A (en) | A kind of terminal control method, terminal and computer readable storage medium | |
CN106888327A (en) | Speech playing method and device | |
CN104702756A (en) | Detecting method and detecting device for soundless call | |
CN105744210B (en) | Echo cancel method, the apparatus and system of video conference | |
CN106601269A (en) | Terminal state determining method and apparatus | |
CN108206884A (en) | Terminal, the method for adjustment of terminal transmission signal of communication and electronic equipment | |
CN103973883B (en) | A kind of method and device controlling voice-input device | |
CN109862171A (en) | Terminal equipment control method and device | |
CN106683683A (en) | Terminal state determining method and device | |
US11388281B2 (en) | Adaptive method and apparatus for intelligent terminal, and terminal | |
CN112217948B (en) | Echo processing method, device, equipment and storage medium for voice call | |
CN111694539B (en) | Method, device and medium for switching between earphone and loudspeaker | |
CN106507023A (en) | The method and device processed by audio frequency and video request | |
CN107682101A (en) | Noise detecting method, device and electronic equipment | |
CN107172557B (en) | Method and device for detecting polarity of loudspeaker and receiver |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170426 |