CN106601269A - Terminal state determining method and apparatus - Google Patents

Terminal state determining method and apparatus

Info

Publication number
CN106601269A
Authority
CN
China
Prior art keywords
voice signal
signal
terminal
descending
threshold value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611233992.8A
Other languages
Chinese (zh)
Inventor
周瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201611233992.8A
Publication of CN106601269A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M9/00 Arrangements for interconnection not involving centralised switching
    • H04M9/08 Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Telephone Function (AREA)

Abstract

The invention provides a terminal state determining method and apparatus and belongs to the field of signal processing. The method comprises the following steps: playing a downlink sound signal so that an echo signal is formed after the downlink sound signal propagates along a propagation path outside the terminal, wherein the signal frequency of the echo signal is higher than a preset frequency threshold; picking up sound signals within a preset range around the terminal so as to obtain an uplink sound signal; determining whether the downlink sound signal is a noise signal; determining whether the uplink sound signal contains a sound signal lower than the preset frequency threshold; and, if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, determining that the terminal is in a double-talk state. The method can determine whether the terminal is in the double-talk state and is only slightly affected by the environment outside the terminal, so it can be applied where that environment varies. In addition, the calculation is simple, so the method is widely applicable.

Description

Terminal state determining method and apparatus
Technical field
The present disclosure relates to the field of signal processing, and more particularly to a terminal state determining method and apparatus.
Background
Hands-free calls such as instant messaging, video conferencing, and IP (Internet Protocol) telephony require real-time voice exchange between the two parties.
During real-time voice exchange, the sound-emitting device of the terminal, such as a loudspeaker, plays the sound signal sent by the far end (the remote party), that is, the voice of the far-end user. The sound signal played by the sound-emitting device propagates along a propagation path outside the terminal and forms an echo signal, which can be picked up by the sound pickup device of the terminal, such as a microphone. The sound pickup device also picks up the voice of the local user. The terminal then sends both the local user's voice signal and the echo signal to the far end, so the far-end user hears their own voice in addition to the local user's voice, which seriously degrades call quality. The phenomenon in which a sound signal played by the sound-emitting device propagates along a path outside the terminal, is picked up by the sound pickup device, and is transmitted back to the far end is referred to as double-talk. To ensure call quality, a terminal generally needs to perform echo cancellation, that is, to filter the echo signal out of the sound signal picked up by the sound pickup device. A prerequisite for echo cancellation is correctly determining whether the terminal is in the double-talk state.
In the related art, methods for determining whether a terminal is in the double-talk state include energy comparison methods, such as the Geigel algorithm, and cross-correlation comparison methods, such as the Benesty algorithm. However, energy comparison rests on the assumption that the propagation path outside the terminal is essentially constant, so it suits only a fixed external environment and its range of application is narrow. Cross-correlation comparison is more accurate, but its computation is complex and heavy, and some terminals cannot bear that load. The double-talk determination methods in the related art are therefore of limited general applicability.
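For orientation, the energy-comparison approach mentioned above can be sketched as follows. This is a minimal illustration of the commonly described Geigel rule (flag double-talk when the near-end magnitude exceeds a fraction of the recent far-end peak), not code from the patent. The function name, the window length, and the 0.5 threshold are assumptions chosen for the example.

```python
from collections import deque


def geigel_double_talk(far_end, near_end, window=4, threshold=0.5):
    """Flag double-talk per sample with a Geigel-style magnitude comparison.

    far_end   -- downlink samples x(n)
    near_end  -- uplink (microphone) samples y(n)
    window    -- number of recent far-end samples compared (assumed value)
    threshold -- fraction of the far-end peak (0.5 is a common textbook choice)
    """
    recent = deque(maxlen=window)  # magnitudes of the most recent far-end samples
    flags = []
    for x, y in zip(far_end, near_end):
        recent.append(abs(x))
        # Double-talk is declared when the microphone signal is louder than
        # the echo that the far-end signal alone could plausibly produce.
        flags.append(abs(y) > threshold * max(recent))
    return flags
```

The fixed threshold encodes an assumed echo-path attenuation, which is why this family of detectors depends on a stable propagation path, the limitation the background section points out.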
Summary
To overcome the limited general applicability of the double-talk determination methods in the related art, the present disclosure provides a terminal state determining method and apparatus.
According to a first aspect of the embodiments of the present disclosure, a terminal state determining method is provided, including:
playing a downlink sound signal so that an echo signal is formed after the downlink sound signal propagates along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
picking up sound signals within a preset range around the terminal to obtain an uplink sound signal;
judging whether the downlink sound signal is a noise signal;
judging whether the uplink sound signal contains a signal lower than the preset frequency threshold;
if the downlink sound signal is not a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold, determining that the terminal is in the double-talk state.
Optionally, judging whether the downlink sound signal is a noise signal includes:
obtaining the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
comparing the energy of the downlink sound signal with the energy of a noise signal;
comparing the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal;
judging, according to the comparison results, whether the downlink sound signal is a noise signal.
Optionally, playing the downlink sound signal includes:
playing the downlink sound signal through the sound-emitting device of the terminal, the sound-emitting device being a device that plays sound signals higher than the preset frequency threshold.
Optionally, the method further includes:
if the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal lower than the preset frequency threshold, determining that the terminal is in the far-end talking state.
Optionally, the method further includes:
if the downlink sound signal is a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold, determining that the terminal is in the near-end talking state.
According to a second aspect of the embodiments of the present disclosure, a terminal state determining apparatus is provided, including:
a playing module, configured to play a downlink sound signal so that an echo signal is formed after the downlink sound signal propagates along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
a pickup module, configured to pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
a judging module, configured to judge whether the downlink sound signal is a noise signal;
the judging module being further configured to judge whether the uplink sound signal picked up by the pickup module contains a signal lower than the preset frequency threshold;
a determining module, configured to determine that the terminal is in the double-talk state when the downlink sound signal is not a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold.
Optionally, the judging module is configured to:
obtain the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
compare the energy of the downlink sound signal with the energy of a noise signal;
compare the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal;
judge, according to the comparison results, whether the downlink sound signal is a noise signal.
Optionally, the playing module is configured to:
play the downlink sound signal through the sound-emitting component of the terminal, the sound-emitting component being a component that plays sound signals higher than the preset frequency threshold.
Optionally, the determining module is further configured to:
determine that the terminal is in the far-end talking state when the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal lower than the preset frequency threshold.
Optionally, the determining module is further configured to:
determine that the terminal is in the near-end talking state when the downlink sound signal is a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold.
According to a third aspect of the embodiments of the present disclosure, a terminal state determining apparatus is provided, including:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
play a downlink sound signal so that an echo signal is formed after the downlink sound signal propagates along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
judge whether the downlink sound signal is a noise signal;
judge whether the uplink sound signal contains a signal lower than the preset frequency threshold;
if the downlink sound signal is not a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold, determine that the terminal is in the double-talk state.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
Because of the performance limits of the sound-emitting device in the terminal, the signal frequency of the echo signal is higher than the preset frequency threshold. The terminal determines whether the far end is talking by judging whether the downlink sound signal is a noise signal, and determines whether the near end (the local party) is talking by judging whether the uplink sound signal contains a sound signal lower than the preset frequency threshold. When both the far end and the near end are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal can be determined to be in the double-talk state. The method provided by the present disclosure is only slightly affected by the environment outside the terminal, so it is suitable for a changing external environment, and its calculation is relatively simple, so it is widely applicable.
It should be understood that the general description above and the detailed description below are only exemplary and explanatory and do not limit the present disclosure.
Description of the drawings
The accompanying drawings are incorporated into and constitute a part of this specification, show embodiments consistent with the present disclosure, and together with the specification serve to explain the principles of the present disclosure.
Fig. 1A is a schematic diagram of terminals carrying out voice communication according to an exemplary embodiment.
Fig. 1B is a flow chart of a terminal state determining method according to an exemplary embodiment.
Fig. 2A is a flow chart of a terminal state determining method according to an exemplary embodiment.
Fig. 2B is a schematic diagram of the data link through which a terminal obtains a downlink sound signal according to an exemplary embodiment.
Fig. 2C is a schematic diagram of the transmission path of a downlink sound signal after it is played according to an exemplary embodiment.
Fig. 3 is a block diagram of a terminal state determining apparatus according to an exemplary embodiment.
Fig. 4 is a block diagram of a terminal state determining apparatus according to an exemplary embodiment.
Detailed description
To make the purpose, technical solutions, and advantages of the present disclosure clearer, the embodiments of the present disclosure are described in further detail below with reference to the accompanying drawings.
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The embodiments of the present disclosure provide a terminal state determining method, which is mainly used to determine whether a terminal carrying out voice communication is in the double-talk state. The process by which a terminal carries out voice communication is briefly described below.
As shown in Fig. 1A, terminal A and terminal B can carry out voice communication. The sound pickup device in terminal A picks up the sound signal 11 within a preset range around terminal A, and terminal A sends the picked-up sound signal 11 to terminal B over a communication network. Terminal B receives the sound signal 11 over the communication network and plays it through its sound-emitting device. After being played, the sound signal 11 propagates along a propagation path outside terminal B into the preset range around terminal B and forms the echo signal 12. Meanwhile, the user of terminal B may speak. The sound pickup device of terminal B picks up both the sound signal 13 uttered by the user of terminal B and the echo signal 12, yielding the uplink sound signal 14, which is sent to terminal A over the communication network. Because the uplink sound signal 14 includes both the echo signal 12 and the sound signal 13 uttered by the user of terminal B, the user of terminal A hears their own voice and the voice of the user of terminal B at the same time. This phenomenon is double-talk, and the state terminal B is in during this phenomenon is the double-talk state.
Fig. 1B is a flow chart of a terminal state determining method according to an exemplary embodiment. As shown in Fig. 1B, the method is applied in a terminal and includes the following steps:
Step 101: the terminal plays a downlink sound signal so that an echo signal is formed after the downlink sound signal propagates along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold.
Step 102: the terminal picks up sound signals within a preset range around itself to obtain an uplink sound signal.
Step 103: the terminal judges whether the downlink sound signal is a noise signal.
Step 104: the terminal judges whether the uplink sound signal contains a sound signal lower than the preset frequency threshold.
Step 105: if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal determines that it is in the double-talk state.
In summary, in the terminal state determining method provided by the embodiments of the present disclosure, the performance limits of the sound-emitting device in the terminal keep the signal frequency of the echo signal above the preset frequency threshold. The terminal can therefore determine whether the far end is talking by judging whether the downlink sound signal is a noise signal, and determine whether the near end is talking by judging whether the uplink sound signal contains a sound signal lower than the preset frequency threshold. When both ends are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal can be determined to be in the double-talk state. The method provided by this embodiment is only slightly affected by the environment outside the terminal, so it is suitable for a changing external environment, and its calculation is relatively simple, so it is widely applicable.
Fig. 2A is a flow chart of a terminal state determining method according to an exemplary embodiment. As shown in Fig. 2A, the method is applied in a terminal and includes the following steps:
Step 201: the terminal obtains a downlink sound signal.
As shown in Fig. 2B, on the downlink the antenna 210 of the terminal receives the sound signal sent to the terminal over the communication network, and the baseband chip 220 then demodulates the received signal. The demodulated sound signal is the downlink sound signal referred to in the present disclosure.
Step 202: the terminal plays the downlink sound signal through a sound-emitting device so that an echo signal is formed after the downlink sound signal propagates along a propagation path outside the terminal, wherein the sound-emitting device can only play sound signals higher than a preset frequency threshold and the signal frequency of the echo signal is higher than the preset frequency threshold.
The terminal may also include an audio codec (coder-decoder) chip. The audio codec chip performs digital-to-analog conversion on the downlink sound signal, and the resulting analog sound signal is passed to the sound-emitting device of the terminal to produce sound. The sound-emitting device may be an electro-acoustic element such as a loudspeaker or an earphone; the present disclosure does not specifically limit this.
After the downlink sound signal is played through the sound-emitting device, it propagates along at least one propagation path outside the terminal and finally reaches the preset range around the terminal, forming the echo signal. In mathematical terms: if the downlink sound signal is x(n) and the propagation path is h, then the echo signal is y(n) = h * x(n), with * denoting convolution of the signal with the path response. As shown in the top view of Fig. 2C, the terminal 10 is indoors. After being played, the downlink sound signal propagates along the path shown by the dotted line in Fig. 2C and finally reaches the preset range around the terminal 10, forming the echo signal. In Fig. 2C, the propagation path is one in which the downlink sound signal is reflected by wall B, then reflected by wall C, and finally reaches the preset range around the terminal 10.
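The relation between the downlink signal x(n), the propagation path h, and the echo y(n) can be illustrated with a short sketch. It assumes that h is a finite impulse response and that the product in the formula stands for discrete convolution, which is the usual echo-path model but an interpretation of the text. The function name and the example path coefficients are hypothetical.

```python
def echo_signal(x, h):
    """Return y(n) = sum_k h(k) * x(n - k), the full discrete convolution.

    x -- downlink samples
    h -- impulse response of the propagation path (assumed to be finite)
    """
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            # Each path tap delays the input by k samples and scales it by h(k).
            y[n + k] += hk * xn
    return y


# Hypothetical two-reflection path: the first reflection arrives one sample
# late at half amplitude, the second two samples late at quarter amplitude.
echo = echo_signal([1.0, 0.0, 0.0], [0.0, 0.5, 0.25])
```

In a real room the impulse response is much longer and changes whenever the terminal or nearby objects move, which is the variability the method is designed to tolerate.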
It should be noted that when the far end is talking, that is, when the downlink sound signal includes the far-end user's voice, the echo signal also includes the far-end user's voice. When the far end is silent, that is, when the downlink sound signal does not include the far-end user's voice, the downlink sound signal is a noise signal and, likewise, so is the echo signal.
It should also be noted that in practice, because of hardware cost constraints, the performance of the sound-emitting device in a terminal is often limited. As a result, it cannot play sound signals lower than the preset frequency threshold and can only play sound signals higher than that threshold. In one embodiment of the present disclosure, the preset frequency threshold may be 400 Hz. Because the sound-emitting device cannot play sound signals lower than the preset frequency threshold, the signal frequency of the echo signal is higher than the preset frequency threshold.
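To see why a limited loudspeaker keeps the echo above the threshold, the device can be modelled very roughly as a high-pass filter. The sketch below uses a first-order RC-style high-pass with a 400 Hz cutoff. This model, the function name, and the sample rate in the example are assumptions for illustration only; a real loudspeaker response is steeper and more irregular.

```python
import math


def high_pass(x, sample_rate, cutoff_hz=400.0):
    """First-order high-pass filter: y[i] = a * (y[i-1] + x[i] - x[i-1])."""
    if not x:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant for the cutoff
    dt = 1.0 / sample_rate                  # sampling interval
    a = rc / (rc + dt)                      # smoothing coefficient, 0 < a < 1
    y = [x[0]]
    for i in range(1, len(x)):
        y.append(a * (y[-1] + x[i] - x[i - 1]))
    return y


# A constant (0 Hz) input decays towards zero at the output,
# showing that content far below the cutoff is suppressed.
played = high_pass([1.0] * 2000, sample_rate=8000)
```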
Step 203: the terminal picks up, through a sound pickup device, sound signals within a preset range around itself to obtain an uplink sound signal.
The sound pickup device in the terminal picks up sound signals within the preset range around the terminal. The picked-up signal is an analog sound signal. The audio codec chip in the terminal performs analog-to-digital conversion on it to produce a digital sound signal, which is the uplink sound signal mentioned above. The sound pickup device may be an electro-acoustic element such as a microphone.
In practice, after the downlink sound signal is played by the sound-emitting device, it propagates along the propagation path outside the terminal and forms an echo signal within the preset range around the terminal. The preset range around the terminal may therefore contain a near-end sound signal and/or the echo signal, that is, the uplink sound signal may include a near-end sound signal and/or the echo signal. Here, the near-end sound signal refers to the sound signals in the environment around the terminal other than the echo signal. When the uplink sound signal includes a near-end sound signal and the echo signal is a noise signal, the far end is not talking and the near end is talking, so the terminal is in the near-end talking state. When the uplink sound signal does not include a near-end sound signal and the echo signal is not a noise signal, the far end is talking and the near end is not, so the terminal is in the far-end talking state. When the uplink sound signal includes a near-end sound signal and the echo signal is not a noise signal, both the far end and the near end are talking, so the terminal is in the double-talk state.
Step 204: the terminal judges whether the downlink sound signal is a noise signal.
As described above, to determine whether it is in the double-talk state, the terminal needs to determine whether the echo signal is a noise signal. Because the echo signal is correlated with the downlink sound signal, the terminal can do so by determining whether the downlink sound signal is a noise signal. In practice, the terminal can make this determination as follows:
The terminal obtains the energy and the zero-crossing rate of the downlink sound signal. It then compares the energy of the downlink sound signal with the energy of a noise signal and compares the zero-crossing rate of the downlink sound signal with the zero-crossing rate of a noise signal. Finally, based on the comparison results, the terminal judges whether the downlink sound signal is a noise signal.
It should be noted that the energy is a measure of sound signal intensity, and the zero-crossing rate, also called the short-time zero-crossing rate, is the number of times per second that the signal value crosses zero. In practice, the energy of a noise signal is generally low, that is, below a preset energy threshold, and the zero-crossing rate of a noise signal is also generally low, that is, below a preset zero-crossing rate threshold. Therefore, if the energy of the downlink sound signal is below the preset energy threshold and its zero-crossing rate is below the preset zero-crossing rate threshold, the downlink sound signal is a noise signal.
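The noise test described for step 204 can be sketched as below. The criterion follows the text (a frame counts as noise when both its energy and its zero-crossing rate fall below preset thresholds). The function names, the use of mean squared amplitude as the energy measure, the frame length, and the threshold values in the example are assumptions, since the disclosure does not specify them.

```python
def short_time_energy(frame):
    """Mean squared amplitude of the frame (one common energy measure)."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame, sample_rate):
    """Sign changes between consecutive samples, scaled to crossings per second."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings * sample_rate / len(frame)


def is_noise(frame, sample_rate, energy_threshold, zcr_threshold):
    """Apply the text's rule: noise means low energy AND low zero-crossing rate."""
    return (short_time_energy(frame) < energy_threshold
            and zero_crossing_rate(frame, sample_rate) < zcr_threshold)


# Hypothetical 20 ms frames at 8 kHz with illustrative thresholds.
quiet = [0.001] * 160
loud = [0.5, -0.5] * 80
print(is_noise(quiet, 8000, 1e-4, 500.0), is_noise(loud, 8000, 1e-4, 500.0))
```

In a deployed system the two thresholds would need calibrating against recordings of the actual downlink path.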
Of course, other methods of judging whether the downlink sound signal is a noise signal exist in practice; the present disclosure does not enumerate them.
Step 205: the terminal judges whether the uplink sound signal contains a sound signal lower than the preset frequency threshold.
As described above, to determine whether it is in the double-talk state, the terminal needs to determine not only whether the echo signal is a noise signal but also whether the uplink sound signal includes a near-end sound signal. Because the echo signal is a sound signal whose frequency is higher than the preset frequency threshold, an uplink sound signal that contains a sound signal lower than the preset frequency threshold must include a near-end sound signal.
In practice, the terminal can apply a Fourier transform to the uplink sound signal to obtain its spectrum. From the spectrum, the terminal can determine the amplitude of the uplink sound signal in the frequency band below the preset frequency threshold. If that amplitude is below a preset amplitude threshold, the uplink sound signal does not contain a sound signal lower than the preset frequency threshold. If that amplitude is above the preset amplitude threshold, the uplink sound signal contains a sound signal lower than the preset frequency threshold.
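The spectral check in step 205 can be sketched with a direct discrete Fourier transform evaluated only at the bins below the threshold. The 400 Hz default comes from the embodiment above. The function names, the amplitude threshold, the exclusion of the DC bin, and the use of the peak bin magnitude as "the amplitude in the band" are assumptions made for this example.

```python
import cmath
import math


def low_band_amplitude(frame, sample_rate, cutoff_hz):
    """Largest normalised DFT magnitude among bins below cutoff_hz (DC excluded)."""
    n = len(frame)
    peak = 0.0
    k = 1  # skip bin 0 so a constant offset is not mistaken for speech
    while k <= n // 2 and k * sample_rate / n < cutoff_hz:
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(frame))
        peak = max(peak, 2.0 * abs(acc) / n)  # amplitude of a sinusoid at bin k
        k += 1
    return peak


def contains_low_frequency(frame, sample_rate, cutoff_hz=400.0,
                           amplitude_threshold=0.05):
    """True when the band below cutoff_hz exceeds the preset amplitude threshold."""
    return low_band_amplitude(frame, sample_rate, cutoff_hz) > amplitude_threshold


def tone(freq_hz, sample_rate=8000, n=400):
    """Helper for the example: a unit-amplitude sine frame."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


print(contains_low_frequency(tone(200.0), 8000))   # near-end-like content
print(contains_low_frequency(tone(1000.0), 8000))  # echo-like content only
```

A production implementation would normally use an FFT routine and windowing; the direct sum is used here only to keep the sketch free of dependencies.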
Step 206: if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal determines that it is in the double-talk state.
Based on the above steps, the terminal can determine whether the downlink sound signal is a noise signal and whether the uplink sound signal contains a sound signal lower than the preset frequency threshold. When the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal lower than the preset frequency threshold, the terminal determines that it is in the double-talk state. When the downlink sound signal is not a noise signal and the uplink sound signal does not contain a sound signal lower than the preset frequency threshold, the terminal can determine that it is in the far-end talking state. When the downlink sound signal is a noise signal and the uplink sound signal contains a signal lower than the preset frequency threshold, the terminal can determine that it is in the near-end talking state.
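The three outcomes described above amount to a small decision table over two boolean results. The sketch below encodes it. The type and function names are hypothetical, and the fourth case (downlink is noise and no low-frequency content in the uplink) is not named in the disclosure, so the IDLE label is an assumption.

```python
from enum import Enum


class TerminalState(Enum):
    DOUBLE_TALK = "double-talk"
    FAR_END_TALK = "far-end talking"
    NEAR_END_TALK = "near-end talking"
    IDLE = "idle"  # assumption: this case is not defined in the source


def determine_state(downlink_is_noise: bool,
                    uplink_has_low_freq: bool) -> TerminalState:
    """Map the results of steps 204 and 205 to a terminal state."""
    if not downlink_is_noise and uplink_has_low_freq:
        return TerminalState.DOUBLE_TALK    # both ends are talking
    if not downlink_is_noise:
        return TerminalState.FAR_END_TALK   # only the far end is talking
    if uplink_has_low_freq:
        return TerminalState.NEAR_END_TALK  # only the near end is talking
    return TerminalState.IDLE               # neither end is talking
```

Because the decision uses only two flags per frame, its cost is negligible next to the energy, zero-crossing, and spectrum computations that produce them.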
It should be noted that the order of the steps of the terminal state determining method provided by the embodiments of the present disclosure may be adjusted appropriately, and steps may be added or removed as circumstances require. Any variation that a person skilled in the art could readily conceive within the technical scope disclosed herein shall fall within the protection scope of the present disclosure, and is therefore not described further.
In summary, in the terminal state determining method provided by the embodiments of the present disclosure, the performance limits of the sound-producing device in the terminal cause the signal frequency of the echo signal to be higher than the preset frequency threshold. The terminal can therefore determine whether the far end is talking by judging whether the downlink sound signal is a noise signal, and determine whether the near end is talking by judging whether the uplink sound signal contains a sound signal below the preset frequency threshold. When both the near end and the far end are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, the terminal can be determined to be in the double-talk state. The method provided by this embodiment for determining whether the terminal is in the double-talk state is little affected by the environment outside the terminal, so it is suitable for changing external environments; it is also computationally simple, and therefore highly versatile.
Fig. 3 is a block diagram of a terminal state determining device 300 according to an exemplary embodiment. Referring to Fig. 3, the device includes a playing module 301, a pickup module 302, a judging module 303 and a determining module 304.
The playing module 301 is configured to play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold.
The pickup module 302 is configured to pick up sound signals within a preset range around the terminal to obtain an uplink sound signal.
The judging module 303 is configured to judge whether the downlink sound signal is a noise signal.
The judging module 303 is further configured to judge whether the uplink sound signal picked up by the pickup module 302 contains a signal below the preset frequency threshold.
The determining module 304 is configured to determine that the terminal is in the double-talk state when the downlink sound signal is not a noise signal and the uplink sound signal picked up by the pickup module 302 contains a signal below the preset frequency threshold.
Optionally, the judging module 303 is configured to: obtain the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal; compare the energy of the downlink sound signal with the energy of a noise signal; compare the zero-crossing rate of the downlink sound signal with the zero-crossing rate of the noise signal; and judge, according to the comparison results, whether the downlink sound signal is a noise signal.
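The energy and zero-crossing-rate comparison can be sketched as below. This passage does not state how the two comparison results are combined, so the rule used here is an assumption: a frame is treated as non-noise only if its energy clearly exceeds the noise reference energy and its zero-crossing rate is lower than the noise reference rate. The function names, the `energy_ratio` parameter and the reference values are hypothetical.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_noise(frame, noise_energy, noise_zcr, energy_ratio=4.0):
    """Assumed decision rule (not taken from the source): the frame is
    non-noise only if it is louder than the noise reference by the given
    ratio AND crosses zero less often than the noise reference."""
    louder = short_time_energy(frame) > energy_ratio * noise_energy
    smoother = zero_crossing_rate(frame) < noise_zcr
    return not (louder and smoother)


# Illustrative frames at 8 kHz: a 200 Hz tone and a low-level alternating sequence.
speech_like = [0.5 * math.sin(2 * math.pi * 200 * i / 8000) for i in range(160)]
noise_like = [0.01 * (-1) ** i for i in range(160)]

print(is_noise(speech_like, noise_energy=1e-4, noise_zcr=0.5))
print(is_noise(noise_like, noise_energy=1e-4, noise_zcr=0.5))
```

In a real terminal the noise reference energy and zero-crossing rate would have to be estimated, for example from frames known to contain no speech; how the reference is obtained is not described in this passage.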
Optionally, the playing module 301 is configured to: play the downlink sound signal through a sound-producing component of the terminal, the sound-producing component being a component that can only play sound signals above the preset frequency threshold.
Optionally, the determining module 304 is further configured to: determine that the terminal is in the far-end talking state when the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal below the preset frequency threshold.
Optionally, the determining module 304 is further configured to: determine that the terminal is in the near-end talking state when the downlink sound signal is a noise signal and the uplink sound signal contains a signal below the preset frequency threshold.
In summary, in the terminal state determining device provided by the embodiments of the present disclosure, the performance limits of the sound-producing device in the playing module cause the signal frequency of the echo signal to be higher than the preset frequency threshold. The judging module can therefore determine whether the far end is talking by judging whether the downlink sound signal is a noise signal, and determine whether the near end is talking by judging whether the uplink sound signal contains a sound signal below the preset frequency threshold. When both the near end and the far end are talking, that is, when the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, the determining module can determine that the terminal is in the double-talk state. The way the device provided by this embodiment determines whether the terminal is in the double-talk state is little affected by the environment outside the terminal, so it is suitable for changing external environments; it is also computationally simple and highly versatile.
With regard to the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and will not be elaborated here.
Fig. 4 is a block diagram of a terminal state determining device 400 according to an exemplary embodiment. For example, the device 400 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, and so on.
Referring to Fig. 4, the device 400 may include one or more of the following components: a processing component 402, a memory 404, a power component 406, a multimedia component 408, an audio component 410, an input/output (I/O) interface 412, a sensor component 414, and a communication component 416.
The processing component 402 generally controls the overall operation of the device 400, such as operations associated with display, telephone calls, data communication, camera operations and recording operations. The processing component 402 may include one or more processors 420 to execute instructions, so as to complete all or some of the steps of the method described above. In addition, the processing component 402 may include one or more modules that facilitate interaction between the processing component 402 and other components. For example, the processing component 402 may include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.
The memory 404 is configured to store various types of data to support operation of the device 400. Examples of such data include instructions for any application or method operating on the device 400, contact data, phone book data, messages, pictures, videos, and so on. The memory 404 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc.
The power component 406 provides power to the various components of the device 400. The power component 406 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 400.
The multimedia component 408 includes a screen that provides an output interface between the device 400 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 408 includes a front camera and/or a rear camera. When the device 400 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capabilities.
The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a microphone (MIC), which is configured to receive external audio signals when the device 400 is in an operating mode, such as a call mode, a recording mode or a speech recognition mode. The received audio signal may be further stored in the memory 404 or sent via the communication component 416. In some embodiments, the audio component 410 also includes a loudspeaker for outputting audio signals.
The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be a keyboard, a click wheel, buttons and the like. The buttons may include, but are not limited to, a home button, volume buttons, a start button and a lock button.
The sensor component 414 includes one or more sensors for providing status assessments of various aspects of the device 400. For example, the sensor component 414 can detect the open/closed state of the device 400 and the relative positioning of components, such as the display and keypad of the device 400. The sensor component 414 can also detect a change in position of the device 400 or of a component of the device 400, the presence or absence of user contact with the device 400, the orientation or acceleration/deceleration of the device 400, and a change in temperature of the device 400. The sensor component 414 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 414 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 414 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 416 is configured to facilitate wired or wireless communication between the device 400 and other devices. The device 400 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 416 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 404 including instructions, which can be executed by the processor 420 of the device 400 to complete the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and so on.
In an exemplary embodiment, a non-transitory computer-readable storage medium is also provided. When the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to perform the following method: playing a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold; picking up sound signals within a preset range around the terminal to obtain an uplink sound signal; judging whether the downlink sound signal is a noise signal; judging whether the uplink sound signal contains a sound signal below the preset frequency threshold; and, if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, determining that the terminal is in the double-talk state.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (11)

1. A terminal state determining method, characterized in that the method comprises:
playing a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
picking up sound signals within a preset range around the terminal to obtain an uplink sound signal;
judging whether the downlink sound signal is a noise signal;
judging whether the uplink sound signal contains a sound signal below the preset frequency threshold;
if the downlink sound signal is not a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, determining that the terminal is in a double-talk state.
2. The method according to claim 1, characterized in that judging whether the downlink sound signal is a noise signal comprises:
obtaining the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
comparing the energy of the downlink sound signal with the energy of a noise signal;
comparing the zero-crossing rate of the downlink sound signal with the zero-crossing rate of the noise signal;
judging, according to the comparison results, whether the downlink sound signal is a noise signal.
3. The method according to claim 1, characterized in that playing the downlink sound signal comprises:
playing the downlink sound signal through a sound-producing device of the terminal, the sound-producing device being a device that can only play sound signals above the preset frequency threshold.
4. The method according to claim 1, characterized in that the method further comprises:
if the downlink sound signal is not a noise signal and the uplink sound signal does not contain a sound signal below the preset frequency threshold, determining that the terminal is in a far-end talking state.
5. The method according to claim 1, characterized in that the method further comprises:
if the downlink sound signal is a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold, determining that the terminal is in a near-end talking state.
6. A terminal state determining device, characterized in that the device comprises:
a playing module, configured to play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
a pickup module, configured to pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
a judging module, configured to judge whether the downlink sound signal is a noise signal;
the judging module being further configured to judge whether the uplink sound signal picked up by the pickup module contains a signal below the preset frequency threshold;
a determining module, configured to determine that the terminal is in a double-talk state when the downlink sound signal is not a noise signal and the uplink sound signal picked up by the pickup module contains a signal below the preset frequency threshold.
7. The device according to claim 6, characterized in that the judging module is configured to:
obtain the energy of the downlink sound signal and the zero-crossing rate of the downlink sound signal;
compare the energy of the downlink sound signal with the energy of a noise signal;
compare the zero-crossing rate of the downlink sound signal with the zero-crossing rate of the noise signal;
judge, according to the comparison results, whether the downlink sound signal is a noise signal.
8. The device according to claim 6, characterized in that the playing module is configured to:
play the downlink sound signal through a sound-producing component of the terminal, the sound-producing component being a component that can only play sound signals above the preset frequency threshold.
9. The device according to claim 6, characterized in that the determining module is further configured to:
determine that the terminal is in a far-end talking state when the downlink sound signal is not a noise signal and the uplink sound signal does not contain a signal below the preset frequency threshold.
10. The device according to claim 6, characterized in that the determining module is further configured to:
determine that the terminal is in a near-end talking state when the downlink sound signal is a noise signal and the uplink sound signal contains a sound signal below the preset frequency threshold.
11. A terminal state determining device, characterized in that the device comprises:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
play a downlink sound signal, so that the downlink sound signal forms an echo signal after propagating along a propagation path outside the terminal, the signal frequency of the echo signal being higher than a preset frequency threshold;
pick up sound signals within a preset range around the terminal to obtain an uplink sound signal;
judge whether the downlink sound signal is a noise signal;
judge whether the uplink sound signal contains a signal below the preset frequency threshold;
if the downlink sound signal is not a noise signal and the uplink sound signal contains a signal below the preset frequency threshold, determine that the terminal is in a double-talk state.
CN201611233992.8A 2016-12-28 2016-12-28 Terminal state determining method and apparatus Pending CN106601269A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611233992.8A CN106601269A (en) 2016-12-28 2016-12-28 Terminal state determining method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611233992.8A CN106601269A (en) 2016-12-28 2016-12-28 Terminal state determining method and apparatus

Publications (1)

Publication Number Publication Date
CN106601269A true CN106601269A (en) 2017-04-26

Family

ID=58602843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611233992.8A Pending CN106601269A (en) 2016-12-28 2016-12-28 Terminal state determining method and apparatus

Country Status (1)

Country Link
CN (1) CN106601269A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114571473A (en) * 2020-12-01 2022-06-03 北京小米移动软件有限公司 Control method and device for foot type robot and foot type robot

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1486566A (en) * 2000-09-15 2004-03-31 英特尔公司 Residual echo estimation for echo cancellation
CN101067927A (en) * 2007-04-19 2007-11-07 北京中星微电子有限公司 Sound volume adjusting method and device
WO2010083641A1 (en) * 2009-01-20 2010-07-29 华为技术有限公司 Method and apparatus for detecting double talk
CN104157290A (en) * 2014-08-19 2014-11-19 大连理工大学 Speaker recognition method based on depth learning
CN105427868A (en) * 2015-10-30 2016-03-23 杭州乐哈思智能科技有限公司 Method for eliminating noise of VOIP system bidirectional duplex hand-free voice

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20170426)