CN108616667B

CN108616667B - Call method and device

Info

Publication number: CN108616667B
Application number: CN201810460502.0A
Authority: CN
Inventors: 谢军; 张雪元
Original assignee: Lenovo Beijing Ltd
Current assignee: Lenovo Beijing Ltd
Priority date: 2018-05-14
Filing date: 2018-05-14
Publication date: 2021-02-19
Anticipated expiration: 2038-05-14
Also published as: CN108616667A

Abstract

The present disclosure provides a method for a call, comprising: receiving a user input indicating that the user desires to end a call; in response to the user input, determining whether the call counterpart is to end the call according to an audio signal and/or a communication link state sent by the call counterpart; and ending the call if the audio signal and/or the communication link state sent by the call counterpart indicates that the call counterpart is to end the call. The present disclosure also provides an apparatus for a call. The method and apparatus provided by the disclosure can at least partially solve the problem that the user, unaware that the call counterpart wishes to keep the call, ends the call directly, and thereby help improve the user experience.

Description

Call method and device

Technical Field

The disclosure relates to a call method and device.

Background

Communication terminals play an increasingly important role in people's daily life. People use communication terminals for various functions, for example, using a smart phone to listen to music, watch videos, and conduct voice or video chats. The voice call is the most basic function of a communication terminal and is also the preferred way of exchanging information on many occasions.

A voice call has its own limitations: when one party hangs up while the other party still has information that has not been conveyed, a new call must be set up to convey that information, which degrades the user experience.

Disclosure of Invention

One aspect of the present disclosure provides a method for a call, including:

receiving a user input; when the user input indicates that the user desires to end the call, determining, in response to the user input, whether the call counterpart is to end the call according to the audio signal and/or the communication link state sent by the call counterpart, which helps the user determine whether the call counterpart still has information to convey; and ending the call if the audio signal and/or the communication link state sent by the call counterpart indicates that the call counterpart is to end the call.

Optionally, when the audio signal and/or the communication link state sent by the call counterpart indicates that the call counterpart is to keep the call, prompt information is output so that the user knows the call counterpart still has information to convey, which prevents the user from ending the call directly without knowing this.
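As an illustration only, the basic flow described above (end the call when the counterpart is also ending it, otherwise output prompt information) might be sketched as follows. The function name and the three callables are hypothetical hooks introduced for this sketch; the disclosure does not prescribe any particular programming interface.

```python
from typing import Callable


def on_end_call_request(
    counterpart_wants_to_end: Callable[[], bool],
    end_call: Callable[[], None],
    show_prompt: Callable[[], None],
) -> str:
    """Handle a user input indicating that the user desires to end the call.

    All three callables are hypothetical hooks supplied by the caller.
    """
    if counterpart_wants_to_end():
        # The counterpart is also ending the call: carry out the user input.
        end_call()
        return "ended"
    # The counterpart appears to want to keep the call: prompt the user
    # instead of ending the call directly.
    show_prompt()
    return "prompted"
```

The return value is only a convenience for callers and tests; an actual terminal would more likely drive its call state machine directly.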

Optionally, the user input indicating that the user desires to end the call includes, but is not limited to, any of: an operation corresponding to ending the call, an operation corresponding to turning off the screen of the mobile phone, or an operation of moving the proximity sensor away from an object during the call.

Optionally, whether the call counterpart is to end the call may be determined as follows. First, if the communication link state is off, or the volume of the audio signal sent by the call counterpart is continuously smaller than a first set volume threshold, it is determined that the call counterpart is to end the call. Second, if the communication link state is on and the volume of the audio signal sent by the call counterpart is larger than the first set volume threshold, it is determined that the call counterpart is to keep the call. Third, if the communication link state is on, the volume of the audio signal sent by the call counterpart is larger than a second set volume threshold, and the duration exceeds a first set duration threshold, it is determined that the call counterpart is to keep the call.

Alternatively, whether the call counterpart is to end the call may be determined as follows. First, if the communication link state is off, it is determined that the call counterpart is to end the call. Second, if the communication link state is on and the volume of the audio signal sent by the call counterpart is continuously smaller than a first set volume threshold, it is determined that the call counterpart is to end the call. Third, if the communication link state is on and the audio signal sent by the call counterpart contains one or more call ending words, or contains one or more call ending words and the volume of the audio signal corresponding to at least one call ending word is larger than a third set volume threshold, it is determined that the call counterpart is to end the call. Fourth, if the communication link state is on, the volume of the audio signal sent by the call counterpart is larger than the first set volume threshold, and the audio signal does not contain a call ending word, it is determined that the call counterpart is to keep the call. Fifth, if the communication link state is on, the volume of the audio signal sent by the call counterpart is larger than a second set volume threshold, the duration exceeds a first set duration threshold, and the audio signal does not contain a call ending word, it is determined that the call counterpart is to keep the call. Sixth, if the audio signal sent by the call counterpart contains one or more call holding words, or contains one or more call holding words and the volume of the audio signal corresponding to at least one call holding word is larger than a fourth set volume threshold, it is determined that the call counterpart is to keep the call.
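A minimal sketch of this word-aware determination is given below, under several assumptions that the disclosure does not state: the word lists are illustrative placeholders, a call holding word is given precedence over a call ending word when both are heard, a volume exactly equal to a threshold is treated as quiet, and the volume-gated variants (third and fourth set volume thresholds) are omitted for brevity. The parameter `seconds_above_second` is a hypothetical input meaning the time the volume has stayed above the second set volume threshold.

```python
from typing import Iterable

# Illustrative placeholders only; the actual preset words are device-specific.
END_WORDS = frozenset({"goodbye", "bye"})
HOLD_WORDS = frozenset({"wait", "hold on"})


def decide(
    link_on: bool,
    peak_volume: float,
    seconds_above_second: float,
    words: Iterable[str],
    first_volume: float,
    first_duration_s: float,
) -> str:
    """Return "end" or "keep" for the call counterpart.

    seconds_above_second is the time the volume stayed above the second
    set volume threshold (measured by the caller, an assumption here).
    """
    if not link_on:
        return "end"                      # link is off
    heard = {w.lower() for w in words}
    if heard & HOLD_WORDS:
        return "keep"                     # holding word heard (assumed precedence)
    if heard & END_WORDS:
        return "end"                      # ending word heard
    if peak_volume > first_volume:
        return "keep"                     # loud enough, no ending word
    if seconds_above_second > first_duration_s:
        return "keep"                     # quiet but sustained, no ending word
    return "end"                          # volume stayed at or below the first threshold
```

For example, `decide(True, 35.0, 2.0, [], 40.0, 1.5)` returns `"keep"` because the counterpart has been speaking quietly but continuously.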

Optionally, the method may further comprise the following operations. During the call, a first voiceprint feature of the audio signal sent by the call counterpart is obtained; the first voiceprint feature can characterize the identity of the call counterpart. In response to receiving an audio signal that is sent by the call counterpart and contains a call ending word, a second voiceprint feature of that audio signal is obtained; the second voiceprint feature can characterize the identity of the speaker of the audio signal containing the call ending word. If the first voiceprint feature matches the second voiceprint feature, or if the second voiceprint feature matches a pre-stored third voiceprint feature of the call counterpart, the audio signal containing the call ending word was uttered by the call counterpart, and it is determined that the call counterpart is to end the call. This prevents an audio signal containing a call ending word uttered by a speaker near the call counterpart from being mistaken for an audio signal sent by the call counterpart.
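The voiceprint check above might be sketched as follows. The disclosure does not say how voiceprint features are represented or compared; this sketch assumes, purely for illustration, that each feature is a numeric vector and that two features "match" when their cosine similarity reaches a hypothetical threshold of 0.8.

```python
import math
from typing import Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def end_word_from_counterpart(
    first: Sequence[float],
    second: Sequence[float],
    third: Optional[Sequence[float]] = None,
    threshold: float = 0.8,  # assumed value, not from the disclosure
) -> bool:
    """True if the ending word is attributed to the call counterpart.

    first:  voiceprint obtained earlier in the call
    second: voiceprint of the audio containing the call ending word
    third:  optional pre-stored voiceprint of the call counterpart
    """
    if cosine(first, second) >= threshold:
        return True
    return third is not None and cosine(third, second) >= threshold
```

When the function returns False, the ending word is treated as having come from someone near the call counterpart, so it should not by itself be taken as the counterpart ending the call.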

Optionally, the method may further comprise: during the call, if the identity attribute of the call counterpart corresponds to a category of calls that the user does not want to keep, ending the call in response to the user input. This spares the user the inconvenience of being unable to quickly end a call that the user does not wish to keep.

Optionally, the method may further comprise the operations of: during a call, if the signals acquired by the one or more sensors indicate that the user is in a situation where it is inconvenient to answer the call, the call is ended in response to the user input.
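The two optional shortcuts above (an unwanted counterpart category, or a user who cannot conveniently take the call) could be expressed as a single guard that is evaluated before any analysis of the counterpart's audio. This is only a sketch: the category labels are hypothetical, and how the sensor signals are turned into the `user_unavailable` flag is left open by the disclosure.

```python
from typing import AbstractSet


def should_skip_counterpart_check(
    counterpart_category: str,
    unwanted_categories: AbstractSet[str],
    user_unavailable: bool,
) -> bool:
    """Return True when the call should simply end on the user input.

    counterpart_category and unwanted_categories use hypothetical labels;
    user_unavailable is assumed to be derived elsewhere from sensor signals.
    """
    return counterpart_category in unwanted_categories or user_unavailable
```

For instance, with `unwanted_categories = {"telemarketing"}`, a call from a counterpart labeled `"telemarketing"` ends as soon as the user asks, without waiting to see whether the counterpart is still speaking.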

Optionally, the prompt information may prompt the user in any one or more of the following forms. In a first form, the graphic corresponding to ending the call in the graphical user interface is flashed, so that the user can see intuitively that the call counterpart is still conveying information. In a second form, the graphic corresponding to ending the call in the graphical user interface is blurred, and a waveform of the currently received audio signal is displayed at the position of that graphic. This form can indicate to the user the amount and importance of the information the call counterpart is still conveying: for example, a more complex waveform indicates that more information is being conveyed, and a higher peak indicates that more important information is being conveyed. In a third form, vibration is performed in response to the currently received audio signal containing a call holding word. This form suits the case where the call counterpart, while preparing to end the call, suddenly remembers important information that has not yet been conveyed; it can effectively prevent the user from unknowingly ending the call after the call counterpart sends an audio signal containing a call holding word.

Another aspect of the present disclosure provides a system for a call, which may include a receiving module, a determining module and a call ending module. The receiving module is configured to receive a user input indicating that the user desires to end a call. The determining module is configured to determine, in response to the user input, whether the call counterpart is to end the call according to an audio signal and/or a communication link state sent by the call counterpart. The call ending module is configured to end the call if the audio signal and/or the communication link state sent by the call counterpart indicates that the call counterpart is to end the call.

Optionally, the call system may further include a prompting module configured to output prompt information if the audio signal and/or the communication link state sent by the call counterpart indicates that the call counterpart is to keep the call.

Optionally, the determining module may include a first determining unit, a second determining unit, and a third determining unit. The first determining unit is configured to determine that the call counterpart is to end the call if the communication link state is off or the volume of the audio signal sent by the call counterpart is continuously smaller than a first set volume threshold. The second determining unit is configured to determine that the call counterpart is to keep the call if the communication link state is on and the volume of the audio signal sent by the call counterpart is larger than the first set volume threshold. The third determining unit is configured to determine that the call counterpart is to keep the call if the communication link state is on, the volume of the audio signal sent by the call counterpart is larger than a second set volume threshold, and the duration exceeds a first set duration threshold.

Optionally, the determining module may include a fourth determining unit, a fifth determining unit, a sixth determining unit, a seventh determining unit, an eighth determining unit, and a ninth determining unit. The fourth determining unit is configured to determine that the call counterpart is to end the call if the communication link state is off. The fifth determining unit is configured to determine that the call counterpart is to end the call if the communication link state is on and the volume of the audio signal sent by the call counterpart is continuously smaller than the first set volume threshold. The sixth determining unit is configured to determine that the call counterpart is to end the call if the communication link state is on and the audio signal sent by the call counterpart contains one or more call ending words, or contains one or more call ending words and the volume of the audio signal corresponding to at least one call ending word is larger than the third set volume threshold. The seventh determining unit is configured to determine that the call counterpart is to keep the call if the communication link state is on, the volume of the audio signal sent by the call counterpart is larger than the first set volume threshold, and the audio signal does not contain a call ending word. The eighth determining unit is configured to determine that the call counterpart is to keep the call if the communication link state is on, the volume of the audio signal sent by the call counterpart is larger than the second set volume threshold, the duration exceeds the first set duration threshold, and the audio signal does not contain a call ending word. The ninth determining unit is configured to determine that the call counterpart is to keep the call if the audio signal sent by the call counterpart contains one or more call holding words, or contains one or more call holding words and the volume of the audio signal corresponding to at least one call holding word is larger than the fourth set volume threshold.

Optionally, the call system may further include a first voiceprint acquisition module and a second voiceprint acquisition module. The first voiceprint acquisition module is configured to obtain, during the call, a first voiceprint feature of the audio signal sent by the call counterpart. The second voiceprint acquisition module is configured to obtain, in response to receiving an audio signal that is sent by the call counterpart and contains a call ending word, a second voiceprint feature of that audio signal. The determining module is specifically configured to determine that the call counterpart is to end the call if the first voiceprint feature matches the second voiceprint feature, or if the second voiceprint feature matches a pre-stored third voiceprint feature of the call counterpart.

Optionally, the call system may further include an identity attribute acquisition module configured to obtain the identity attribute of the call counterpart during the call. The call ending module is specifically configured to end the call in response to the user input if, during the call, the identity attribute of the call counterpart corresponds to a category of calls that the user does not want to keep.

Optionally, the call system may further include a sensor signal acquisition module configured to acquire signals through one or more sensors during the call. The call ending module is specifically configured to end the call in response to the user input if, during the call, the signals acquired by the one or more sensors indicate that the user is in a situation where it is inconvenient to answer the call.

Optionally, the prompting module may specifically include any one or more of a first prompting unit, a second prompting unit, or a third prompting unit. The first prompting unit is configured to flash the graphic corresponding to ending the call in the graphical user interface. The second prompting unit is configured to blur the graphic corresponding to ending the call in the graphical user interface and to display a waveform of the currently received audio signal at the position of that graphic. The third prompting unit is configured to perform vibration in response to the currently received audio signal containing a call holding word.

Another aspect of the present disclosure provides an apparatus for calling, including: a receiver for receiving user input indicating that the user desires to end a call, and one or more processors running a program to perform the method as described above.

Another aspect of the disclosure provides a non-volatile storage medium storing computer-executable instructions for implementing the method as described above when executed.

Another aspect of the disclosure provides a computer program comprising computer executable instructions for implementing the method as described above when executed.

Drawings

For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:

Fig. 1A schematically illustrates a first application scenario of the method and apparatus for a call according to an embodiment of the present disclosure;

Fig. 1B schematically illustrates a second application scenario of the method and apparatus for a call according to an embodiment of the present disclosure;

Fig. 2A schematically illustrates a first flowchart of the method for a call according to an embodiment of the present disclosure;

Fig. 2B schematically illustrates a second flowchart of the method for a call according to an embodiment of the present disclosure;

Fig. 2C schematically illustrates a third flowchart of the method for a call according to an embodiment of the present disclosure;

Fig. 2D schematically illustrates a fourth flowchart of the method for a call according to an embodiment of the present disclosure;

Fig. 2E schematically illustrates a fifth flowchart of the method for a call according to an embodiment of the present disclosure;

Fig. 3A schematically illustrates a first block diagram of the system for a call according to an embodiment of the present disclosure;

Fig. 3B schematically illustrates a second block diagram of the system for a call according to an embodiment of the present disclosure;

Fig. 3C schematically illustrates a third block diagram of the system for a call according to an embodiment of the present disclosure;

Fig. 3D schematically illustrates a fourth block diagram of the system for a call according to an embodiment of the present disclosure;

Fig. 3E schematically illustrates a fifth block diagram of the system for a call according to an embodiment of the present disclosure;

Fig. 4 schematically illustrates a block diagram of the apparatus for a call according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.

Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). Where a convention analogous to "at least one of A, B or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" should be understood to include the possibility of "A" or "B", or "A and B".

Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks. It should be noted that the above-mentioned computer may refer to a computer device including a call function, for example, a mobile terminal having a call function, and more specifically, may be a mobile phone, a smart phone, and the like.

Accordingly, the techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). In addition, the techniques of this disclosure may take the form of a computer program product on a computer-readable medium having instructions stored thereon for use by or in connection with an instruction execution system. In the context of this disclosure, a computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the instructions. For example, the computer readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the computer readable medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.

Embodiments of the present disclosure provide a method for a call and a system to which the method can be applied. The method includes: receiving a user input; when the user input indicates that the user desires to end the call, determining, in response to the user input, whether the call counterpart is to end the call according to an audio signal and/or a communication link state sent by the call counterpart; and ending the call if the audio signal and/or the communication link state sent by the call counterpart indicates that the call counterpart is to end the call. In this way, when the user input indicates that the user desires to end the call, whether the call counterpart is to end the call is first determined from the audio signal and/or the communication link state sent by the call counterpart, and the call is ended if the call counterpart is to end it. This avoids the situation in which the call counterpart still wishes to keep the call but the user, unaware of this, ends the call directly, so that the information the call counterpart wanted to convey never reaches the user during the call and the user experience suffers.

Fig. 1A to 1B schematically illustrate application scenarios of the call method and apparatus according to an embodiment of the present disclosure. It should be noted that the application scenarios shown in fig. 1A to 1B are only examples of scenarios in which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, but do not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios. For example, both parties may communicate using a wristwatch, glasses, tablet computer, or the like having a communication function.

Fig. 1A schematically shows a first application scenario of the method and apparatus for a call according to an embodiment of the disclosure.

A user is in a call with a call counterpart through a mobile phone provided with a proximity sensor (p-sensor). When the user moves the mobile phone away from the head, the p-sensor sends the mobile phone processor a signal indicating that the mobile phone is away from the head, and the processor preliminarily determines that the user intends to end the call. The processor then determines whether the call counterpart is to end the call according to the audio signal and/or the communication link state sent by the call counterpart and, if so, acts on the user input, that is, ends the call. Of course, the call may instead be ended only after the user gives a second confirmation: for example, the call is ended when the processor has received from the p-sensor the signal indicating that the mobile phone is away from the head, has determined that the call counterpart is to end the call, and has received an instruction corresponding to ending the call input by the user.

Fig. 1B schematically shows a second application scenario of the method and apparatus for a call according to an embodiment of the disclosure.

A user is in a call with a call counterpart through a mobile phone to which a call earphone is connected. When the user inputs an operation indicating that the user desires to end the call, either through the end-call graphic displayed on the mobile phone screen or through the end-call key of the call earphone, the processor determines that the user intends to end the call. The processor then determines whether the call counterpart is to end the call according to the audio signal and/or the communication link state sent by the call counterpart and, if so, acts on the user input. For example, the call is ended when the processor has received a signal corresponding to ending the call input by the user and has determined that the call counterpart is to end the call. Of course, this scenario also applies to a call conducted using the hands-free function of the mobile phone, which is not described in detail herein.

Fig. 2A-2E schematically illustrate various flow diagrams of a method for telephony in accordance with an embodiment of the present disclosure.

As shown in fig. 2A, the method includes operations S201 to S203.

In operation S201, a user input indicating that the user desires to end a call is received.

The user input may be any type of input known in the prior art, for example, entering characters or entering a specific instruction, and specifically may be an operation such as clicking or sliding. The user input in this embodiment is a specific user input indicating that the user desires to end the call. In a specific embodiment, the user input includes any one of the following: an operation corresponding to ending the call, an operation corresponding to turning off the screen of the mobile phone, or an operation of moving the proximity sensor away from an object during the call.

The operation corresponding to ending the call may be a preset operation, such as clicking or sliding the graphic corresponding to ending the call in the graphical user interface, or performing on the mobile phone a preset operation corresponding to ending the call (swinging it, placing it screen-down on another object, and the like). The operation corresponding to turning off the screen of the mobile phone may be, for example, clicking or sliding the graphic corresponding to turning off the screen in the graphical user interface, or performing on the mobile phone a preset operation corresponding to turning off the screen (swinging it, placing it screen-down on another object, and the like). The operation of moving the proximity sensor away from an object during the call may be, for example, moving the mobile phone having the proximity sensor away from an object (e.g., the head).
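As a hedged illustration of operation S201, the three kinds of user input listed above can be modeled as events that a terminal maps to an "end the call" intent. The enumeration and function names are hypothetical and stand in for whatever input events a particular terminal platform actually reports.

```python
from enum import Enum, auto


class UserEvent(Enum):
    END_CALL_OPERATION = auto()    # e.g. clicking or sliding the end-call graphic
    SCREEN_OFF_OPERATION = auto()  # e.g. operating the screen-off graphic
    PROXIMITY_FAR = auto()         # proximity sensor moved away from an object
    OTHER = auto()                 # any input unrelated to ending the call


# Events treated as indicating that the user desires to end the call.
END_INTENT_EVENTS = frozenset({
    UserEvent.END_CALL_OPERATION,
    UserEvent.SCREEN_OFF_OPERATION,
    UserEvent.PROXIMITY_FAR,
})


def indicates_end_intent(event: UserEvent, in_call: bool) -> bool:
    """True if the event, received during a call, signals intent to end it."""
    return in_call and event in END_INTENT_EVENTS
```

The `in_call` guard reflects that these inputs only carry this meaning while a call is in progress; turning off the screen at other times is ordinary use.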

In operation S202, in response to the user input, it is determined whether the call partner is to end the call according to an audio signal and/or a communication link state transmitted by the call partner.

If the communication link state is off, the call counterpart has already ended the call, so it is determined that the call counterpart is to end the call. If the call counterpart is still sending audio signals, whether the call counterpart wants to end the call can be determined from the received audio signals.

In one embodiment, determining whether the call partner is to end the call according to the audio signal and/or the communication link status sent by the call partner comprises:

If the communication link state is off, or the volume of the audio signal sent by the call counterpart is continuously smaller than the first set volume threshold, it is determined that the call counterpart is to end the call. A volume that stays below the first set volume threshold indicates that, even if the call counterpart is still speaking, the call counterpart is probably not speaking into the microphone; the call counterpart therefore has no information to convey to the user, that is, wishes to end the call. The first set volume threshold may be determined empirically or experimentally and may be expressed in decibels.

If the communication link state is on and the volume of the audio signal sent by the call counterpart is larger than the first set volume threshold, it is determined that the call counterpart is to keep the call. A volume larger than the first set volume threshold indicates that at least part of the audio signal sent by the call counterpart carries information that the call counterpart wants to convey to the user, that is, the call counterpart wishes to keep the call.

And if the communication link is in a connected state, the volume value of the audio signal sent by the opposite call party is greater than a second set volume threshold value, and the duration exceeds a first set duration threshold value, determining that the opposite call party needs to keep calling. If the volume value of the audio signal sent by the call partner is greater than the second set volume threshold and the duration exceeds the first set duration threshold, the call partner is indicated to speak all the time, and although the sound may be small, the call partner can be indicated to send information to the user continuously. It should be noted that the second set volume threshold may be smaller than or equal to the first set volume threshold, and the first set duration threshold may be determined according to experience or experimental results, for example, the duration required for completing a phrase or specifying a number at a normal speed.

Through the operation, whether the call counterpart is to end the call can be determined.
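The three rules of this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the audio is assumed to arrive as per-frame volume values in decibels, the default threshold values and the frame length are made up, and the order in which the rules are checked (keep-call rules before the low-volume rule) is an assumption, since the description does not fix a priority. The names `Thresholds`, `longest_run_above`, and `counterpart_wants_to_end` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Thresholds:
    # All default values are illustrative assumptions, not taken from the disclosure.
    first_volume_db: float = -30.0   # first set volume threshold
    second_volume_db: float = -45.0  # second set volume threshold (<= first)
    first_duration_s: float = 1.0    # first set duration threshold


def longest_run_above(volumes_db: Sequence[float], threshold_db: float) -> int:
    """Length, in frames, of the longest consecutive run above the threshold."""
    best = run = 0
    for v in volumes_db:
        run = run + 1 if v > threshold_db else 0
        best = max(best, run)
    return best


def counterpart_wants_to_end(
    link_connected: bool,
    volumes_db: Sequence[float],
    frame_s: float = 0.02,
    th: Thresholds = Thresholds(),
) -> bool:
    """Return True if the counterpart is judged to be ending the call."""
    if not link_connected:
        return True  # link disconnected: counterpart has ended the call
    if any(v > th.first_volume_db for v in volumes_db):
        return False  # loud enough: counterpart keeps the call
    sustained_s = longest_run_above(volumes_db, th.second_volume_db) * frame_s
    if sustained_s > th.first_duration_s:
        return False  # quiet but sustained speech: counterpart keeps the call
    return True  # volume stays low: counterpart is to end the call
```

With the assumed defaults, 100 consecutive frames at -40 dB (2 seconds of quiet speech) count as keeping the call, while 10 such frames do not.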

In another embodiment, in order to more accurately determine whether a call partner is to end a call, the determining whether the call partner is to end a call according to an audio signal and/or a communication link state transmitted by the call partner may include various ways as shown below.

If the communication link state is disconnected, it is determined that the call counterpart is to end the call.

If the communication link state is connected, it is determined that the call counterpart is to end the call when the volume of the audio signal sent by the call counterpart is continuously smaller than the first set volume threshold.

If the communication link state is connected, it is determined that the call counterpart is to end the call when the audio signal sent by the call counterpart contains one or more call ending words, or when it contains one or more call ending words and the volume of the audio signal corresponding to at least one call ending word is greater than a third set volume threshold. The call ending words may be preset words stored in the call device, for example "goodbye", "bye-bye", "talk next time", "talk later", "hanging up", "that's all then", and the like. Recognizing a call ending word may include: inputting the audio signal into a pre-trained speech recognition model and, after the words corresponding to the audio signal are recognized, judging by a matching method whether the current audio signal includes a call ending word. To improve the accuracy of speech recognition, operations such as noise reduction and obtaining sentences or phrases in the audio signal by an endpoint detection technique may also be performed, which are not described in detail here. The third set volume threshold may be less than or equal to the second set volume threshold.

If the communication link state is connected, it is determined that the call counterpart is to keep the call when the volume of the audio signal sent by the call counterpart is greater than the first set volume threshold and the audio signal does not include a call ending word.

If the communication link state is connected, it is determined that the call counterpart is to keep the call when the volume of the audio signal sent by the call counterpart is greater than the second set volume threshold, the duration exceeds the first set duration threshold, and the audio signal does not include a call ending word. The recognition of call ending words may be as described above and is not repeated here.

If the audio signal sent by the call counterpart contains one or more call holding words, or contains one or more call holding words and the volume of the audio signal corresponding to at least one call holding word is greater than a fourth set volume threshold, it is determined that the call counterpart is to keep the call. The call holding words may be preset words stored in the call device, for example words such as "wait", "come back", "hello", "I forgot", and the like, which indicate that the call counterpart wishes to keep the call. Recognizing a call holding word may include: inputting the audio signal into a pre-trained speech recognition model and, after the words corresponding to the audio signal are recognized, judging by a matching method whether the current audio signal includes a call holding word. To improve the accuracy of speech recognition, operations such as noise reduction and obtaining sentences or phrases in the audio signal by an endpoint detection technique may also be performed, which are not described in detail here. The fourth set volume threshold may be less than or equal to the second set volume threshold.
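The word-based rules of this embodiment can be sketched as follows, assuming the speech recognition step has already produced a text transcript. The sketch is illustrative only: the word lists, the threshold values, and the names `has_word` and `classify` are hypothetical; only the plain variant of each word rule (without the third and fourth volume thresholds) is shown; and checking call holding words before call ending words is an assumed priority that the description does not specify.

```python
import re
from enum import Enum
from typing import Iterable


class Intent(Enum):
    END = "end"
    KEEP = "keep"


# Illustrative word lists; a real device would store its own preset words.
END_WORDS = ("goodbye", "bye-bye", "talk later")
HOLD_WORDS = ("wait", "hold on", "i forgot")


def has_word(transcript: str, words: Iterable[str]) -> bool:
    """Whole-word, case-insensitive match of any preset word in the transcript."""
    text = transcript.lower()
    return any(re.search(r"\b" + re.escape(w) + r"\b", text) for w in words)


def classify(
    link_connected: bool,
    transcript: str,
    peak_db: float,
    sustained_s: float,
    first_db: float = -30.0,        # first set volume threshold (assumed value)
    second_db: float = -45.0,       # second set volume threshold (assumed value)
    first_duration_s: float = 1.0,  # first set duration threshold (assumed value)
) -> Intent:
    if not link_connected:
        return Intent.END
    if has_word(transcript, HOLD_WORDS):  # assumed to take priority
        return Intent.KEEP
    if has_word(transcript, END_WORDS):
        return Intent.END
    if peak_db > first_db:
        return Intent.KEEP                # loud, no call ending word
    if peak_db > second_db and sustained_s > first_duration_s:
        return Intent.KEEP                # quiet but sustained, no call ending word
    return Intent.END                     # volume stays below the thresholds
```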

In operation S203, in case that the audio signal and/or the communication link status transmitted by the call partner indicate that the call partner is to end the call, the call is ended.

In this embodiment of the disclosure, the user input indicating that the user wants to end the call is acted on directly only when the call counterpart also wants to end the call. This at least partially avoids the situation in which the user hangs up without knowing that the call counterpart wants to keep the call.

Fig. 2B schematically shows a second flowchart of a method for calling according to an embodiment of the present disclosure.

In this embodiment, the method may further include operation S204: outputting prompt information when the audio signal and/or the communication link state sent by the call counterpart indicate that the call counterpart is to keep the call.

In one specific embodiment, the call counterpart being to keep the call may correspond to either of the following situations. In the first situation, the communication link state is connected and the volume of the audio signal sent by the call counterpart is greater than the first set volume threshold. In the second situation, the communication link state is connected, the volume of the audio signal sent by the call counterpart is greater than the second set volume threshold, and the duration exceeds the first set duration threshold.

In another embodiment, the call counterpart being to keep the call may correspond to any of the following situations. In the first situation, the communication link state is connected, the volume of the audio signal sent by the call counterpart is greater than the first set volume threshold, and the audio signal does not include a call ending word. In the second situation, the communication link state is connected, the volume of the audio signal sent by the call counterpart is greater than the second set volume threshold, the duration exceeds the first set duration threshold, and the audio signal does not include a call ending word. In the third situation, the audio signal sent by the call counterpart includes one or more call holding words, or includes one or more call holding words and the volume of the audio signal corresponding to at least one call holding word is greater than the fourth set volume threshold. It should be noted that this embodiment determines more accurately than the previous one that the call counterpart is to keep the call. For the set volume thresholds and the set duration threshold, reference may be made to the embodiment corresponding to fig. 2A.

The prompt information may be output in the form of sound or light, or in the form of mechanical vibration and the like.

In a specific embodiment, outputting the prompt information may include any one or more of the following.

Flashing the graphic corresponding to ending the call in the graphical user interface; for example, the image corresponding to ending the call flashes.

Blurring the graphic corresponding to ending the call in the graphical user interface and displaying a waveform diagram of the currently received audio signal at the position of that graphic; for example, the waveform of the audio received in real time is displayed at the position of the graphic for ending the call, where a more complex waveform indicates that the call counterpart is transmitting more information, and a higher peak indicates that the call counterpart is speaking more loudly.

Vibrating in response to the currently received audio signal including a call holding word. The vibration prompts the user that the call counterpart has important information that is being sent or is about to be sent, so that the important information is not missed.
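The choice between ending the call (operation S203) and prompting the user (operation S204) can be sketched as follows. This is an illustrative sketch only: the action names are hypothetical strings standing in for real user-interface and vibration calls, and the flags `use_flash` and `use_waveform` are assumed configuration options reflecting that any one or more of the prompts may be used.

```python
from typing import List


def select_prompts(
    hold_word_detected: bool,
    use_flash: bool = True,
    use_waveform: bool = True,
) -> List[str]:
    """Pick the prompts to output when the counterpart is to keep the call."""
    prompts: List[str] = []
    if use_flash:
        prompts.append("flash_end_call_graphic")
    if use_waveform:
        prompts.append("blur_end_call_graphic_and_show_waveform")
    if hold_word_detected:
        prompts.append("vibrate")  # only when a call holding word was received
    return prompts


def handle_end_request(counterpart_wants_to_end: bool, hold_word_detected: bool) -> List[str]:
    """Return the actions taken in response to the user's end-call input."""
    if counterpart_wants_to_end:
        return ["end_call"]                    # operation S203
    return select_prompts(hold_word_detected)  # operation S204
```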

Fig. 2C schematically shows a third flowchart of a method for calling according to an embodiment of the present disclosure.

In this embodiment, the method may further include operations S205 to S207.

In operation S205, during a call, a first voiceprint feature of an audio signal sent by the call counterpart is acquired. Voiceprint features are particularly suitable for remote identity verification, and the audio signal of the call counterpart is easy to obtain during a call, so in this embodiment voiceprint features are used to determine whether the speaker of the audio containing the call ending word is the same person as the call counterpart. In particular, the voiceprint features can include any one or more of: acoustic features (cepstrum), lexical features (e.g., n-grams of speaker-dependent words, phoneme n-grams), prosodic features (pitch and energy "poses" described by n-grams), phonetic information (including language, dialect, and accent information), channel information (what channel is used), and so forth.

In operation S206, in response to receiving an audio signal containing a call ending word sent by the call counterpart, a second voiceprint feature of that audio signal is obtained. The second voiceprint feature may be extracted in the same way as the first voiceprint feature, differing only in the object of feature extraction: the first voiceprint feature corresponds to an audio signal during the call that does not include a call ending word, while the second voiceprint feature corresponds to an audio signal during the call that includes a call ending word. Preferably, the second voiceprint feature corresponds to an audio signal including a call ending word that is received after the user input indicating that the user desires to end the call.

In operation S207, if the first voiceprint feature matches the second voiceprint feature, or if the second voiceprint feature matches a third voiceprint feature of the other party of the call, which is stored in advance, it is determined that the other party of the call is to end the call.

Specifically, the matching method may include the following several ways: hidden Markov Model (HMM) methods, for example, a single state HMM or Gaussian Mixture Model (GMM) may be used; neural network methods, for example, a neural network for voiceprint recognition is trained in advance to perform voiceprint recognition. In addition, for the third voiceprint feature of the call partner which is stored in advance, the third voiceprint feature is preferably a voiceprint feature which is extracted in advance by using the voice about the call ending word sent by the call partner, and in this case, a template matching method can be adopted, so that the accuracy and the efficiency are high.

In this embodiment of the disclosure, after an audio signal containing a call ending word is received, the voiceprint feature is further used to judge whether that audio signal was actually sent by the call counterpart. This excludes the case in which speech from a person near the call counterpart contains a call ending word and the call counterpart is mistakenly judged to want to end the call, which improves the user experience.
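The matching decision of operation S207 can be sketched as follows. Note that this sketch substitutes a simple cosine-similarity comparison of fixed-length feature vectors for the HMM, GMM, neural-network, or template-matching methods named above; it is an assumption made to keep the example short, not the matching method of the disclosure. The similarity threshold and the names `cosine_similarity` and `end_word_from_counterpart` are likewise hypothetical.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def end_word_from_counterpart(
    first: Sequence[float],                    # voiceprint from earlier call audio
    second: Sequence[float],                   # voiceprint of audio with the ending word
    stored_third: Optional[Sequence[float]] = None,  # pre-stored voiceprint, if any
    min_similarity: float = 0.8,               # assumed matching threshold
) -> bool:
    """True if the call ending word is judged to come from the call counterpart."""
    if cosine_similarity(first, second) >= min_similarity:
        return True
    if stored_third is not None:
        return cosine_similarity(second, stored_third) >= min_similarity
    return False
```

If neither comparison matches, the call ending word is treated as coming from someone other than the call counterpart, so it does not by itself indicate that the counterpart is to end the call.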

Fig. 2D schematically shows a fourth flowchart of a method for calling according to an embodiment of the present disclosure.

In this embodiment, the method may further include an operation S208 of ending the call in response to the user input if the identity attribute of the call partner corresponds to a category in which the user does not wish to keep the call during the call.

In this embodiment, the identity attribute may be the category of the group to which the call counterpart belongs in the address book, for example family, leader, client, colleague, classmate, stranger, and the like. The category may also be determined from the contact information of the call counterpart, for example a category given by a server according to the contact information (e.g., the mobile phone number, the landline number, the category of the company to which the mobile phone number belongs, etc.), such as ordinary call, promotional call, nuisance call, unknown call, and the like. Different categories may be given different response levels. For example, the highest prompt level may be set for family, leaders, clients, and colleagues, so that once the call counterpart sends an audio signal while the user intends to end the call, the user is prompted in the most obvious way that the counterpart wants to keep the call; and the lowest prompt level may be set for promotional calls, nuisance calls, and the like, so that the call is ended as soon as the user desires to end it.

Further, the first set volume threshold, the second set volume threshold, the third set volume threshold, the fourth set volume threshold, the first set duration threshold, and the like may be dynamic thresholds, and different categories may correspond to different thresholds; for example, a more important category corresponds to thresholds with higher sensitivity. The possibilities are not listed one by one here.

In addition, the prompting modes corresponding to different categories may also differ. For example, important categories are prompted in a more obvious manner.

Through the above operations, the user can conveniently and quickly end a call that the user does not wish to keep, which improves user satisfaction.
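The category-dependent behavior can be sketched as follows. The mapping of family, leader, client, and colleague to the highest level and of promotional and nuisance calls to the lowest level follows the description; the intermediate default level for other categories, the numeric levels, and the way the first set volume threshold is lowered for more important categories are assumptions for illustration. All names are hypothetical.

```python
HIGHEST, DEFAULT, LOWEST = 2, 1, 0

# Categories named in the description; anything else gets an assumed DEFAULT level.
PROMPT_LEVELS = {
    "family": HIGHEST,
    "leader": HIGHEST,
    "client": HIGHEST,
    "colleague": HIGHEST,
    "promotional call": LOWEST,
    "nuisance call": LOWEST,
}


def prompt_level(category: str) -> int:
    return PROMPT_LEVELS.get(category, DEFAULT)


def end_immediately(category: str) -> bool:
    """Operation S208: end the call directly for categories the user does not wish to keep."""
    return prompt_level(category) == LOWEST


def first_volume_threshold_db(category: str, base_db: float = -30.0, step_db: float = 5.0) -> float:
    """Dynamic threshold: a more important category gets a lower, more sensitive threshold."""
    return base_db - step_db * prompt_level(category)
```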

Fig. 2E schematically shows a fifth flowchart of a method for calling according to an embodiment of the present disclosure.

In this embodiment, the method may further include an operation S209 of ending the call in response to the user input if the signals acquired by the one or more sensors indicate that the user is in a situation where it is inconvenient to answer the call during the call.

In this embodiment, the current state of the user can be judged from information collected by one or more sensors, and from that state it can be judged whether it is currently convenient for the user to answer the call. If it is inconvenient, the user input can be responded to directly and the call ended. For example, when the signal collected by the speed sensor and/or the GPS indicates that the user is running (for example, the moving speed of the terminal is greater than a preset first speed threshold and less than a preset second speed threshold), it is currently inconvenient for the user to answer the call, and the call can be ended directly in response to the user input. For another example, when the signal collected by the voltage sensor indicates that the battery level is low, it is currently inconvenient for the user to answer the call, and the call can be ended directly in response to the user input. For another example, when the signal collected by the current sensor indicates that the mobile terminal is currently charging, and the signal collected by the speed sensor and/or the GPS indicates that the user is moving at high speed (for example, the moving speed of the mobile terminal is greater than the preset second speed threshold), the user may be driving and it is inconvenient to answer the call, so the call can be ended directly in response to the user input. The first speed threshold and the second speed threshold may be determined empirically or experimentally; for example, the first speed threshold may be a normal walking speed.

By the operation, the user can finish the call in time, and the user experience satisfaction is improved.
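The sensor-based judgment of operation S209 can be sketched as follows. The three conditions (running, low battery, charging while moving fast) follow the examples above; the numeric thresholds, the representation of the battery state as a fraction, and the names `SensorSnapshot` and `inconvenient_to_answer` are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorSnapshot:
    speed_m_s: float         # from the speed sensor and/or GPS
    charging: bool           # derived from the current sensor
    battery_fraction: float  # derived from the voltage sensor, 0.0 to 1.0 (assumed form)


def inconvenient_to_answer(
    s: SensorSnapshot,
    first_speed_m_s: float = 2.0,   # assumed: roughly a normal walking speed
    second_speed_m_s: float = 8.0,  # assumed upper bound for running
    low_battery: float = 0.1,       # assumed low-battery level
) -> bool:
    """True if the sensors suggest it is inconvenient for the user to stay on the call."""
    running = first_speed_m_s < s.speed_m_s < second_speed_m_s
    driving = s.charging and s.speed_m_s > second_speed_m_s
    low_power = s.battery_fraction < low_battery
    return running or driving or low_power
```

When this function returns True, the call would be ended directly in response to the user input, without first checking whether the call counterpart wants to keep the call.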

Fig. 3A to 3E schematically show block diagrams of a system for calling according to an embodiment of the present disclosure.

Fig. 3A schematically shows a first block diagram of a system for calling according to an embodiment of the present disclosure.

The system for calling can comprise the following modules: a receiving module 301, a determining module 302 and a call ending module 303.

The receiving module 301 is configured to receive a user input indicating that the user desires to end the call.

The determining module 302 is configured to determine, in response to the user input, whether the call counterpart is to end the call according to an audio signal and/or a communication link state sent by the call counterpart.

The call ending module 303 is configured to end the call when the audio signal and/or the communication link state sent by the call counterpart indicate that the call counterpart is to end the call.
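The cooperation of the three modules can be sketched as a small class that wires them together. This is an illustrative sketch only: the class `CallSystem`, its method `receive`, and the idea of passing the determining and call-ending roles in as callables are assumptions, not the structure of the disclosed system.

```python
from typing import Callable, Sequence


class CallSystem:
    """Wires together the roles of modules 301 (receive), 302 (determine), 303 (end call)."""

    def __init__(
        self,
        counterpart_ends: Callable[[bool, Sequence[float]], bool],  # role of module 302
        hang_up: Callable[[], None],                                # role of module 303
    ) -> None:
        self._counterpart_ends = counterpart_ends
        self._hang_up = hang_up

    def receive(self, link_connected: bool, volumes_db: Sequence[float]) -> bool:
        """Role of module 301: called when the user input to end the call arrives.

        Returns True if the call was ended, False if it was kept.
        """
        if self._counterpart_ends(link_connected, volumes_db):
            self._hang_up()
            return True
        return False
```

For example, the determining role can be filled by any function that takes the link state and recent volume values, which keeps the decision logic replaceable between embodiments.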

According to an embodiment of the present disclosure, the user input may include any one of: an operation corresponding to ending the call, an operation corresponding to turning off the screen of the mobile phone, or an operation of moving the proximity sensor away from an object during the call.

In a particular embodiment, the determination module 302 may include a first determination unit, a second determination unit, and a third determination unit.

The first determining unit is used for determining that the call of the opposite party is to be ended if the communication link state is disconnected or the value of the volume of the audio signal sent by the opposite party is continuously smaller than a first set volume threshold value.

The second determination unit is used for determining that the communication opposite side needs to keep talking if the communication link state is on and the value of the volume of the audio signal sent by the communication opposite side is larger than the first set volume threshold value.

The third determining unit is used for determining that the communication opposite side needs to keep communicating if the communication link state is on, the volume value of the audio signal sent by the communication opposite side is larger than the second set volume threshold value, and the duration exceeds the first set duration threshold value.

In another embodiment, the determining module 302 may include the following units: a fourth determining unit, a fifth determining unit, a sixth determining unit, a seventh determining unit, an eighth determining unit, and a ninth determining unit.

The fourth determining unit is configured to determine that the call partner is to end the call if the communication link state is disconnected.

The fifth determination unit is used for determining that the call counterpart is to end the call if the communication link state is on and the value of the volume of the audio signal sent by the call counterpart is continuously smaller than the first set volume threshold.

The sixth determining unit is configured to determine that the call counterpart is to end the call if the communication link state is connected and the audio signal sent by the call counterpart contains one or more call ending words, or contains one or more call ending words and the volume of the audio signal corresponding to at least one call ending word is greater than a third set volume threshold.

The seventh determining unit is used for determining that the call counterpart wants to keep calling if the communication link state is on and the value of the volume of the audio signal sent by the call counterpart is larger than the first set volume threshold and the audio signal does not include the call end word.

The eighth determining unit is configured to determine that the call counterpart is to keep the call if the communication link state is connected, the volume of the audio signal sent by the call counterpart is greater than the second set volume threshold, the duration exceeds the first set duration threshold, and the audio signal does not include a call ending word.

The ninth determining unit is configured to determine that the call counterpart wants to keep calling if the audio signal sent by the call counterpart includes one or more call holding words, or the audio signal sent by the call counterpart includes one or more call holding words and a value of a volume of an audio signal corresponding to at least one of the call holding words is greater than a fourth set volume threshold.

Fig. 3B schematically shows a second block diagram of a system for calling according to an embodiment of the present disclosure. In this embodiment, the system may further include a prompting module 304.

The prompting module 304 is configured to output a prompting message when the audio signal and/or the communication link state sent by the call partner represent that the call partner is to keep on talking.

In a preferred embodiment, the prompting module 304 includes any one or more of the following: the first prompting unit, the second prompting unit or the third prompting unit.

The first prompting unit is used for prompting by flashing the graphic corresponding to ending the call in the graphical user interface.

The second prompting unit is used for blurring a graph corresponding to the call ending in the graphical user interface and displaying a waveform diagram of the currently received audio signal at the position of the graph for ending the call.

The third prompting unit is used for vibrating in response to the currently received audio signal including a call holding word.

Fig. 3C schematically shows a third block diagram of a system for calling according to an embodiment of the present disclosure.

In this embodiment, the system may further include the following modules: a first voiceprint acquisition module 305 and a second voiceprint acquisition module 306.

The first voiceprint obtaining module 305 is configured to obtain a first voiceprint feature of an audio signal sent by a call partner in a call process.

The second voiceprint obtaining module 306 is configured to obtain a second voiceprint feature of the audio signal containing the call ending word in response to receiving the audio signal containing the call ending word sent by the call counterpart.

The determining module 302 is specifically configured to determine that the other party is to end the call if the first voiceprint feature matches the second voiceprint feature, or if the second voiceprint feature matches a third voiceprint feature of the other party, which is stored in advance.

To further improve the accuracy of the system, fig. 3D schematically shows a fourth block diagram of a system for calling according to an embodiment of the present disclosure.

In this embodiment, the system may further include an identity attribute obtaining module 307.

The identity attribute obtaining module 307 is configured to obtain an identity attribute of the other party during the call.

The call ending module 303 is specifically configured to, in a call process, end a call in response to the user input if the identity attribute of the call counterpart corresponds to a category in which the user does not want to keep the call.

In another embodiment, fig. 3E schematically shows a fifth block diagram of a system for calling according to an embodiment of the present disclosure.

In this embodiment, the system may further include a sensor signal acquisition module 308.

The sensor signal acquiring module 308 is configured to acquire signals through one or more sensors during a call.

The call ending module 303 is specifically configured to, in a call process, end a call in response to the user input if a signal acquired by one or more sensors indicates that the user is in a situation where it is inconvenient to answer the call.

Fig. 4 schematically shows a block diagram of an apparatus for conversation according to an embodiment of the present disclosure. The apparatus shown in fig. 4 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.

As shown in fig. 4, the apparatus 400 for calling includes a processor 410 and a readable storage medium 420. The apparatus 400 may perform a method according to an embodiment of the present disclosure.

In particular, processor 410 may include, for example, a general purpose microprocessor, an instruction set processor and/or related chip set and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), and/or the like. The processor 410 may also include onboard memory for caching purposes. Processor 410 may be a single processing unit or a plurality of processing units for performing different actions of a method flow according to embodiments of the disclosure.

Readable storage medium 420 may be, for example, any medium that can contain, store, communicate, propagate, or transport the instructions. For example, a readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the readable storage medium 420 include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); memory, such as Random Access Memory (RAM), read-only memory (ROM); optical storage devices, such as compact disks (CD-ROMs); and/or wired/wireless communication links.

The readable storage medium 420 may comprise a computer program 421, which computer program 421 may comprise code/computer-executable instructions that, when executed by the processor 410, cause the processor 410 to perform a method according to an embodiment of the disclosure or any variant thereof.

The computer program 421 may be configured with, for example, computer program code comprising computer program modules. For example, in an example embodiment, code in computer program 421 may include one or more program modules, including for example program module 421A, program module 421B, and so on. It should be noted that the division and number of the modules are not fixed, and those skilled in the art may use suitable program modules or program module combinations according to actual situations, so that the processor 410 may execute the method according to the embodiment of the present disclosure or any variation thereof when the program modules are executed by the processor 410.

According to an embodiment of the present disclosure, the processor 410 may interact with the readable storage medium 420 to perform a method according to an embodiment of the present disclosure or any variant thereof.

According to an embodiment of the present disclosure, at least one of the receiving module 301, the determining module 302, the call ending module 303, the prompting module 304, the first voiceprint obtaining module 305, the second voiceprint obtaining module 306, the identity attribute obtaining module 307, and the sensor signal acquiring module 308 can be implemented as a computer program module as described with reference to fig. 4, which when executed by the processor 410 can implement the corresponding operations described above.

The present disclosure also provides a computer-readable medium, which may be embodied in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer readable medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.

According to embodiments of the present disclosure, a computer readable medium may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, optical fiber cable, radio frequency signals, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or associated in various ways, even if such combinations or associations are not expressly recited in the present disclosure. In particular, such combinations and/or associations may be made without departing from the spirit or teaching of the present disclosure. All such combinations and/or associations are within the scope of the present disclosure.

While the disclosure has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents. Accordingly, the scope of the present disclosure should not be limited to the above-described embodiments, but should be defined not only by the appended claims, but also by equivalents thereof.

Claims

1. A method for calling, comprising:

receiving a user input indicating that the user desires to end a call;

in response to the user input, determining whether the call counterpart is to end the call according to an audio signal sent by the call counterpart and a communication link state;

ending the call in a case where the audio signal sent by the call counterpart and the communication link state indicate that the call counterpart is to end the call; and

outputting prompt information in a case where the audio signal sent by the call counterpart and the communication link state indicate that the call counterpart is to keep the call;

wherein determining whether the call counterpart is to end the call according to the audio signal sent by the call counterpart and the communication link state comprises:

during the call, acquiring a first voiceprint feature of an audio signal sent by the call counterpart;

in response to receiving an audio signal containing a call ending word sent by the call counterpart, acquiring a second voiceprint feature of the audio signal containing the call ending word;

and if the first voiceprint feature matches the second voiceprint feature, or if the second voiceprint feature matches a pre-stored third voiceprint feature of the call counterpart, determining that the call counterpart is to end the call.
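The voiceprint check in claim 1 can be illustrated with a minimal sketch. The claim does not say how features are represented or compared, so the fixed-length feature vectors, the cosine similarity measure, the 0.8 threshold, and the function names below are all assumptions made for illustration only.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def end_word_from_counterpart(
    first: Sequence[float],                           # feature acquired during the call
    second: Sequence[float],                          # feature of the audio containing the end word
    stored_third: Optional[Sequence[float]] = None,   # pre-stored feature, if any
    threshold: float = 0.8,                           # hypothetical match threshold
) -> bool:
    """Return True if the end word is attributed to the call counterpart."""
    # Match against the voiceprint observed earlier in the same call.
    if cosine_similarity(first, second) >= threshold:
        return True
    # Otherwise fall back to a pre-stored voiceprint of the counterpart.
    if stored_third is not None:
        return cosine_similarity(stored_third, second) >= threshold
    return False
```

Comparing the second feature against either reference reflects the "or" in the claim: a match with the in-call voiceprint or with the pre-stored one is sufficient on its own.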

2. The method of claim 1, wherein the user input comprises any one of:

an operation corresponding to ending the call, an operation of turning off the screen of a mobile phone, or an operation of moving a proximity sensor away from an object during the call.

3. The method of claim 1, wherein:

if the communication link state is disconnected, or the volume value of the audio signal sent by the call counterpart is continuously smaller than a first set volume threshold, determining that the call counterpart is to end the call;

if the communication link state is connected and the volume value of the audio signal sent by the call counterpart is greater than the first set volume threshold, determining that the call counterpart is to keep the call;

and if the communication link state is connected, the volume value of the audio signal sent by the call counterpart is greater than a second set volume threshold, and the duration exceeds a first set duration threshold, determining that the call counterpart is to keep the call.
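The volume-based conditions of claim 3 can be sketched as a small decision function. The claim does not define how "continuously" is measured, how the thresholds relate to each other, or what happens when no condition applies, so the per-frame volume list, the use of the latest frame for the first-threshold test, the trailing-run duration test, and the `UNKNOWN` outcome are all assumptions of this sketch.

```python
from enum import Enum
from typing import Sequence


class Intent(Enum):
    END = "end"
    KEEP = "keep"
    UNKNOWN = "unknown"  # hypothetical fallback, not part of the claim


def classify_by_volume(
    link_connected: bool,
    volumes: Sequence[float],  # per-frame volume values, oldest first (assumed)
    frame_seconds: float,      # duration of one frame (assumed)
    first_threshold: float,
    second_threshold: float,
    min_duration: float,       # the "first set duration threshold"
) -> Intent:
    # A disconnected link means the counterpart is to end the call.
    if not link_connected:
        return Intent.END
    if not volumes:
        return Intent.UNKNOWN
    # Volume continuously below the first threshold: end.
    if all(v < first_threshold for v in volumes):
        return Intent.END
    # Latest frame above the first threshold: keep.
    if volumes[-1] > first_threshold:
        return Intent.KEEP
    # Volume above the second threshold for longer than min_duration: keep.
    run = 0
    for v in reversed(volumes):
        if v > second_threshold:
            run += 1
        else:
            break
    if run * frame_seconds > min_duration:
        return Intent.KEEP
    return Intent.UNKNOWN
```

The sketch treats the second threshold as a lower bar that must be sustained over time, which is one plausible reading; the claim itself leaves the relative order of the two thresholds open.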

4. The method of claim 1, wherein:

if the communication link state is disconnected, determining that the call counterpart is to end the call;

if the communication link state is connected, determining that the call counterpart is to end the call in a case where the volume value of the audio signal sent by the call counterpart is continuously smaller than a first set volume threshold;

if the communication link state is connected, determining that the call counterpart is to end the call in a case where the audio signal sent by the call counterpart contains one or more call ending words, or contains one or more call ending words and the volume value of the audio signal corresponding to at least one call ending word is greater than a third set volume threshold;

if the communication link state is connected, determining that the call counterpart is to keep the call in a case where the volume value of the audio signal sent by the call counterpart is greater than the first set volume threshold and the audio signal does not include a call ending word;

if the communication link state is connected, determining that the call counterpart is to keep the call in a case where the volume value of the audio signal sent by the call counterpart is greater than a second set volume threshold, the duration exceeds a first set duration threshold, and the audio signal does not include a call ending word;

and if the audio signal sent by the call counterpart contains one or more call holding words, or contains one or more call holding words and the volume value of the audio signal corresponding to at least one call holding word is greater than a fourth set volume threshold, determining that the call counterpart is to keep the call.
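The keyword conditions of claim 4 can be sketched as follows. The sketch assumes that some speech recognizer (not shown) has already produced `(word, volume)` pairs; the word lists, the default thresholds, and the choice to let call holding words take precedence over call ending words are hypothetical and are not stated in the claim.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical example vocabularies; a real system would configure these.
END_WORDS = frozenset({"bye", "goodbye"})
HOLD_WORDS = frozenset({"wait", "hold"})


def classify_by_keywords(
    link_connected: bool,
    words: Iterable[Tuple[str, float]],  # (recognized word, volume) pairs (assumed input)
    third_threshold: float = 0.0,        # minimum volume for a call ending word
    fourth_threshold: float = 0.0,       # minimum volume for a call holding word
) -> Optional[str]:
    """Return "end", "keep", or None when the keywords decide nothing."""
    if not link_connected:
        return "end"
    normalized = [(w.lower(), v) for w, v in words]
    # Assumed precedence: a sufficiently loud holding word overrides ending words.
    if any(w in HOLD_WORDS and v > fourth_threshold for w, v in normalized):
        return "keep"
    if any(w in END_WORDS and v > third_threshold for w, v in normalized):
        return "end"
    return None
```

A `None` result would leave the decision to the volume-based conditions of the same claim. With the default thresholds of 0.0, any word with a positive volume counts, which approximates the variant of the claim that has no volume requirement.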

5. The method of claim 1, further comprising:

during the call, if an identity attribute of the call counterpart corresponds to a category of counterparts with whom the user does not want to keep a call, ending the call in response to the user input.

6. The method of claim 1, further comprising:

during the call, if signals acquired by one or more sensors indicate that the user is in a situation where it is inconvenient to continue the call, ending the call in response to the user input.

7. The method of claim 1, wherein outputting the prompt information comprises any one or more of:

prompting by flashing a graphic corresponding to ending the call in a graphical user interface;

blurring the graphic corresponding to ending the call in the graphical user interface, and displaying a waveform diagram of the currently received audio signal at the position of the graphic corresponding to ending the call;

and vibrating in response to a call holding word being included in the currently received audio signal.

8. An apparatus for a call, comprising:

one or more processors;

a readable storage medium for storing one or more computer programs which, when executed by the one or more processors, implement the method according to any one of claims 1-7.