CN116962360A - Call method, device, network equipment and terminal

Info

Publication number
CN116962360A
Authority
CN
China
Prior art keywords
information
terminal
target
voice
network device
Prior art date
Legal status
Pending
Application number
CN202211082451.5A
Other languages
Chinese (zh)
Inventor
李凯
张昕
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Communications Ltd Research Institute filed Critical China Mobile Communications Group Co Ltd
Priority to CN202211082451.5A priority Critical patent/CN116962360A/en
Publication of CN116962360A publication Critical patent/CN116962360A/en
Pending legal-status Critical Current

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1083 In-session procedures
    • H04L65/1089 In-session procedures by adding media; by removing media
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H04L65/1083 In-session procedures

Abstract

The invention provides a call method, a call device, a network device and a terminal, and relates to the technical field of data services. The method comprises the following steps: under the condition that a first terminal and a second terminal have established an audio call connection, acquiring first voice information sent by the second terminal; sending the first voice information to a second network device; receiving target information fed back by the second network device according to the first voice information; determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information; and sending the target video information to the first terminal. The scheme of the invention solves the problems that the existing barrier-free call scheme is cumbersome to operate and gives an unsmooth user experience.

Description

Call method, device, network equipment and terminal
Technical Field
The present invention relates to the field of data service technologies, and in particular, to a call method, a call device, a network device, and a terminal.
Background
In some call scenarios there may be a need for barrier-free communication. For example, in a cross-language call a language translation function may be required to help a user understand the language spoken by the user at the other end of the call, and when a hearing-impaired user makes a call, a function that converts speech into text may be required to improve the user's perception of the speech, thereby achieving barrier-free communication.
However, in the implementation of the related art, the user needs to switch from the call interface to a display interface in another application, and has to operate again to return to the original call interface whenever the call interface is needed. In some special scenarios (such as a customer-service voice navigation scenario), the user may even have to switch many times between the call interface and the display interface in the other application, so the user experience is not smooth and the operation process is cumbersome. For some elderly user groups in particular, who are unfamiliar with switching between different applications on a mobile phone, the shortcomings of the existing barrier-free call scheme are even more pronounced.
Disclosure of Invention
The invention aims to provide a call method, a call device, a network device and a terminal, which solve the problems that the existing barrier-free call scheme is cumbersome to operate and gives an unsmooth user experience.
To achieve the above object, an embodiment of the present invention provides a call method, applied to a first network device, including:
under the condition that the first terminal and the second terminal establish audio call connection, acquiring first voice information sent by the second terminal;
Transmitting the first voice information to a second network device;
receiving target information fed back by the second network equipment according to the first voice information;
determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information;
and sending the target video information to the first terminal.
Optionally, before the acquiring the first voice information sent by the second terminal, the method further includes at least one of the following:
transmitting first media renegotiation information to the first terminal, wherein the first media renegotiation information is used for anchoring an audio call endpoint of the first terminal;
and sending second media renegotiation information to the second terminal, wherein the second media renegotiation information is used for anchoring an audio call endpoint of the second terminal.
Optionally, before the acquiring the first voice information sent by the second terminal, the method further includes:
transmitting first indication information to second network equipment, wherein the first indication information is used for indicating related information of barrier-free call service;
wherein the first indication information includes at least one of:
A control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Optionally, in the case that the target information is the first text information, determining, according to the target information, target video information includes:
and generating the target video information according to the first text information and the first voice information.
Optionally, in the case that the target information is the first video information, determining the target video information according to the target information includes:
the first video information is determined as the target video information.
Optionally, the sending the first voice information to the second network device includes:
transmitting first request information to the first terminal, wherein the first request information is used for requesting to convert an audio call between the first terminal and the first network device into a video call;
receiving first information fed back by the first terminal according to the first request information;
and transmitting the first voice information to a second network device in the case that the first information indicates that the user agrees to convert the audio call between the first terminal and the first network device into the video call.
To achieve the above object, an embodiment of the present invention provides a call method, which is applied to a second network device, including:
receiving first voice information sent by first network equipment;
determining target information according to the first voice information;
and sending the target information to the first network equipment.
Optionally, before receiving the first voice information sent by the first network device, the method further includes:
receiving first indication information sent by first network equipment, wherein the first indication information is used for indicating related information of barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Optionally, the determining the target information according to the first voice information includes:
and generating first text information corresponding to the first voice information according to the first voice information and the first indication information, and taking the first text information as the target information.
Optionally, the determining the target information according to the first voice information includes:
Generating first text information corresponding to the first voice information according to the first voice information and the first indication information;
and generating first video information according to the first text information and the first voice information, and taking the first video information as the target information.
In order to achieve the above object, an embodiment of the present invention provides a call method, which is applied to a first terminal, including:
receiving target video information sent by a first server;
displaying the target video information on a current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
To achieve the above object, an embodiment of the present invention provides a communication apparatus, applied to a first network device, including:
the first acquisition module is used for acquiring first voice information sent by the second terminal under the condition that the first terminal and the second terminal establish audio call connection;
the information sending module is used for sending the first voice information to the second network equipment;
the information receiving module is used for receiving target information fed back by the second network equipment according to the first voice information;
The first processing module is used for determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information;
and the first sending module is used for sending the target video information to the first terminal.
Optionally, the apparatus further comprises:
the first media renegotiation module is used for sending first media renegotiation information to the first terminal, and the first media renegotiation information is used for anchoring an audio call endpoint of the first terminal;
and the second media renegotiation module is used for sending second media renegotiation information to the second terminal, and the second media renegotiation information is used for anchoring an audio call endpoint of the second terminal.
Optionally, the apparatus further comprises:
the first sending submodule is used for sending first indication information to the second network equipment, wherein the first indication information is used for indicating relevant information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
A translation language indication.
Optionally, in the case that the target information is the first text information, the first processing module includes:
and the first processing unit is used for generating the target video information according to the first text information and the first voice information.
Optionally, in the case that the target information is the first video information, the first processing module includes:
and the second processing unit is used for determining the first video information as the target video information.
Optionally, the information sending module includes:
a first sending unit, configured to send first request information to the first terminal, where the first request information is used to request conversion of an audio call between the first terminal and the first network device into a video call;
the first receiving unit is used for receiving first information fed back by the first terminal according to the first request information;
and a second sending unit, configured to send the first voice information to a second network device if the first information indicates that the user agrees to convert the audio call between the first terminal and the first network device into a video call.
To achieve the above object, an embodiment of the present invention provides a communication apparatus applied to a second network device, including:
The first receiving module is used for receiving first voice information sent by the first network equipment;
the second processing module is used for determining target information according to the first voice information;
and the second sending module is used for sending the target information to the first network equipment.
Optionally, the apparatus further comprises:
the third receiving module is used for receiving first indication information sent by the first network equipment and used for indicating related information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Optionally, the second processing module includes:
and the third processing sub-module is used for generating first text information corresponding to the first voice information according to the first voice information and the first indication information, and taking the first text information as the target information.
Optionally, the second processing module includes:
the fourth processing sub-module is used for generating first text information corresponding to the first voice information according to the first voice information and the first indication information;
And the fifth processing sub-module is used for generating first video information according to the first text information and the first voice information, and taking the first video information as the target information.
To achieve the above object, an embodiment of the present invention provides a call device, which is applied to a first terminal, including:
the second receiving module is used for receiving the target video information sent by the first server;
the first display module is used for displaying the target video information on the current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
To achieve the above object, an embodiment of the present invention provides a network device, which is a first network device, including a processor and a transceiver, where the processor is configured to:
under the condition that the first terminal and the second terminal establish audio call connection, acquiring first voice information sent by the second terminal;
transmitting the first voice information to a second network device;
receiving target information fed back by the second network equipment according to the first voice information;
Determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information;
and sending the target video information to the first terminal.
Optionally, the processor is further configured to:
transmitting first media renegotiation information to the first terminal, wherein the first media renegotiation information is used for anchoring an audio call endpoint of the first terminal;
and sending second media renegotiation information to the second terminal, wherein the second media renegotiation information is used for anchoring an audio call endpoint of the second terminal.
Optionally, the processor is further configured to:
transmitting first indication information to second network equipment, wherein the first indication information is used for indicating related information of barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Optionally, the processor is specifically configured to, when determining the target video information according to the target information:
and generating the target video information according to the first text information and the first voice information.
Optionally, the processor is specifically configured to, when determining the target video information according to the target information:
the first video information is determined as the target video information.
Optionally, the processor is specifically configured to, when sending the first voice information to the second network device:
transmitting first request information to the first terminal, wherein the first request information is used for requesting to convert an audio call between the first terminal and the first network device into a video call;
receiving first information fed back by the first terminal according to the first request information;
and transmitting the first voice information to a second network device in the case that the first information indicates that the user agrees to convert the audio call between the first terminal and the first network device into the video call.
To achieve the above object, an embodiment of the present invention provides a network device, which is a second network device, including a processor and a transceiver, where the processor is configured to:
receiving first voice information sent by first network equipment;
determining target information according to the first voice information;
and sending the target information to the first network equipment.
Optionally, the processor is further configured to:
receiving first indication information sent by first network equipment, wherein the first indication information is used for indicating related information of barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Optionally, the processor is specifically configured to, when determining the target information according to the first voice information:
and generating first text information corresponding to the first voice information according to the first voice information and the first indication information, and taking the first text information as the target information.
Optionally, the processor is specifically configured to, when determining the target information according to the first voice information:
generating first text information corresponding to the first voice information according to the first voice information and the first indication information;
and generating first video information according to the first text information and the first voice information, and taking the first video information as the target information.
To achieve the above object, an embodiment of the present invention provides a terminal, which is a first terminal, including: a transceiver and a processor; the processor is configured to:
Receiving target video information sent by a first server;
displaying the target video information on a current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
To achieve the above object, an embodiment of the present invention provides a network device including a transceiver, a processor, a memory, and a program or instructions stored on the memory and executable on the processor; the processor, when executing the program or instructions, implements the call method as applied to the first network device as described above, or implements the call method as applied to the second network device as described above.
To achieve the above object, an embodiment of the present invention provides a terminal including a transceiver, a processor, a memory, and a program or instructions stored on the memory and executable on the processor; the processor, when executing the program or instructions, implements the call method as applied to the first terminal as described above.
To achieve the above object, an embodiment of the present invention provides a readable storage medium having stored thereon a program or instructions which, when executed by a processor, implement steps in a call method as applied to a first network device, or steps in a call method as applied to a second network device, or steps in a call method as applied to a first terminal.
The technical scheme of the invention has the following beneficial effects:
According to the method, under the condition that the first terminal and the second terminal have established an audio call connection, the first voice information sent by the second terminal can be obtained and sent to the second network device; target video information is then determined according to the target information fed back by the second network device and sent to the first terminal, wherein the target video information comprises the first text information corresponding to the first voice information. After receiving the target video information, the first terminal can display the first text information in the form of subtitles while displaying the target video information, so that functions such as voice transcription, translation and subtitle display are provided in the native call interface of the mobile phone while the user is making a voice call. The implementation does not depend on any external application, and barrier-free communication is achieved without switching interfaces, which is convenient and quick and improves the user experience.
Drawings
FIG. 1 is a flow chart of a communication method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a communication method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of information interaction of a communication method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating information interaction of a communication method according to another embodiment of the present invention;
FIG. 5 is a flow chart of a communication method according to another embodiment of the present invention;
FIG. 6 is a flow chart of a communication method according to yet another embodiment of the present invention;
fig. 7 is a block diagram of a communication device according to an embodiment of the present invention;
fig. 8 is a block diagram of a communication device according to another embodiment of the present invention;
fig. 9 is a block diagram of a communication device according to still another embodiment of the present invention;
fig. 10 is a block diagram of a network device according to an embodiment of the present invention;
fig. 11 is a block diagram of a terminal according to an embodiment of the present invention;
fig. 12 is a block diagram of a network device according to another embodiment of the present invention;
fig. 13 is a block diagram of a terminal according to another embodiment of the present invention.
Detailed Description
In order to make the technical problems to be solved, the technical solutions and the advantages of the present invention more apparent, a detailed description is given below with reference to the accompanying drawings and specific embodiments.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
In various embodiments of the present application, it should be understood that the sequence numbers of the following processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic, and should not constitute any limitation on the implementation process of the embodiments of the present application.
In addition, the terms "system" and "network" are often used interchangeably herein.
In the embodiments provided herein, it should be understood that "B corresponding to A" means that B is associated with A, and B may be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information.
As shown in fig. 1, a call method in an embodiment of the present application is applied to a first network device, and includes:
step 101, under the condition that the first terminal and the second terminal establish audio call connection, acquiring first voice information sent by the second terminal.
Here, the first terminal may be either the calling party or the called party.
Step 102, the first voice information is sent to a second network device;
it should be noted that, the second network device and the first network device may be two different network devices, or may be two different modules disposed on the same network device.
Step 103, receiving target information fed back by the second network device according to the first voice information;
the target information may be first text information obtained by the second network device according to the first voice information processing (for example, converting or translating), or may be first video information obtained by the second network device according to the first voice information processing (for example, converting or translating and then synthesizing).
And 104, determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information.
Here, the first text information corresponding to the first voice information may be text converted from the first voice information, or text translated from the first voice information; for example, if the first voice information is in English, the first text information may be Chinese text translated from that English speech.
When the target video information is played by a terminal (for example, the first terminal), the terminal can display the first text information in the form of subtitles. In this way, the user receiving the first voice information can follow, through the subtitles, what the other party says during the call.
And step 105, the target video information is sent to the first terminal.
The existing VoLTE/VoNR system has no text transmission channel, and the terminal also does not support displaying text information pushed by a network side on a call interface.
In this embodiment, under the condition that the first terminal and the second terminal establish audio call connection, the first voice information sent by the second terminal can be acquired, the first voice information is sent to the second network device, then the target video information is determined according to the target information fed back by the second network device, and the target video information is sent to the first terminal, wherein the target video information includes first text information corresponding to the first voice information, so that the first terminal can display the first text information in the form of subtitles when displaying the target video information after receiving the target video information, thereby realizing the functions of voice transcription, translation, subtitle display and the like in a native call interface of a mobile phone in the process of dialing a voice call by a user.
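For ease of understanding, the flow of steps 101 to 105 on the first network device side is sketched below in Python. The class names, the synthesize_subtitle_video() helper and the duck-typed first_terminal / second_network_device objects are illustrative assumptions introduced only for this sketch; they are not defined by the embodiment.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class TextInfo:      # first text information (transcript or translation)
    text: str

@dataclass
class VideoInfo:     # target video information: subtitle frames plus the original voice
    caption_frames: list
    audio: bytes

def synthesize_subtitle_video(text_info: TextInfo, voice: bytes) -> VideoInfo:
    # Placeholder for rendering the text as caption frames and muxing it with the voice.
    return VideoInfo(caption_frames=[f"[caption] {text_info.text}"], audio=voice)

def handle_voice_chunk(first_voice_info: bytes, second_network_device, first_terminal) -> None:
    """Steps 102-105 for one chunk of the second terminal's voice (step 101 has
    already delivered first_voice_info once the audio call is established)."""
    second_network_device.send(first_voice_info)                                # step 102
    target_info: Union[TextInfo, VideoInfo] = second_network_device.receive()   # step 103
    if isinstance(target_info, TextInfo):                                       # step 104
        target_video_info = synthesize_subtitle_video(target_info, first_voice_info)
    else:                                                                       # already first video information
        target_video_info = target_info
    first_terminal.send_video(target_video_info)                                # step 105
```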
It should be noted that the call method according to the embodiment of the present invention may be implemented on the premise that the terminal has opened the barrier-free call service. For example, in step 101, the first voice information sent by the second terminal may be acquired in the case that an audio call connection is established between the first terminal and the second terminal and the first terminal has opened the barrier-free call service. That is, if the first network device finds that the first terminal has not opened the barrier-free call service, the subsequent steps in the call method according to the embodiment of the present invention are not executed.
Specifically, the user of the first terminal may open the barrier-free call service through an application procedure, for example, the first terminal sends a service application request to the first network device, after receiving the service application request sent by the first terminal, the first network device completes user subscription, stores relevant user subscription data, and may open the barrier-free call service for the first terminal. Thus, after the first terminal and the second terminal establish the audio call connection, the first network device may execute the steps in the call method according to the user subscription data corresponding to the first terminal, so as to provide the barrier-free call service for the first terminal.
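The subscription procedure described above can be pictured with the following minimal sketch; the in-memory set and the function names are assumptions used only to illustrate how the check on user subscription data in step 101 might be performed.

```python
# User subscription data kept by the first network device (illustrative in-memory form).
subscriptions = set()

def handle_service_application(terminal_identity: str) -> None:
    # The first terminal sends a service application request; the first network
    # device completes the user subscription and stores the subscription data.
    subscriptions.add(terminal_identity)

def has_barrier_free_subscription(terminal_identity: str) -> bool:
    # Checked by the first network device before running steps 101-105.
    return terminal_identity in subscriptions
```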
Optionally, before the acquiring the first voice information sent by the second terminal, the method further includes at least one of the following:
transmitting first media renegotiation information to the first terminal, wherein the first media renegotiation information is used for anchoring an audio call endpoint of the first terminal;
and sending second media renegotiation information to the second terminal, wherein the second media renegotiation information is used for anchoring an audio call endpoint of the second terminal.
As shown in fig. 2, during establishment of a voice call between a first terminal (e.g., UE A) and a second terminal (e.g., UE B), UE A and UE B are each connected to the first network device through an SBC (Session Border Controller)-U, an SBC-C and a CSCF (Call Session Control Function). In the embodiment of the invention, the audio call endpoints of the first terminal and the second terminal can be anchored by sending the first media renegotiation information to the first terminal and the second media renegotiation information to the second terminal; here, anchoring means that the audio call endpoints of the first terminal and the second terminal are anchored on the first network device. Taking the case in which the first network device is a network AS (Application Server) as an example, after the audio call endpoints of the first terminal and the second terminal are anchored on the first network device, a connection is established between the SBC-U and the network AS (for example, a media module in the network AS) for transmitting the voice information of the call between the terminals as well as the text information, video information and the like obtained from that voice information, so that the network AS can obtain the voice information sent by UE B and carry out processing such as conversion or translation of the voice information.
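In practice, such a media renegotiation is a SIP re-INVITE carrying a new SDP offer that points the terminal's media at the media module of the network AS. The sketch below builds an SDP body of that kind; the IP address, port and codec are example values assumed for illustration and are not fixed by the embodiment.

```python
def build_anchor_sdp(as_media_ip: str, as_media_port: int) -> str:
    """SDP offer redirecting a terminal's RTP audio to the network AS media module."""
    return "\r\n".join([
        "v=0",
        f"o=network-as 0 1 IN IP4 {as_media_ip}",
        "s=barrier-free call",
        f"c=IN IP4 {as_media_ip}",
        "t=0 0",
        f"m=audio {as_media_port} RTP/AVP 96",   # audio now terminates on the AS
        "a=rtpmap:96 AMR-WB/16000",
        "a=sendrecv",
    ]) + "\r\n"

# The first media renegotiation information (towards the first terminal) and the
# second media renegotiation information (towards the second terminal) would each
# carry an SDP of this kind, so that both audio call endpoints end up anchored on the AS.
first_media_renegotiation_sdp = build_anchor_sdp("198.51.100.10", 49170)  # example values
```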
Optionally, before the acquiring the first voice information sent by the second terminal, the method further includes:
transmitting first indication information to second network equipment, wherein the first indication information is used for indicating related information of barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Here, the control event may be used to instruct the second network device to start or end the barrier-free call service; the media parameters for transmitting audio and/or video may be used to negotiate the media parameters with which audio and/or video media streams are transmitted (i.e., received or sent) between the first network device and the second network device, for example information such as port addresses and frame rates. The first indication information may further include terminal identity information, a translation language indication and the like, where the terminal identity information may be a mobile phone number or identity information of the calling (or called) terminal, and the translation language indication is used for indicating what language translation is to be performed on the first voice information, e.g. Chinese to English.
In this embodiment, by sending the first indication information to the second network device, information related to the unobstructed call service may be indicated to the second network device, so that the second network device may process (e.g., convert to the first text information or translate to the first text information, etc.) the received first voice information according to the specific content of the first indication information.
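A possible shape of the first indication information is sketched below. The embodiment only lists the information elements; the field names, the dataclass layout and the example values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MediaParameters:
    port: int            # port address for the audio/video media stream
    frame_rate: int      # frame rate negotiated for the video stream

@dataclass
class FirstIndicationInfo:
    control_event: str                           # "start" or "end" of the barrier-free call service
    media_parameters: Optional[MediaParameters] = None
    terminal_identity: Optional[str] = None      # e.g. the calling/called number
    translation_language: Optional[str] = None   # e.g. "en-US->zh-CN"

# Example instance sent from the first network device to the second network device.
indication = FirstIndicationInfo(
    control_event="start",
    media_parameters=MediaParameters(port=49172, frame_rate=25),
    terminal_identity="+86 138 xxxx xxxx",       # placeholder, not a real number
    translation_language="en-US->zh-CN",
)
```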
Optionally, determining the target video information according to the target information includes the following optional embodiments:
In a first embodiment, in the case that the target information is the first text information, the target video information is generated according to the first text information and the first voice information.
In a second embodiment, in the case that the target information is the first video information, the first video information is determined as the target video information.
That is, the process of determining the first text information according to the first voice information may be performed by the second network device, and the process of synthesizing the video information according to the first text information and the first voice information may be performed at the first network device or may be performed by the second network device.
Optionally, the sending the first voice information to the second network device includes:
Transmitting first request information to the first terminal, wherein the first request information is used for requesting to convert an audio call between the first terminal and the first network device into a video call;
receiving first information fed back by the first terminal according to the first request information;
and transmitting the first voice information to a second network device in the case that the first information indicates that the user agrees to convert the audio call between the first terminal and the first network device into the video call.
In this embodiment, if the user agrees, that is, if the first information indicates agreement to convert the audio call between the first terminal and the first network device into a video call, the subsequent flow of the embodiment of the present invention continues to be executed; if the user does not agree, the subsequent flow is not executed. In this way, after the user agrees to convert the audio call between the first terminal and the first network device into a video call, the first network device may upgrade the subscriber's leg (for example, that of the first terminal which has opened the barrier-free call service) from an audio call to a video call, so that the subtitles corresponding to the first voice information can be presented by means of the video stream.
For example, if the first terminal has opened the barrier-free call service, the user of the first terminal agrees to switch the audio call between the first terminal and the first network device to a video call, and the second terminal has not opened the barrier-free call service, the following technical effect can be achieved by this embodiment of the application: the call between the first terminal and the first network device is switched from an audio call to a video call, while an audio call is still maintained between the first network device and the second terminal.
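The consent handling described above can be sketched as follows; the message dictionaries and the upgrade_leg_to_video() helper are placeholders chosen for illustration rather than part of the embodiment.

```python
def upgrade_leg_to_video(terminal) -> None:
    # Placeholder for the media renegotiation that turns the subscriber's audio leg into a video leg.
    terminal.send({"type": "re-invite", "media": "audio+video"})

def maybe_forward_voice(first_terminal, second_network_device, first_voice_info: bytes) -> bool:
    """Send first request information, wait for first information, and only then
    upgrade the subscriber's leg and forward the voice to the second network device."""
    first_terminal.send({"type": "upgrade-audio-to-video"})   # first request information
    first_info = first_terminal.receive()                     # first information (the user's answer)
    if first_info.get("user_agrees"):
        upgrade_leg_to_video(first_terminal)                  # only this leg becomes a video call
        second_network_device.send(first_voice_info)          # the flow of steps 102-105 proceeds
        return True
    return False                                              # the call stays a plain audio call
```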
In the following, the scheme provided by the embodiment of the present application is illustrated by taking the first terminal as terminal A, the second terminal as terminal B, the first network device as the network AS, and the second network device as the barrier-free service server as an example.
Embodiment one:
as shown in fig. 3, the method specifically includes the following steps:
Step 1, terminal A and terminal B establish an audio call connection;
Step 2, the network AS determines that terminal A has opened the barrier-free call service, initiates media renegotiation to terminal A, anchors the audio call endpoint of terminal A on the network AS, and requests terminal A to upgrade the audio call to a video call.
Here, it should be noted that the network AS may send first request information to the first terminal, requesting to convert the audio call between the first terminal and the first network device into the video call. After receiving the first request information, the first terminal may display an operation control (for example, an agreeing or rejecting button) on the current call interface, so that the first terminal may receive an operation instruction sent by the user through the operation control, and send the first information to the first server according to the operation instruction, so as to indicate whether the user agrees to convert the audio call between the first terminal and the first network device into the video call.
If the user agrees, continuing to execute the subsequent flow of the embodiment of the invention; if the user does not agree, the subsequent flow is not executed.
Step 3, the network AS initiates media renegotiation to terminal B, and anchors the audio call endpoint of terminal B on the network AS;
Step 4, the network AS initiates barrier-free call control to the barrier-free service server.
For example, the network AS sends first indication information to the barrier-free service server, which is used for indicating related information of the barrier-free call service, so that the barrier-free service server can process the received first voice information according to the specific content of the first indication information.
Step 5, terminal B sends its local audio stream (i.e., the first voice information) to the network AS;
Step 6, the network AS forwards the first voice information to the barrier-free service server;
Step 7, the barrier-free service server completes speech-to-text conversion and/or speech translation according to the first voice information, obtaining the first text information corresponding to the first voice information;
Step 8, the barrier-free service server synthesizes the first text information and the first voice information into a subtitle video stream (i.e., the first video information);
Step 9, the barrier-free service server sends the first video information to the network AS;
Step 10, the network AS may send the first video information as the target video information to terminal A.
It should be noted that, when the network AS sends the target video information to terminal A, it may be sent in the form of an audio stream and a video stream, where the audio stream carries the first voice information and the video stream carries pictures with the first text information as subtitles.
Step 11, terminal A displays the target video information.
After receiving the target video information, terminal A may display the first text information in the form of subtitles when displaying the target video information, thereby implementing the barrier-free call function.
In this embodiment, the barrier-free service server completes the speech-to-text conversion and/or speech translation to obtain the first text information corresponding to the first voice information, and also completes the synthesis of the first text information and the first voice information.
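Step 8, i.e. synthesising the first text information and the first voice information into a subtitle video stream, can be illustrated with a small rendering sketch. It uses the Pillow imaging library only to show caption frames being drawn; the frame size, the assumed 25 fps frame rate and the omission of encoding and muxing with the audio are simplifications, not details given by the embodiment.

```python
from PIL import Image, ImageDraw  # Pillow; used here only to draw caption frames

def render_caption_frames(first_text_info: str, n_frames: int = 25,
                          size: tuple = (640, 360)) -> list:
    """Draw the first text information as a subtitle line on plain video frames."""
    frames = []
    for _ in range(n_frames):
        frame = Image.new("RGB", size, "black")
        ImageDraw.Draw(frame).text((20, size[1] - 40), first_text_info, fill="white")
        frames.append(frame)
    return frames

# One second of caption frames (at an assumed 25 fps) for a transcribed sentence;
# a real media module would encode these frames and send them alongside the voice stream.
caption_frames = render_caption_frames("Hello, how can I help you today?")
```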
Embodiment two:
As shown in fig. 4, the difference between Embodiment two and Embodiment one is that the above-mentioned step 8 may be performed by the network AS.
Specifically, in step 7 the barrier-free service server completes the speech-to-text conversion and/or speech translation according to the first voice information and, after obtaining the first text information corresponding to the first voice information, sends the first text information to the network AS; the network AS then synthesizes the first text information and the first voice information into a subtitle video stream (i.e., the first video information). Referring to fig. 4, the other steps are similar to those of Embodiment one and are not described again.
In this embodiment, the barrier-free service server completes the speech-to-text conversion and/or speech translation to obtain the first text information corresponding to the first voice information, while the network AS completes the synthesis of the first text information and the first voice information (as shown in fig. 2, this may specifically be done by the media module of the network AS).
According to the call method described above, after the first terminal and the second terminal establish an audio call connection, the audio call endpoints of the first terminal and the second terminal are anchored on the first network device, and target video information comprising the first text information corresponding to the first voice information is determined based on the first voice information and sent to the first terminal. After receiving the target video information, the first terminal can therefore display the first text information in the form of subtitles while displaying the target video information, so that functions such as voice transcription, translation and subtitle display are provided in the native call interface of the mobile phone while the user is making a voice call. The implementation does not require the terminal to be modified, and barrier-free communication is achieved without switching interfaces, which is convenient and quick and improves the user experience.
As shown in fig. 5, a call method in an embodiment of the present invention is applied to a second network device, and includes:
step 501, receiving first voice information sent by a first network device;
step 502, determining target information according to the first voice information.
The target information may be first text information obtained by the second network device by processing the first voice information (for example, converting or translating it), or may be first video information obtained by the second network device by processing the first voice information (for example, converting or translating it and then synthesising a video).
And step 503, sending the target information to the first network equipment.
According to this embodiment, after the first voice information sent by the first network device is received, the target information is determined according to the first voice information. The target information is the first text information or the first video information corresponding to the first voice information, that is, conversion of the first voice information has been completed, so that the first terminal can display the first text information corresponding to the first voice information (for example, in the form of subtitles), thereby realizing functions such as voice transcription, translation and subtitle display in the native call interface of the mobile phone.
Optionally, before receiving the first voice information sent by the first network device, the method further includes: receiving first indication information sent by first network equipment, wherein the first indication information is used for indicating related information of barrier-free call service; wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Here, the control event may be used to instruct the second network device to start or end the barrier-free call service. For example, if the control event in the first indication information indicates that the barrier-free call service starts, the second network device starts to perform speech-to-text processing on the first voice information so as to generate the first text information corresponding to the first voice information and, finally, the target information; if the control event indicates that the barrier-free call service has ended, the second network device no longer processes the first voice information.
The media parameters for transmitting audio and/or video may be used to negotiate the media parameters with which audio and/or video media streams are transmitted (i.e., received or sent) between the first network device and the second network device. The media parameters may, for example, be information such as port addresses and frame rates, so that the second network device can learn from the media parameters in the first indication information at which port addresses the audio and/or video media streams are to be transmitted and at which frame rate they are exchanged with the first network device; for example, the second network device sends the target information from the target port to the first network device at the target frame rate (both indicated by the media parameters).
The first indication information may further include terminal identity information, a translation language indication, and the like.
The terminal identity information may be a mobile phone number, or identity information of the calling (or called) terminal. For example, if the terminal identity information in the first indication information is a mobile phone number, the second network device generates the first text information corresponding to the first voice information according to the first indication information so as to finally generate the target information, and the mobile phone number may be carried in the target information, so that the first network device knows from the target information for which terminal the target information was generated.
The translation language indication is used for indicating what language translation is to be performed on the first voice information. For example, if the translation language indication indicates Chinese-to-English translation, the second network device translates the first voice information into English first text information according to the translation language indication so as to finally generate the target information.
In this embodiment, the information about the barrier-free call service may be obtained by receiving the first indication information sent by the first network device, so that the second network device may process (e.g., convert to the first text information or translate to the first text information) the received first voice information according to the specific content of the first indication information.
Optionally, when determining the target information according to the first voice information, the second network device may specifically include the following ways:
mode one: and generating first text information corresponding to the first voice information according to the first voice information and the first indication information, and taking the first text information as the target information.
Mode two: generating first text information corresponding to the first voice information according to the first voice information and the first indication information; and generating first video information according to the first text information and the first voice information, and taking the first video information as the target information.
According to the communication method, after the first voice information sent by the first network device is received, the target information is determined according to the first voice information, wherein the target information is the first text information or the first video information corresponding to the first voice information, namely, the conversion of the first voice information is completed, so that the first terminal can display the first text information corresponding to the first voice information (for example, display the first text information in the form of subtitles), and therefore functions of voice transcription, translation, subtitle display and the like are achieved in a native communication interface of a mobile phone.
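The two modes just described can be summarised in one short dispatch sketch; transcribe_or_translate() stands in for whatever speech recognition or translation engine the second network device uses, which the embodiment does not name, and the other helpers are equally illustrative.

```python
from typing import Optional

def transcribe_or_translate(voice: bytes, language: Optional[str]) -> str:
    # Placeholder for the speech-to-text / translation engine (not specified by the embodiment).
    return "<first text information>"

def synthesize_subtitle_video(text: str, voice: bytes) -> dict:
    # Placeholder: combine the caption text and the original voice into one stream.
    return {"captions": text, "audio": voice}

def determine_target_info(first_voice_info: bytes,
                          translation_language: Optional[str],
                          return_video: bool):
    first_text_info = transcribe_or_translate(first_voice_info, translation_language)
    if not return_video:          # mode one: the target information is the first text information
        return first_text_info
    # mode two: the second network device also synthesises the first video information
    return synthesize_subtitle_video(first_text_info, first_voice_info)
```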
As shown in fig. 6, a call method in an embodiment of the present invention is applied to a first terminal, and includes:
step 601, receiving target video information sent by a first server;
step 602, displaying the target video information on a current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
In this embodiment, the first terminal may display the first text information corresponding to the first voice information when displaying the target video information by receiving the target video information sent by the first server, for example, display the first text information in the form of subtitles to the user, thereby implementing the barrier-free call.
Optionally, the method further comprises:
receiving first request information sent by the first server, wherein the first request information is used for requesting to convert an audio call between the first terminal and the first network device into a video call;
displaying an operation control on the current call interface according to the first request information;
receiving an operation instruction sent by a user through the operation control;
and sending first information to the first server according to the operation instruction.
For example, after receiving the first request information, the first terminal may display an operation control (such as a button for approving or rejecting) on the current call interface, so that the first terminal may receive an operation instruction sent by the user through the operation control, and thus send the first information to the first server according to the operation instruction, so as to indicate whether the user approves the audio call between the first terminal and the first network device to be converted into the video call.
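On the terminal side, handling of the first request information could look like the sketch below; the call_ui object, its show_dialog() method and the message layout are assumptions introduced only to illustrate the operation control and the first information fed back to the first server.

```python
def on_first_request_info(call_ui, first_server) -> None:
    # Display an operation control (agree / reject) on the current call interface.
    choice = call_ui.show_dialog(
        "Upgrade this audio call to a video call with subtitles?",
        buttons=("Agree", "Reject"))
    # Send the first information according to the user's operation instruction.
    first_info = {"user_agrees": choice == "Agree"}
    first_server.send(first_info)
```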
According to the communication method, the first terminal can display the first text information corresponding to the first voice information when displaying the target video information by receiving the target video information sent by the first server, so that the barrier-free communication function in the process of dialing voice communication by a user is realized, convenience and quickness are realized, and the user experience is improved.
As shown in fig. 7, a communication apparatus according to an embodiment of the present invention is applied to a first network device, and includes:
a first obtaining module 710, configured to obtain, when a first terminal and a second terminal establish an audio call connection, first voice information sent by the second terminal;
an information sending module 720, configured to send the first voice information to a second network device;
An information receiving module 730, configured to receive target information fed back by the second network device according to the first voice information;
the first processing module 740 is configured to determine target video information according to the target information, where the target video information includes first text information corresponding to the first voice information;
and a first sending module 750, configured to send the target video information to the first terminal.
In this embodiment, under the condition that the first terminal and the second terminal establish audio call connection, the first voice information sent by the second terminal can be acquired, the first voice information is sent to the second network device, then the target video information is determined according to the target information fed back by the second network device, and the target video information is sent to the first terminal, wherein the target video information includes first text information corresponding to the first voice information, so that the first terminal can display the first text information in the form of subtitles when displaying the target video information after receiving the target video information, thereby realizing the functions of voice transcription, translation, subtitle display and the like in a native call interface of a mobile phone in the process of dialing a voice call by a user.
Optionally, the apparatus further comprises:
the first media renegotiation module is used for sending first media renegotiation information to the first terminal, and the first media renegotiation information is used for anchoring an audio call endpoint of the first terminal;
and the second media renegotiation module is used for sending second media renegotiation information to the second terminal, and the second media renegotiation information is used for anchoring an audio call endpoint of the second terminal.
Optionally, the apparatus further comprises:
the first sending submodule is used for sending first indication information to the second network equipment, wherein the first indication information is used for indicating relevant information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free talk service;
transmitting media parameters of audio and/or video;
terminal identity information;
a translation language indication.
Optionally, in the case that the target information is the first text information, the first processing module includes:
and the first processing unit is used for generating the target video information according to the first text information and the first voice information.
Optionally, in the case that the target information is the first video information, the first processing module includes:
And the second processing unit is used for determining the first video information as the target video information.
Optionally, the information sending module includes:
a first sending unit, configured to send first request information to the first terminal, where the first request information is used to request conversion of an audio call between the first terminal and the first network device into a video call;
the first receiving unit is used for receiving first information fed back by the first terminal according to the first request information;
and a second sending unit, configured to send the first voice information to a second network device if the first information indicates that the user agrees to convert the audio call between the first terminal and the first network device into a video call.
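A minimal, purely illustrative sketch of this consent-gated forwarding step follows; the callables and the {"user_agrees": ...} shape of the first information are assumptions of the example rather than message formats defined by this application.

def forward_voice_with_consent(request_video_upgrade, send_to_second_network_device,
                               first_voice_information: bytes) -> bool:
    # First request information: ask the first terminal to convert the audio call
    # into a video call, then wait for the first information fed back by it.
    first_information = request_video_upgrade()
    if first_information.get("user_agrees"):
        # Only forward the first voice information when the user agrees.
        send_to_second_network_device(first_voice_information)
        return True
    return False

# Example wiring with trivial stand-ins.
agreed = forward_voice_with_consent(
    request_video_upgrade=lambda: {"user_agrees": True},
    send_to_second_network_device=lambda voice: None,
    first_voice_information=b"\x00" * 160,
)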
It should be noted that the communication device provided in this embodiment of the present invention can implement all the method steps of the communication method embodiment applied to the first network device and can achieve the same technical effects; detailed descriptions of the parts and beneficial effects that are the same as those of the method embodiment are omitted here.
As shown in fig. 8, a communication apparatus according to an embodiment of the present invention is applied to a second network device, and includes:
a first receiving module 810, configured to receive first voice information sent by a first network device;
a second processing module 820, configured to determine target information according to the first voice information;
a second sending module 830, configured to send the target information to the first network device.
In this embodiment, after the first voice information sent by the first network device is received, the target information is determined according to the first voice information, the target information being the first text information corresponding to the first voice information or first video information carrying that text; that is, conversion of the first voice information is completed, so that the first terminal can display the first text information corresponding to the first voice information (for example, in the form of subtitles), thereby realizing functions such as voice transcription, translation, and subtitle display in the native phone call interface.
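The following Python sketch illustrates, by way of example only, the second network device's role with pluggable speech-recognition, translation, and video-rendering callables; no particular recognition or translation engine is implied, and the TargetInformation structure is an assumption of the example.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class TargetInformation:
    kind: str       # "text" or "video"
    payload: object

def determine_target_information(first_voice_information: bytes,
                                 recognize_speech: Callable[[bytes], str],
                                 translate: Optional[Callable[[str], str]] = None,
                                 render_video: Optional[Callable[[str, bytes], bytes]] = None
                                 ) -> TargetInformation:
    # Generate the first text information corresponding to the first voice information.
    first_text_information = recognize_speech(first_voice_information)
    if translate is not None:
        first_text_information = translate(first_text_information)
    if render_video is not None:
        # Optionally generate first video information from the text and the voice.
        return TargetInformation("video", render_video(first_text_information,
                                                       first_voice_information))
    return TargetInformation("text", first_text_information)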
Optionally, the apparatus further comprises:
the third receiving module is used for receiving first indication information sent by the first network device, where the first indication information is used for indicating related information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free call service;
media parameters for transmitting audio and/or video;
terminal identity information;
a translation language indication.
Optionally, the second processing module includes:
and the third processing sub-module is used for generating first text information corresponding to the first voice information according to the first voice information and the first indication information, and taking the first text information as the target information.
Optionally, the second processing module includes:
the fourth processing sub-module is used for generating first text information corresponding to the first voice information according to the first voice information and the first indication information;
and the fifth processing sub-module is used for generating first video information according to the first text information and the first voice information, and taking the first video information as the target information.
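One simple way to pair the first text information with the first voice information, shown below purely as an illustration, is to emit a timed subtitle cue whose duration is derived from the voice segment; the SRT cue format and the 8 kHz, 16-bit PCM parameters are assumptions of the example, and an actual implementation could equally composite the text directly into video frames.

def pcm_duration_seconds(audio: bytes, sample_rate: int = 8000, bytes_per_sample: int = 2) -> float:
    # Duration of a mono PCM buffer under the assumed sampling parameters.
    return len(audio) / (sample_rate * bytes_per_sample)

def srt_timestamp(seconds: float) -> str:
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def subtitle_cue(index: int, start: float, text: str, audio: bytes) -> str:
    # One SRT cue spanning the duration of the voice segment.
    end = start + pcm_duration_seconds(audio)
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"

print(subtitle_cue(1, 0.0, "Hello, can you hear me?", b"\x00" * 16000))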
It should be noted that the communication device provided in this embodiment of the present invention can implement all the method steps of the communication method embodiment applied to the second network device and can achieve the same technical effects; detailed descriptions of the parts and beneficial effects that are the same as those of the method embodiment are omitted here.
As shown in fig. 9, a communication device according to an embodiment of the present invention is applied to a first terminal, and includes:
a second receiving module 910, configured to receive the target video information sent by the first server;
a first display module 920, configured to display the target video information on a current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
In this embodiment, by receiving the target video information sent by the first server, the first terminal can display the first text information corresponding to the first voice information while displaying the target video information, for example by presenting the first text information to the user in the form of subtitles, thereby realizing a barrier-free call.
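By way of illustration only, the sketch below shows the first terminal's handling of the received media; the CallInterface abstraction and its show_video method are assumptions of the example, standing in for the native call interface's decoder and display surface.

class CallInterface:
    def show_video(self, video: bytes) -> None:
        # Stand-in for feeding the decoder / video surface of the native call interface.
        print(f"rendering {len(video)} bytes of target video information")

class FirstTerminal:
    def __init__(self) -> None:
        self.call_interface = CallInterface()

    def on_target_video_information(self, target_video_information: bytes) -> None:
        # The first text information is already carried in the target video
        # information, so displaying the video also displays the subtitles.
        self.call_interface.show_video(target_video_information)

FirstTerminal().on_target_video_information(b"\x00" * 1024)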
Optionally, the apparatus further comprises:
a fourth receiving module, configured to receive first request information sent by the first server, where the first request information is used to request conversion of an audio call between the first terminal and the first network device into a video call;
the first display module is further used for displaying an operation control on the current call interface according to the first request information;
the fifth receiving module is used for receiving an operation instruction sent by a user through the operation control;
and the third sending module is used for sending the first information to the first server according to the operation instruction.
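The following sketch, given purely as an illustration, shows one way the consent interaction could look on the first terminal; the prompt text, the callback-based operation control, and the {"user_agrees": ...} shape of the first information are assumptions of the example.

class UpgradePrompt:
    def __init__(self, send_first_information) -> None:
        self.send_first_information = send_first_information

    def on_first_request_information(self) -> None:
        # Display an operation control on the current call interface.
        print("Convert this audio call to a video call with captions? [Accept / Decline]")

    def on_user_operation(self, accepted: bool) -> None:
        # Send the first information to the first server according to the
        # user's operation instruction.
        self.send_first_information({"user_agrees": accepted})

prompt = UpgradePrompt(send_first_information=lambda info: print("sent:", info))
prompt.on_first_request_information()
prompt.on_user_operation(True)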
It should be noted that the communication device provided in this embodiment of the present invention can implement all the method steps of the communication method embodiment applied to the first terminal and can achieve the same technical effects; detailed descriptions of the parts and beneficial effects that are the same as those of the method embodiment are omitted here.
As shown in fig. 10, a network device 1000 according to an embodiment of the present invention is a first network device and includes a processor 1010 and a transceiver 1020, where the processor 1010 is configured to:
under the condition that the first terminal and the second terminal establish audio call connection, acquiring first voice information sent by the second terminal;
transmitting the first voice information to a second network device;
receiving target information fed back by the second network equipment according to the first voice information;
determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information;
and sending the target video information to the first terminal.
In this embodiment, when the first terminal and the second terminal have established an audio call connection, the first voice information sent by the second terminal can be acquired and sent to the second network device. Target video information is then determined according to the target information fed back by the second network device and is sent to the first terminal, where the target video information includes the first text information corresponding to the first voice information. After receiving the target video information, the first terminal can display the first text information in the form of subtitles while displaying the target video information, thereby realizing functions such as voice transcription, translation, and subtitle display in the native call interface of a mobile phone while the user is making a voice call.
Optionally, the processor 1010 is further configured to:
transmitting first media renegotiation information to the first terminal, wherein the first media renegotiation information is used for anchoring an audio call endpoint of the first terminal;
and sending second media renegotiation information to the second terminal, wherein the second media renegotiation information is used for anchoring an audio call endpoint of the second terminal.
Optionally, the processor 1010 is further configured to:
transmitting first indication information to the second network device, wherein the first indication information is used for indicating related information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free call service;
media parameters for transmitting audio and/or video;
terminal identity information;
a translation language indication.
Optionally, the processor 1010 is specifically configured to, when determining the target video information according to the target information:
and generating the target video information according to the first text information and the first voice information.
Optionally, the processor 1010 is specifically configured to, when determining the target video information according to the target information:
determining the first video information as the target video information.
Optionally, the processor 1010 is specifically configured to, when sending the first voice information to a second network device:
transmitting first request information to the first terminal, wherein the first request information is used for requesting to convert an audio call between the first terminal and the first network device into a video call;
receiving first information fed back by the first terminal according to the first request information;
and transmitting the first voice information to a second network device in the case that the first information indicates that the user agrees to convert the audio call between the first terminal and the first network device into the video call.
It should be noted that the network device provided in this embodiment of the present invention can implement all the method steps of the communication method embodiment applied to the first network device and can achieve the same technical effects; detailed descriptions of the parts and beneficial effects that are the same as those of the method embodiment are omitted here.
Referring to the structure shown in fig. 10, in an embodiment of the present invention, the network device is a second network device, including a processor and a transceiver, where the processor is configured to:
receiving first voice information sent by first network equipment;
determining target information according to the first voice information;
and sending the target information to the first network equipment.
In this embodiment, after the first voice information sent by the first network device is received, the target information is determined according to the first voice information, the target information being the first text information corresponding to the first voice information or first video information carrying that text; that is, conversion of the first voice information is completed, so that the first terminal can display the first text information corresponding to the first voice information (for example, in the form of subtitles), thereby realizing functions such as voice transcription, translation, and subtitle display in the native phone call interface.
Optionally, the processor is further configured to:
receiving first indication information sent by the first network device, wherein the first indication information is used for indicating related information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free call service;
media parameters for transmitting audio and/or video;
terminal identity information;
a translation language indication.
Optionally, the processor is specifically configured to, when determining the target information according to the first voice information:
and generating first text information corresponding to the first voice information according to the first voice information and the first indication information, and taking the first text information as the target information.
Optionally, the processor is specifically configured to, when determining the target information according to the first voice information:
generating first text information corresponding to the first voice information according to the first voice information and the first indication information;
and generating first video information according to the first text information and the first voice information, and taking the first video information as the target information.
It should be noted that the network device provided in this embodiment of the present invention can implement all the method steps of the communication method embodiment applied to the second network device and can achieve the same technical effects; detailed descriptions of the parts and beneficial effects that are the same as those of the method embodiment are omitted here.
As shown in fig. 11, a terminal 1100 according to an embodiment of the present invention is a first terminal and includes a processor 1110 and a transceiver 1120, where the processor 1110 is configured to:
receiving target video information sent by a first server;
displaying the target video information on a current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
In this embodiment, by receiving the target video information sent by the first server, the first terminal can display the first text information corresponding to the first voice information while displaying the target video information, for example by presenting the first text information to the user in the form of subtitles, thereby realizing a barrier-free call.
Optionally, the processor 1110 is further configured to:
receiving first request information sent by the first server, wherein the first request information is used for requesting to convert an audio call between the first terminal and the first network device into a video call;
displaying an operation control on the current call interface according to the first request information;
receiving an operation instruction sent by a user through the operation control;
and sending first information to the first server according to the operation instruction.
It should be noted that the terminal provided in this embodiment of the present invention can implement all the method steps of the communication method embodiment applied to the first terminal and can achieve the same technical effects; detailed descriptions of the parts and beneficial effects that are the same as those of the method embodiment are omitted here.
A network device according to another embodiment of the present invention is a first network device, as shown in fig. 12, including a transceiver 1210, a processor 1200, a memory 1220, and a program or an instruction stored on the memory 1220 and executable on the processor 1200; the processor 1200, when executing the program or instructions, implements the communication method described above as applied to the first network device.
The transceiver 1210 is configured to receive and transmit data under the control of the processor 1200.
In fig. 12, the bus architecture may include any number of interconnected buses and bridges that link together one or more processors, represented by the processor 1200, and various memory circuits, represented by the memory 1220. The bus architecture may also link together various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. The bus interface provides an interface. The transceiver 1210 may comprise a plurality of elements, namely a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 1200 is responsible for managing the bus architecture and general processing, and the memory 1220 may store data used by the processor 1200 when performing operations.
The network device according to still another embodiment of the present invention is a second network device, and referring to the structure shown in fig. 12, the network device includes a transceiver, a processor, a memory, and a program or instructions stored in the memory and executable on the processor; the processor, when executing the program or instructions, implements the communication method described above as applied to the second network device.
The transceiver is used for receiving and transmitting data under the control of the processor.
In fig. 12, the bus architecture may include any number of interconnected buses and bridges that link together one or more processors, represented by the processor, and various memory circuits, represented by the memory. The bus architecture may also link together various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. The bus interface provides an interface. The transceiver may comprise a plurality of elements, namely a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium. The processor is responsible for managing the bus architecture and general processing, and the memory may store data used by the processor when performing operations.
A terminal according to another embodiment of the present invention is a first terminal, as shown in fig. 13, and includes a transceiver 1310, a processor 1300, a memory 1320, and a program or instructions stored in the memory 1320 and executable on the processor 1300; the processor 1300, when executing the program or instructions, implements the communication method described above as applied to the first terminal.
The transceiver 1310 is configured to receive and transmit data under the control of the processor 1300.
In fig. 13, the bus architecture may include any number of interconnected buses and bridges that link together one or more processors, represented by the processor 1300, and various memory circuits, represented by the memory 1320. The bus architecture may also link together various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and therefore are not described further herein. The bus interface provides an interface. The transceiver 1310 may comprise a plurality of elements, namely a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium. For different user equipment, the user interface 1330 may also be an interface capable of externally connecting a desired device, including but not limited to a keypad, a display, a speaker, a microphone, a joystick, and the like.
The processor 1300 is responsible for managing the bus architecture and general processing, and the memory 1320 may store data used by the processor 1300 in performing operations.
The readable storage medium of the embodiment of the present invention stores a program or instructions which, when executed by a processor, implement the steps of the call method described above and can achieve the same technical effects; details are not repeated here to avoid repetition. The computer-readable storage medium may be, for example, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It is further noted that the terminals described in this specification include, but are not limited to, smartphones, tablets, etc., and that many of the functional components described are referred to as modules in order to more particularly emphasize their implementation independence.
In an embodiment of the invention, the modules may be implemented in software for execution by various types of processors. An identified module of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions which may, for instance, be organized as an object, procedure, or function. Nevertheless, the executables of an identified module need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, constitute the module and achieve the stated purpose of the module.
Indeed, a module of executable code may be a single instruction, or many instructions, and may even be distributed over several different code segments, among different programs, and across several memory devices. Likewise, operational data may be identified within modules and may be embodied in any suitable form and organized within any suitable type of data structure. The operational data may be collected as a single data set, or may be distributed over different locations including over different storage devices.
Where a module can be implemented in software, then considering the level of existing hardware technology, a person skilled in the art could also, cost aside, build corresponding hardware circuitry to achieve the corresponding functions, the hardware circuitry including conventional very-large-scale integration (VLSI) circuits or gate arrays and existing semiconductors such as logic chips and transistors, or other discrete components. A module may also be implemented in programmable hardware devices such as field-programmable gate arrays, programmable array logic, programmable logic devices, or the like.
The exemplary embodiments described above are described with reference to the drawings. Many different forms and embodiments are possible without departing from the spirit and teachings of the present invention; therefore, the present invention should not be construed as limited to the exemplary embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete and will convey the scope of the invention to those skilled in the art. In the drawings, the sizes and relative sizes of elements may be exaggerated for clarity. The terminology used herein is for the purpose of describing particular example embodiments only and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Unless otherwise indicated, a range of values includes the upper and lower limits of the range and any subranges therebetween.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims (20)

1. A call method, applied to a first network device, comprising:
under the condition that the first terminal and the second terminal establish audio call connection, acquiring first voice information sent by the second terminal;
transmitting the first voice information to a second network device;
receiving target information fed back by the second network equipment according to the first voice information;
determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information;
and sending the target video information to the first terminal.
2. The method of claim 1, wherein prior to said obtaining the first voice information sent by the second terminal, the method further comprises at least one of:
transmitting first media renegotiation information to the first terminal, wherein the first media renegotiation information is used for anchoring an audio call endpoint of the first terminal;
and sending second media renegotiation information to the second terminal, wherein the second media renegotiation information is used for anchoring an audio call endpoint of the second terminal.
3. The method of claim 2, wherein prior to said obtaining the first voice information sent by the second terminal, the method further comprises:
transmitting first indication information to the second network device, wherein the first indication information is used for indicating related information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free call service;
media parameters for transmitting audio and/or video;
terminal identity information;
a translation language indication.
4. The method according to claim 1, wherein, in the case that the target information is the first text information, the determining the target video information according to the target information includes:
and generating the target video information according to the first text information and the first voice information.
5. The method according to claim 1, wherein, in the case that the target information is the first video information, the determining the target video information according to the target information includes:
determining the first video information as the target video information.
6. The method of claim 1, wherein said transmitting the first voice information to a second network device comprises:
transmitting first request information to the first terminal, wherein the first request information is used for requesting to convert an audio call between the first terminal and the first network device into a video call;
receiving first information fed back by the first terminal according to the first request information;
and transmitting the first voice information to a second network device in the case that the first information indicates agreement to convert the audio call between the first terminal and the first network device into a video call.
7. A call method, applied to a second network device, comprising:
receiving first voice information sent by first network equipment;
determining target information according to the first voice information;
and sending the target information to the first network equipment.
8. The method of claim 7, wherein prior to receiving the first voice information sent by the first network device, the method further comprises:
receiving first indication information sent by the first network device, wherein the first indication information is used for indicating related information of the barrier-free call service;
wherein the first indication information includes at least one of:
a control event for indicating the start or end of the barrier-free call service;
media parameters for transmitting audio and/or video;
terminal identity information;
a translation language indication.
9. The method of claim 8, wherein said determining target information from said first speech information comprises:
and generating first text information corresponding to the first voice information according to the first voice information and the first indication information, and taking the first text information as the target information.
10. The method of claim 8, wherein said determining target information from said first speech information comprises:
generating first text information corresponding to the first voice information according to the first voice information and the first indication information;
and generating first video information according to the first text information and the first voice information, and taking the first video information as the target information.
11. A call method, applied to a first terminal, comprising:
receiving target video information sent by a first server;
displaying the target video information on a current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
12. A call apparatus, applied to a first network device, comprising:
the first acquisition module is used for acquiring first voice information sent by the second terminal under the condition that the first terminal and the second terminal establish audio call connection;
the information sending module is used for sending the first voice information to the second network equipment;
the information receiving module is used for receiving target information fed back by the second network equipment according to the first voice information;
the first processing module is used for determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information;
and the first sending module is used for sending the target video information to the first terminal.
13. A call apparatus, applied to a second network device, comprising:
the first receiving module is used for receiving first voice information sent by the first network equipment;
the second processing module is used for determining target information according to the first voice information;
and the second sending module is used for sending the target information to the first network equipment.
14. A call apparatus, applied to a first terminal, comprising:
the second receiving module is used for receiving the target video information sent by the first server;
the first display module is used for displaying the target video information on the current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
15. A network device, the network device being a first network device, comprising: a transceiver and a processor; the processor is configured to:
under the condition that the first terminal and the second terminal establish audio call connection, acquiring first voice information sent by the second terminal;
transmitting the first voice information to a second network device;
receiving target information fed back by the second network equipment according to the first voice information;
determining target video information according to the target information, wherein the target video information comprises first text information corresponding to the first voice information;
and sending the target video information to the first terminal.
16. A network device, the network device being a second network device, comprising: a transceiver and a processor; the processor is configured to:
receiving first voice information sent by first network equipment;
determining target information according to the first voice information;
and sending the target information to the first network equipment.
17. A terminal, the terminal being a first terminal, comprising: a transceiver and a processor; the processor is configured to:
receiving target video information sent by a first server;
displaying the target video information on a current call interface;
the target video information comprises first text information corresponding to first voice information, and the first voice information is sent by a second terminal.
18. A network device, comprising: a transceiver, a processor, a memory, and a program or instructions stored on the memory and executable on the processor; wherein the processor, when executing the program or instructions, implements the call method according to any one of claims 1 to 6, or the call method according to any one of claims 7 to 10.
19. A terminal, the terminal being a first terminal, comprising: a transceiver, a processor, a memory, and a program or instructions stored on the memory and executable on the processor; wherein the processor, when executing the program or instructions, implements the call method according to claim 11.
20. A readable storage medium having stored thereon a program or instructions which when executed by a processor performs the steps of the call method of any one of claims 1 to 6, or the steps of the call method of any one of claims 7 to 10, or the steps of the call method of claim 11.
CN202211082451.5A 2022-09-06 2022-09-06 Call method, device, network equipment and terminal Pending CN116962360A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211082451.5A CN116962360A (en) 2022-09-06 2022-09-06 Call method, device, network equipment and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211082451.5A CN116962360A (en) 2022-09-06 2022-09-06 Call method, device, network equipment and terminal

Publications (1)

Publication Number Publication Date
CN116962360A true CN116962360A (en) 2023-10-27

Family

ID=88459107

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211082451.5A Pending CN116962360A (en) 2022-09-06 2022-09-06 Call method, device, network equipment and terminal

Country Status (1)

Country Link
CN (1) CN116962360A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination