CN106128468B - Voice communication method and device - Google Patents

Voice communication method and device

Info

Publication number
CN106128468B
Authority
CN
China
Prior art keywords
voice
information
coding information
voice call
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610539161.7A
Other languages
Chinese (zh)
Other versions
CN106128468A (en)
Inventor
卢林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201610539161.7A priority Critical patent/CN106128468B/en
Publication of CN106128468A publication Critical patent/CN106128468A/en
Priority to PCT/CN2017/087317 priority patent/WO2018006678A1/en
Application granted granted Critical
Publication of CN106128468B publication Critical patent/CN106128468B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/07 User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail characterised by the inclusion of specific contents
    • H04L51/10 Multimedia information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/0018 Speech coding using phonetic or linguistical decoding of the source; Reconstruction using text-to-speech synthesis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/173 Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voice call method and a voice call device, and belongs to the technical field of voice calls. The method comprises the following steps: receiving a voice call request sent by a calling terminal, wherein the voice call request carries an identifier of a called terminal; acquiring first voice coding information of the calling terminal and second voice coding information of the called terminal; receiving voice call information sent by the calling terminal or the called terminal; converting the voice call information, according to the first voice coding information and the second voice coding information, into voice call information supported by the other terminal; and sending the converted voice call information to the other terminal. This solves the prior-art problem of poor flexibility, in which transcoding for a new coding type is possible only after the voice call client has been updated: the background server transcodes directly according to the voice coding information of both ends of the call, so the voice call client does not need to be updated and flexibility is improved.

Description

Voice communication method and device
Technical Field
The embodiment of the invention relates to the technical field of voice communication, in particular to a voice communication method and device.
Background
Voice calls have become a common way for people to communicate.
An existing voice call method includes: a voice call client acquires voice coding information of a called terminal and receives voice call information; if the voice call information comes from the local terminal, the client converts it, according to the voice coding information of the called terminal, into voice call information supported by the called terminal and sends the converted voice call information to the called terminal; if the voice call information comes from the called terminal, the client converts it into voice call information supported by the local terminal.
In the process of implementing the embodiments of the present invention, it was found that the prior art has at least the following problem:
In the above method, voice transcoding must be performed by the voice call client. When a new voice coding format appears, the voice call client has to be updated before it can transcode normally, so flexibility is poor.
Disclosure of Invention
In order to solve the problem of poor flexibility in the prior art, embodiments of the present invention provide a voice communication method and apparatus. The technical scheme is as follows:
in a first aspect, a voice call method is provided, where the method includes:
receiving a voice call request sent by a calling terminal, wherein the voice call request carries an identifier of a called terminal;
acquiring first voice coding information of the calling terminal and second voice coding information of the called terminal;
receiving voice call information sent by the calling terminal or the called terminal;
converting the voice call information into voice call information supported by another terminal according to the first voice coding information and the second voice coding information;
and sending the converted voice call information to the other terminal.
In a second aspect, a voice call apparatus is provided, where the apparatus includes:
a receiving module, configured to receive a voice call request sent by a calling terminal, where the voice call request carries an identifier of a called terminal;
an acquisition module, configured to acquire first voice coding information of the calling terminal and second voice coding information of the called terminal;
the receiving module being further configured to receive voice call information sent by the calling terminal or the called terminal;
a conversion module, configured to convert the voice call information into voice call information supported by the other terminal according to the first voice coding information and the second voice coding information;
and a sending module, configured to send the voice call information converted by the conversion module to the other terminal.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
After receiving the voice call request sent by the calling terminal, the background server acquires the first voice coding information of the calling terminal and the second voice coding information of the called terminal. Then, on receiving voice call information sent by the calling terminal or the called terminal, it converts the voice call information, according to the first voice coding information and the second voice coding information, into voice call information supported by the other terminal, and sends the converted voice call information to the other terminal. This solves the prior-art problem of poor flexibility, in which transcoding for a new coding type is possible only after the voice call client has been updated: the background server transcodes directly according to the voice coding information of both ends of the call, without any update to the voice call client, which improves flexibility.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment involved in various embodiments of the present invention;
FIG. 2 is a flowchart of a voice call method according to an embodiment of the present invention;
FIG. 3 is a flowchart of a voice call method according to another embodiment of the present invention;
FIG. 4A is a flowchart of a voice call method according to another embodiment of the present invention;
FIG. 4B is a schematic diagram of a voice call method according to another embodiment of the present invention;
FIG. 4C is another flowchart of a voice call method according to another embodiment of the present invention;
FIG. 4D is a schematic diagram of a target terminal updating voice coding information according to another embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a voice call apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Referring to fig. 1, a schematic diagram of an implementation environment related to a voice call method according to various embodiments of the present invention is shown. As shown in fig. 1, the implementation environment includes a calling terminal 110, a background server 120, a telephone operator 130, and a called terminal 140.
The calling terminal 110 may be a terminal with voice call capability, for example, a mobile phone. In practical implementation, a voice call client 111 is installed in the calling terminal 110, and the calling terminal 110 may initiate a voice call with the called terminal 140 through the voice call client 111. Optionally, the voice call client 111 may initiate the voice call with the called terminal 140 through VoIP (Voice over Internet Protocol). The calling terminal 110 may be connected to the background server 120 through a wireless network.
The background server 120 is a server that provides services to the voice call client 111. The background server 120 may be connected to the telephone operator 130 via a wired or wireless network. In practical implementation, the background server 120 may be one server or a server cluster composed of multiple servers.
Taking the case in which the background server 120 is a server cluster as an example, the background server 120 may include an RTP (Real-time Transport Protocol) server, a transcoding server, and a call server. The RTP server is used to communicate with the telephone operator 130, the transcoding server is used to transcode voice call information, and the call server is used to receive a call from the calling terminal 110 and to initiate a call to the called terminal 140 through the telephone operator 130. Optionally, the background server 120 may further include other servers, which is not limited in this embodiment.
The telephone operator 130 may be China Mobile, China Unicom, China Telecom, or another operator.
The called terminal 140 may also be a terminal with voice call capability, for example, a mobile phone. In actual implementation, the called terminal 140 may or may not have a voice call client installed, which is not limited in this embodiment. The called terminal 140 may also be a terminal in a PSTN (Public Switched Telephone Network).
Referring to fig. 2, a flowchart of a voice call method according to an embodiment of the present invention is shown. This embodiment is described with the voice call method applied to the background server 120 shown in fig. 1. As shown in fig. 2, the voice call method may include:
Step 201, receiving a voice call request sent by a calling terminal, where the voice call request carries an identifier of a called terminal.
Step 202, acquiring first voice coding information of the calling terminal and second voice coding information of the called terminal.
Step 203, receiving voice call information sent by the calling terminal or the called terminal.
Step 204, converting the voice call information into voice call information supported by the other terminal according to the first voice coding information and the second voice coding information.
Step 205, sending the converted voice call information to the other terminal.
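Steps 201 to 205 can be illustrated with a minimal Python sketch. It is not the implementation described by this embodiment: the class `BackendServer`, the callback `lookup_called_info`, the `transcode` callable, and the use of plain strings as terminal identifiers are all hypothetical names chosen for illustration, and the first voice coding information is simply assumed to arrive together with the call request.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class CodingInfo:
    coding_type: str  # e.g. "g711a"; coding parameters are omitted in this sketch


# Hypothetical transcoder signature: (payload, source info, target info) -> payload
Transcoder = Callable[[bytes, CodingInfo, CodingInfo], bytes]


class BackendServer:
    """Illustrative server-side flow for steps 201-205 (all names are assumptions)."""

    def __init__(self, lookup_called_info: Callable[[str], CodingInfo],
                 transcode: Transcoder) -> None:
        self._lookup = lookup_called_info
        self._transcode = transcode
        # caller id -> (called id, first coding info, second coding info)
        self._calls: Dict[str, Tuple[str, CodingInfo, CodingInfo]] = {}

    def on_call_request(self, caller_id: str, called_id: str,
                        first_info: CodingInfo) -> None:
        # Steps 201-202: the request names the called terminal; the server
        # then holds the coding information of both ends of the call.
        second_info = self._lookup(called_id)
        self._calls[caller_id] = (called_id, first_info, second_info)

    def on_voice(self, caller_id: str, sender_id: str,
                 payload: bytes) -> Tuple[str, bytes]:
        # Steps 203-205: convert towards whichever terminal did not send
        # the payload, and return (recipient id, converted payload).
        called_id, first, second = self._calls[caller_id]
        if sender_id == caller_id:
            return called_id, self._transcode(payload, first, second)
        return caller_id, self._transcode(payload, second, first)
```

Because the conversion lives entirely in the server object, supporting a new coding type in this sketch only means passing a different `transcode` callable; nothing on the client side changes.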
In summary, in the voice call method provided in this embodiment, after receiving the voice call request sent by the calling terminal, the background server acquires the first voice coding information of the calling terminal and the second voice coding information of the called terminal. Then, on receiving voice call information sent by the calling terminal or the called terminal, it converts the voice call information, according to the first voice coding information and the second voice coding information, into voice call information supported by the other terminal, and sends the converted voice call information to the other terminal. This solves the prior-art problem of poor flexibility, in which transcoding for a new coding type is possible only after the voice call client has been updated: the background server transcodes directly according to the voice coding information of both ends of the call, without any update to the voice call client, which improves flexibility.
Referring to fig. 3, a flowchart of a voice call method according to another embodiment of the present invention is shown. This embodiment is described with the voice call method applied to the background server 120 shown in fig. 1. As shown in fig. 3, the voice call method may include:
Step 301, receiving a voice call request sent by a calling terminal, where the voice call request carries an identifier of a called terminal.
A voice call client is installed in the calling terminal. When a user needs to make a voice call with another user, the user can initiate a voice call request for calling the called terminal through the voice call client in the calling terminal.
After the voice call client in the calling terminal initiates the voice call request, the background server receives the voice call request accordingly. The voice call request carries the identifier of the called terminal, for example, the mobile phone number of the called terminal.
In practical implementation, the voice call request may further include first voice coding information supported by the calling terminal. The first voice coding information may include the coding type of the encoder, or the coding type and the coding parameters used by the encoder. The coding type may be talk, g711a, g729a, etc., and the coding parameters may include at least one of a sampling rate, a coding complexity, and a transmission interval between adjacent packets.
Optionally, since some encoders are not configured with coding parameters in actual implementation, the voice coding information in that case may include only the coding type. If the encoder is configured with coding parameters, the voice coding information includes both the coding type and the coding parameters.
It should be noted that the coding types and coding parameters above are only examples. Optionally, the coding type may be another type and the coding parameters may include other contents, which is not limited in this embodiment.
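One possible in-memory representation of the voice coding information described above is sketched below in Python. The class names, the field names, and the dictionary keys accepted by `parse_coding_info` are hypothetical; the text only states that the information contains a coding type and, when the encoder is configured with them, coding parameters.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CodingParams:
    # Each parameter is optional because the text requires only
    # "at least one of" the three to be present.
    sampling_rate_hz: Optional[int] = None
    complexity: Optional[int] = None
    packet_interval_ms: Optional[int] = None


@dataclass(frozen=True)
class VoiceCodingInfo:
    coding_type: str
    # None models an encoder that is not configured with coding parameters.
    params: Optional[CodingParams] = None


def parse_coding_info(fields: Mapping[str, object]) -> VoiceCodingInfo:
    """Build VoiceCodingInfo from a flat mapping (the key names are assumptions)."""
    names = ("sampling_rate_hz", "complexity", "packet_interval_ms")
    present = {name: fields[name] for name in names if name in fields}
    params = CodingParams(**present) if present else None
    return VoiceCodingInfo(coding_type=str(fields["type"]), params=params)
```

Making both classes frozen lets two pieces of coding information be compared with `==`, which is convenient when deciding whether any conversion is needed.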
Step 302, extracting the first voice coding information carried in the voice call request.
Step 303, obtaining the second voice coding information from the operator corresponding to the called terminal according to the identifier of the called terminal in the voice call request.
After receiving the voice call request, the background server may extract the identifier of the called terminal carried in the voice call request, determine the operator corresponding to the called terminal according to the identifier, and then obtain the second voice coding information of the called terminal from the determined operator. Optionally, the background server may send an information acquisition request to the operator and receive the second voice coding information returned by the operator, where the information acquisition request is used to request the second voice coding information of the called terminal.
For example, if the identifier of the called terminal is 158616xxx12, the background server may determine that the called terminal is a China Mobile subscriber. The background server may then send an information acquisition request to China Mobile and receive the second voice coding information returned by China Mobile.
It should be noted that the voice coding information of users of the same operator may be the same or different, and the voice coding information of users of different operators may also be the same or different, which is not limited in this embodiment. When the voice coding information differs among users of the same operator, the information acquisition request sent by the background server may include the identifier of the called terminal; after receiving the information acquisition request, the operator determines the second voice coding information of the called terminal according to the identifier and returns it to the background server.
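The lookup in step 303 can be sketched as follows in Python. The prefix table, the operator key `"mobile"`, the request layout, and the per-operator client callables are hypothetical; the text does not specify how the operator is derived from the identifier or how the information acquisition request is encoded.

```python
from typing import Callable, Dict

# Hypothetical prefix table used only for illustration; a real deployment
# would need a complete and current number-to-operator mapping.
OPERATOR_BY_PREFIX: Dict[str, str] = {"158": "mobile"}

# Hypothetical client: takes an information acquisition request and returns
# the second voice coding information as a plain mapping.
OperatorClient = Callable[[dict], dict]


def operator_for(called_id: str) -> str:
    """Determine the operator from the called terminal's identifier."""
    for prefix, operator in OPERATOR_BY_PREFIX.items():
        if called_id.startswith(prefix):
            return operator
    raise LookupError(f"no operator known for identifier {called_id!r}")


def fetch_second_coding_info(called_id: str,
                             clients: Dict[str, OperatorClient]) -> dict:
    """Send an information acquisition request to the matching operator."""
    operator = operator_for(called_id)
    # The identifier is included so that an operator whose users have
    # different coding information can pick the right entry.
    request = {"called_id": called_id}
    return clients[operator](request)
```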
Step 304, receiving voice call information sent by the calling terminal or the called terminal.
During the voice call between the calling terminal and the called terminal, if the calling terminal produces voice, the voice call client in the calling terminal sends the voice call information to the background server, and the background server receives it accordingly; if the called terminal produces voice, the called terminal sends the voice call information to the operator, the operator forwards it to the background server, and the background server receives it accordingly.
Step 305, converting the voice call information into voice call information supported by the other terminal according to the first voice coding information and the second voice coding information.
After the background server receives the voice call information, it may convert the voice call information into voice call information supported by the other terminal.
For example, if the voice call information was sent by the calling terminal, the background server converts it into voice call information corresponding to the second voice coding information of the called terminal; if the voice call information was sent by the called terminal, the background server converts it into voice call information corresponding to the first voice coding information of the calling terminal.
It should be noted that, if the first voice coding information is the same as the second voice coding information, the background server does not need to perform any conversion and directly forwards the voice call information, which is not described herein again.
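The conversion rule of step 305, including the case in which no conversion is needed, can be sketched in Python as below. The function name `convert_for_peer`, the `"calling"`/`"called"` labels, and the `transcode` callable are hypothetical; the actual codec conversion is outside the scope of the sketch.

```python
from typing import Callable, Hashable


def convert_for_peer(payload: bytes,
                     sender: str,
                     first_info: Hashable,
                     second_info: Hashable,
                     transcode: Callable[[bytes, Hashable, Hashable], bytes]) -> bytes:
    """Convert payload into the form supported by the other terminal.

    sender is "calling" or "called" (labels assumed for this sketch).
    """
    if sender == "calling":
        source, target = first_info, second_info
    elif sender == "called":
        source, target = second_info, first_info
    else:
        raise ValueError(f"unknown sender {sender!r}")

    # Identical coding information at both ends: forward unchanged.
    if source == target:
        return payload
    return transcode(payload, source, target)
```

For example, with a `transcode` stand-in that just tags the payload, a packet from the calling terminal is converted from the first to the second coding information, while a call whose two ends share the same coding information passes packets through untouched.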
Step 306, the converted voice call information is sent to another terminal.
After the conversion, the background server may send the converted voice call information to another terminal.
After the other terminal receives the converted voice call information, it can parse the voice call information successfully, which ensures that the call proceeds normally.
Step 307, in the voice communication process, receiving a coding information update request sent by a target terminal, where the target terminal is a calling terminal or a called terminal, and the coding information update request carries updated voice coding information.
During a voice call, problems such as network delay, network jitter, or network packet loss may occur as the call network changes. To mitigate these problems, either party to the call can automatically update its own voice coding information and send a coding information update request to the background server. Correspondingly, the background server receives the coding information update request sent by the target terminal.
During the call, both parties can monitor the sound quality of the call in real time and obtain the voice coding information corresponding to the current sound quality according to a correspondence between sound quality and voice coding information. If the obtained voice coding information differs from the voice coding information currently in use, the party sends a coding information update request to the background server.
Since the coding type usually does not change, in practical implementation the voice coding information to be updated may be the coding parameters. When the coding parameters include the coding complexity, the sound quality is positively correlated with the coding complexity; when the coding parameters include the packet transmission interval, the sound quality is negatively correlated with the packet transmission interval; when the coding parameters include the sampling rate, the sound quality is positively correlated with the sampling rate. Optionally, sound quality within a certain range may correspond to the same voice coding information, which is not limited in this embodiment.
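A terminal-side sketch in Python of the correspondence between sound quality and coding parameters is given below. The quality score in the range 0 to 1, the band thresholds, and every parameter value are hypothetical; they only respect the stated correlations (higher quality goes with higher complexity and sampling rate and with a shorter packet interval) and the remark that a range of quality values may map to the same coding information.

```python
from typing import Dict, List, Optional, Tuple

Params = Dict[str, int]

# (lower bound of quality band, coding parameters); all values are assumptions.
QUALITY_BANDS: List[Tuple[float, Params]] = [
    (0.0, {"sampling_rate_hz": 8000, "complexity": 2, "packet_interval_ms": 60}),
    (0.5, {"sampling_rate_hz": 16000, "complexity": 5, "packet_interval_ms": 40}),
    (0.8, {"sampling_rate_hz": 24000, "complexity": 8, "packet_interval_ms": 20}),
]


def params_for_quality(quality: float) -> Params:
    """Return the parameters of the highest band whose bound is <= quality."""
    chosen = QUALITY_BANDS[0][1]
    for lower_bound, params in QUALITY_BANDS:
        if quality >= lower_bound:
            chosen = params
    return chosen


def build_update_request(quality: float, current: Params) -> Optional[dict]:
    """Return a coding information update request, or None if nothing changed."""
    wanted = params_for_quality(quality)
    if wanted == current:
        return None
    return {"updated_params": wanted}
```

Because whole bands share one parameter set, small fluctuations in measured quality inside a band do not trigger an update request.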
Step 308, updating the voice coding information corresponding to the target terminal according to the updated voice coding information.
After the background server receives the coding information update request, it updates the corresponding voice coding information. Thereafter, the background server may transcode according to the updated voice coding information, which is not described herein again in this embodiment.
It should be noted that step 307 and step 308 are optional and may or may not be performed in actual implementation. This embodiment only takes performing them after step 306 as an example; optionally, they may also be performed after step 302, which is not described herein again.
It should also be noted that, after the call ends, the calling terminal may send a call end instruction to the background server. After receiving the call end instruction, the background server deletes the previously received first voice coding information of the calling terminal and second voice coding information of the called terminal.
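The server-side bookkeeping for steps 307 and 308 and for the end of the call can be sketched in Python as follows. The class `CallSessions`, the `call_id` key, and the `"calling"`/`"called"` labels are hypothetical; the text does not say how the background server indexes the stored coding information.

```python
from typing import Dict


class CallSessions:
    """Illustrative store of per-call voice coding information (names assumed)."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, object]] = {}

    def open(self, call_id: str, first_info: object, second_info: object) -> None:
        self._sessions[call_id] = {"calling": first_info, "called": second_info}

    def update(self, call_id: str, target: str, updated_info: object) -> None:
        # Step 308: replace only the target terminal's coding information.
        if target not in ("calling", "called"):
            raise ValueError(f"unknown target terminal {target!r}")
        self._sessions[call_id][target] = updated_info

    def info(self, call_id: str) -> Dict[str, object]:
        # Return a copy so callers cannot mutate the stored state.
        return dict(self._sessions[call_id])

    def close(self, call_id: str) -> None:
        # Call end instruction: delete both pieces of coding information.
        self._sessions.pop(call_id, None)

    def __contains__(self, call_id: str) -> bool:
        return call_id in self._sessions
```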
In summary, in the voice call method provided in this embodiment, after receiving the voice call request sent by the calling terminal, the background server acquires the first voice coding information of the calling terminal and the second voice coding information of the called terminal. Then, on receiving voice call information sent by the calling terminal or the called terminal, it converts the voice call information, according to the first voice coding information and the second voice coding information, into voice call information supported by the other terminal, and sends the converted voice call information to the other terminal. This solves the prior-art problem of poor flexibility, in which transcoding for a new coding type is possible only after the voice call client has been updated: the background server transcodes directly according to the voice coding information of both ends of the call, without any update to the voice call client, which improves flexibility.
Meanwhile, the target terminal can update its corresponding voice coding information in the background server, so that each party to the call can successfully parse the voice call information received from the opposite end, and the call can proceed normally.
The above embodiment only takes as an example the case in which the voice call method is applied to a background server that is a single server. Optionally, the background server may also be a server cluster including an RTP server, a transcoding server, and a call server. In this case, referring to fig. 4A, the voice call method may include:
step 401, the call server receives a voice call request sent by a calling terminal.
After the voice call client in the calling terminal sends out the voice call request, the call server may correspondingly receive the voice call request. The voice call request carries first voice coding information of a calling terminal and an identifier of a called terminal. Optionally, the voice call client may send the voice call request through SIP (Session Initiation Protocol) signaling.
As shown in fig. 4B, the voice call client may access the call server through signaling. Accordingly, the call server receives the voice call request.
Step 402, the call server sends the first voice coding information carried in the voice call request to the RTP server.
Optionally, the call server may send the address of the RTP server to the calling terminal while sending the first voice coding information to the RTP server, so that the subsequent calling terminal may send the voice call information to the RTP server according to the address of the RTP server.
In step 403, the RTP server receives the first speech coding information.
Step 404, the call server obtains the second voice coding information from the operator according to the identifier of the called terminal carried in the voice call request.
This step is similar to step 303 in the above embodiment, and this embodiment is not described herein again.
In step 405, the call server synchronizes the second voice coding information to the RTP server.
In step 406, the RTP server receives the second voice coding information.
Step 407, the RTP server sends the first voice coding information and the second voice coding information to the transcoding server.
After the RTP server obtains the first speech coding information and the second speech coding information, the RTP server may send the first speech coding information and the second speech coding information to the transcoding server.
It should be noted that the above only takes as an example the case in which the RTP server acquires the first voice coding information first and the second voice coding information afterwards. Optionally, the RTP server may acquire the second voice coding information first and the first voice coding information afterwards, or acquire both at the same time, which is not limited in this embodiment.
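Since the two pieces of voice coding information may reach the RTP server in either order, one way to handle step 407 is to forward the pair only once both have arrived. The Python sketch below illustrates this; the class `PendingPair`, the `"first"`/`"second"` labels, and the `on_complete` callback standing in for "send to the transcoding server" are hypothetical.

```python
from typing import Callable, Dict


class PendingPair:
    """Collects the two coding infos in any order, then forwards them once."""

    def __init__(self, on_complete: Callable[[object, object], None]) -> None:
        self._infos: Dict[str, object] = {}
        self._on_complete = on_complete
        self._sent = False

    def receive(self, which: str, info: object) -> None:
        if which not in ("first", "second"):
            raise ValueError(f"unknown coding info label {which!r}")
        self._infos[which] = info
        # Forward exactly once, as soon as both pieces are available.
        if not self._sent and len(self._infos) == 2:
            self._sent = True
            self._on_complete(self._infos["first"], self._infos["second"])
```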
Step 408, the transcoding server feeds back the identification information to the RTP server.
After receiving the first voice coding information and the second voice coding information, the transcoding server uniquely allocates one piece of identification information to the first voice coding information and the second voice coding information, and feeds back the identification information to the RTP server. The identification information is used for uniquely identifying the corresponding relation between the first voice coding information and the second voice coding information.
In step 409, the RTP server receives the identification information fed back by the transcoding server.
In step 410, the RTP server receives voice call information sent by the calling terminal or the called terminal.
In the process of communication, the calling terminal or the called terminal can send voice communication information, and correspondingly, the RTP server can receive the voice communication information.
Specifically, when the calling terminal sends the voice call information, the voice call client in the calling terminal may directly send the voice call information to the RTP server. And when the called terminal sends the voice call information, the called terminal can send the voice call information to the RTP server through the operator.
In step 411, the RTP server sends the voice call information and the identification information to the transcoding server.
After the RTP server receives the voice call information, the RTP server may send the voice call information and the identification information to the transcoding server.
In step 412, the transcoding server converts the voice call information into the voice call information supported by the other terminal according to the identification information.
Step 413, the transcoding server sends the converted voice call information to the RTP server.
In step 414, the RTP server sends the converted voice call information to another terminal.
Step 415, after the call is ended, the RTP server sends a call end instruction to the transcoding server, where the call end instruction includes the identification information.
In step 416, the transcoding server deletes the first voice coding information and the second voice coding information corresponding to the identification information.
After receiving the call end instruction, the transcoding server extracts the identification information from it, deletes the first voice coding information and the second voice coding information corresponding to that identification information, and releases the storage space they occupied.
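Steps 408 to 416 can be pictured with a minimal sketch of the bookkeeping on the transcoding server. The class and field names (`CodingInfo`, `TranscodingServer`, `from_calling`) are hypothetical and do not come from this description, and the actual decoding and re-encoding of audio is replaced by a placeholder; only the allocation, lookup and deletion of the identification information is illustrated.

```python
import itertools
from dataclasses import dataclass


@dataclass
class CodingInfo:
    """Hypothetical voice coding information: codec name and sample rate."""
    codec: str
    sample_rate_hz: int


class TranscodingServer:
    """Keeps one (first, second) coding pair per call, keyed by an identifier."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._sessions = {}  # identification information -> (first, second)

    def register(self, first: CodingInfo, second: CodingInfo) -> int:
        # Step 408: allocate a unique identifier for the pair and return it.
        ident = next(self._next_id)
        self._sessions[ident] = (first, second)
        return ident

    def convert(self, ident: int, payload: bytes, from_calling: bool):
        # Step 412: look up the pair and choose the target coding by direction.
        # The direction flag is an assumption; the description does not say
        # how the sending end is indicated.
        first, second = self._sessions[ident]
        target = second if from_calling else first
        # Placeholder: a real server would decode and re-encode the payload.
        return target, payload

    def end_call(self, ident: int) -> None:
        # Step 416: delete the stored pair and release its storage.
        self._sessions.pop(ident, None)

    def active_sessions(self) -> int:
        return len(self._sessions)
```

In this sketch the RTP server would keep only the returned identifier and pass it along with every piece of voice call information, then pass it once more in the call end instruction.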
In addition, as in the above embodiment, the calling terminal or the called terminal may request an update of its own voice coding information. In this case, referring to fig. 4C, the voice call method may further include the following steps:
In step 417, the RTP server receives the coding information update request sent by the target terminal.
Optionally, when the target terminal is the calling terminal, the calling terminal sends the coding information update request to the call server through signaling access, and the call server forwards the request to the RTP server. When the target terminal is the called terminal, the called terminal likewise sends the coding information update request to the call server, which forwards it to the RTP server. In both cases, the RTP server receives the coding information update request forwarded by the call server.
In step 418, the RTP server forwards the coding information update request to the transcoding server.
In step 419, the transcoding server updates the voice coding information of the target terminal according to the updated voice coding information carried in the coding information update request.
Please refer to fig. 4D, which illustrates a schematic diagram of the update process of the speech coding information.
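The update in step 419 can be sketched as replacing one side of the stored pair while leaving the other side unchanged. The function below is illustrative only: the session table layout, the parameter names, and the assumption that the forwarded request is accompanied by the identification information are not stated in this description.

```python
def apply_update(sessions, ident, target_is_calling, updated_info):
    """Replace the target terminal's coding information for one call.

    `sessions` maps identification information to a (first, second) pair,
    where `first` belongs to the calling terminal and `second` to the
    called terminal. All names here are hypothetical.
    """
    first, second = sessions[ident]
    if target_is_calling:
        sessions[ident] = (updated_info, second)
    else:
        sessions[ident] = (first, updated_info)
    return sessions[ident]
```

Subsequent voice call information for the same identification information would then be converted using the updated pair.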
In summary, in the voice call method provided in this embodiment, after receiving the voice call request sent by the calling terminal, the background server obtains the first voice coding information of the calling terminal and the second voice coding information of the called terminal. When it then receives voice call information sent by the calling terminal or the called terminal, it converts that voice call information, according to the first voice coding information and the second voice coding information, into voice call information supported by the other terminal, and sends the converted voice call information to the other terminal. This solves the prior-art problem of poor flexibility, in which transcoding for a new coding type becomes possible only after the voice call client has been updated: the background server transcodes directly according to the voice coding information of the two ends of the call, without any update of the voice call client, which improves flexibility.
After receiving the first voice coding information and the second voice coding information, the transcoding server allocates identification information representing the correspondence between them and feeds the identification information back to the RTP server. Once the RTP server receives voice call information from one end, it only needs to send the voice call information and the identification information to the transcoding server for transcoding to take place. The first voice coding information and the second voice coding information need not be sent to the transcoding server every time, which reduces the transmission resources consumed.
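The saving can be illustrated with a toy comparison of two message layouts, one carrying the full pair of voice coding information and one carrying only the identification information. The JSON layout and every field name are assumptions made for the illustration; the description does not specify a message format.

```python
import json

# Hypothetical voice coding information for the two ends of a call.
first = {"codec": "opus", "sample_rate_hz": 48000, "bitrate_bps": 24000}
second = {"codec": "pcma", "sample_rate_hz": 8000, "bitrate_bps": 64000}


def message_with_pair(payload_hex):
    # Every message repeats both pieces of coding information.
    return json.dumps({"first": first, "second": second, "payload": payload_hex})


def message_with_id(ident, payload_hex):
    # Every message carries only the identification information.
    return json.dumps({"id": ident, "payload": payload_hex})


saved = len(message_with_pair("00ff")) - len(message_with_id(1, "00ff"))
```

Here `saved` is the number of characters avoided per message in this toy format; the real saving depends on how the coding information is actually serialized.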
Meanwhile, the target terminal can update its own voice coding information in the transcoding server, so that each party to the call can successfully parse the voice call information received from the opposite end and the call can proceed normally.
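Claim 1 states that a terminal monitors the call tone quality and sends the coding information update request when the voice coding information that corresponds to the current quality differs from the one currently in use. A minimal sketch of that terminal-side check follows; the quality levels, the coding strings in the table, and the request layout are all hypothetical.

```python
# Hypothetical correspondence between call tone quality and coding information.
QUALITY_TO_CODING = {
    "good": "opus/48000",
    "fair": "opus/16000",
    "poor": "amr/8000",
}


def maybe_build_update_request(current_coding, measured_quality,
                               table=QUALITY_TO_CODING):
    """Return an update request if the mapped coding differs, else None."""
    wanted = table[measured_quality]
    if wanted == current_coding:
        return None  # the coding in use already matches the current quality
    return {"type": "coding_info_update", "updated_coding": wanted}
```

A terminal would run such a check each time it re-measures quality and send the returned request, if any, toward the call server.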
Referring to fig. 5, which shows a schematic structural diagram of a voice call apparatus according to an embodiment of the present invention, the voice call apparatus may include: a receiving module 510, an obtaining module 520, a conversion module 530, and a sending module 540.
a receiving module 510, configured to receive a voice call request sent by a calling terminal, where the voice call request carries an identifier of a called terminal;
an obtaining module 520, configured to obtain first voice coding information of the calling terminal and second voice coding information of the called terminal;
the receiving module 510 is further configured to receive voice call information sent by the calling terminal or the called terminal;
a conversion module 530, configured to convert the voice call information into voice call information supported by the other terminal according to the first voice coding information and the second voice coding information;
a sending module 540, configured to send the voice call information converted by the conversion module 530 to the other terminal.
In summary, after receiving the voice call request sent by the calling terminal, the voice call apparatus provided in this embodiment obtains the first voice coding information of the calling terminal and the second voice coding information of the called terminal. When it then receives voice call information sent by the calling terminal or the called terminal, it converts that voice call information, according to the first voice coding information and the second voice coding information, into voice call information supported by the other terminal, and sends the converted voice call information to the other terminal. This solves the prior-art problem of poor flexibility, in which transcoding for a new coding type becomes possible only after the voice call client has been updated: the background server transcodes directly according to the voice coding information of the two ends of the call, without any update of the voice call client, which improves flexibility.
Based on the voice call apparatus provided in the foregoing embodiment, optionally, the obtaining module 520 is further configured to extract the first voice coding information carried in the voice call request.
Optionally, the obtaining module 520 is further configured to obtain the second voice coding information from an operator corresponding to the called terminal according to the identifier of the called terminal in the voice call request.
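The division into receiving, obtaining, conversion and sending modules, together with the two optional behaviours of the obtaining module, can be sketched as a single class. This is only an illustration under stated assumptions: the request layout (`coding`, `called_id`), the operator lookup callback and the delivery callback are hypothetical, and the conversion itself is a placeholder that merely labels the payload with the target coding.

```python
class VoiceCallApparatus:
    """Toy model of modules 510 to 540 for a single call."""

    def __init__(self, lookup_called_coding, deliver):
        self.lookup_called_coding = lookup_called_coding  # hypothetical operator query
        self.deliver = deliver                            # hypothetical transport callback
        self.first = None   # coding information of the calling terminal
        self.second = None  # coding information of the called terminal

    def receive_call_request(self, request):
        # Receiving module 510 hands the request to obtaining module 520:
        # the first coding information is carried in the request itself, the
        # second is fetched from the operator using the called terminal's id.
        self.first = request["coding"]
        self.second = self.lookup_called_coding(request["called_id"])

    def convert(self, payload, target_coding):
        # Conversion module 530 (placeholder, no real audio processing).
        return {"coding": target_coding, "data": payload}

    def receive_call_info(self, payload, from_calling):
        # Receiving module 510 -> conversion module 530 -> sending module 540.
        target = self.second if from_calling else self.first
        self.deliver(self.convert(payload, target), to_called=from_calling)
```

In practice the functions may be distributed across modules differently, as the description itself notes.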
Optionally, the apparatus is used in a backend server, and the backend server includes: a real-time transport protocol RTP module and a transcoding module;
the obtaining module 520 is further configured to:
acquiring the first voice coding information and the second voice coding information through the RTP module, and sending the first voice coding information and the second voice coding information to the transcoding module;
feeding back identification information to the RTP module through the transcoding module, wherein the identification information is used for uniquely identifying the corresponding relation between the first voice coding information and the second voice coding information;
the receiving module 510 is further configured to receive the voice call information through the RTP module;
the conversion module 530 is further configured to:
sending the voice call information and the identification information to the transcoding module through the RTP module;
and converting the voice call information into the voice call information supported by the other terminal through the transcoding module according to the identification information.
Optionally, the sending module 540 is further configured to send a call ending instruction to the transcoding server through the RTP server after the call is ended, where the call ending instruction includes the identification information;
the apparatus further comprises:
a deleting module, configured to delete, through the transcoding server, the first voice coding information and the second voice coding information corresponding to the identification information.
Optionally, the receiving module 510 is further configured to receive, during a voice call, a coding information update request sent by a target terminal, where the target terminal is the calling terminal or the called terminal, and the coding information update request carries updated voice coding information;
the apparatus further comprises:
an updating module, configured to update the voice coding information corresponding to the target terminal according to the updated voice coding information.
It should be noted that the RTP module in this embodiment may be formed as an RTP server, and the transcoding module may be formed as a transcoding server, which is not limited in this embodiment.
It should be noted that the voice call apparatus provided in the above embodiment is described using the above division into functional modules only as an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above. In addition, the voice call apparatus and the voice call method provided by the above embodiments belong to the same concept; the specific implementation process is detailed in the method embodiments and is not repeated here.
Referring to fig. 6, a schematic structural diagram of a server according to an embodiment of the present invention is shown. The server is used for implementing the voice call method provided in the above embodiment. Specifically:
The server 600 includes a Central Processing Unit (CPU) 601, a system memory 604 including a Random Access Memory (RAM) 602 and a Read Only Memory (ROM) 603, and a system bus 605 connecting the system memory 604 and the central processing unit 601. The server 600 also includes a basic input/output system (I/O system) 606, which facilitates the transfer of information between devices within the computer, and a mass storage device 607, which stores an operating system 613, application programs 614, and other program modules 615.
The basic input/output system 606 includes a display 608 for displaying information and an input device 609, such as a mouse or a keyboard, for a user to input information. The display 608 and the input device 609 are both connected to the central processing unit 601 through an input/output controller 610 connected to the system bus 605. The basic input/output system 606 may also include the input/output controller 610 for receiving and processing input from a number of other devices, such as a keyboard, a mouse, or an electronic stylus. Similarly, the input/output controller 610 may also provide output to a display screen, a printer, or another type of output device.
The mass storage device 607 is connected to the central processing unit 601 through a mass storage controller (not shown) connected to the system bus 605. The mass storage device 607 and its associated computer-readable media provide non-volatile storage for the server 600. That is, the mass storage device 607 may include a computer-readable medium (not shown) such as a hard disk or CD-ROM drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage media is not limited to the foregoing. The system memory 604 and mass storage device 607 described above may be collectively referred to as memory.
According to various embodiments of the present invention, the server 600 may also run on a remote computer that is connected through a network such as the Internet. That is, the server 600 may be connected to the network 612 through the network interface unit 611 connected to the system bus 605, or the network interface unit 611 may be used to connect to other types of networks or remote computer systems (not shown).
The memory further stores one or more programs configured to be executed by one or more processors. The one or more programs include instructions for performing the server-side method.
It should be understood that, as used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (12)

1. A voice call method, comprising:
receiving a voice call request sent by a calling terminal, wherein the voice call request carries an identifier of a called terminal, the calling terminal is a terminal using a network telephone, and the called terminal is a terminal in a public switched telephone network;
acquiring first voice coding information of the calling terminal and second voice coding information of the called terminal;
receiving voice call information sent by the calling terminal or the called terminal;
converting the voice call information into voice call information supported by another terminal according to the first voice coding information and the second voice coding information, wherein in the voice call process, a coding information updating request sent by the calling terminal or the called terminal is received, so that the voice coding information corresponding to the terminal sending the coding information updating request is updated according to the updated voice coding information carried in the coding information updating request, wherein the calling terminal or the called terminal monitors the call tone quality, and sends the coding information updating request when the voice coding information corresponding to the current call tone quality acquired according to the corresponding relationship between the call tone quality and the voice coding information is different from the currently used voice coding information;
and sending the converted voice call information to the other terminal.
2. The method of claim 1, wherein the acquiring the first voice coding information of the calling terminal comprises:
extracting the first voice coding information carried in the voice call request.
3. The method of claim 1, wherein the acquiring the second voice coding information of the called terminal comprises:
acquiring the second voice coding information from an operator corresponding to the called terminal according to the identifier of the called terminal in the voice call request.
4. The method of claim 1, wherein the method is used in a backend server, and wherein the backend server comprises: a real-time transport protocol (RTP) server and a transcoding server;
the acquiring the first voice coding information of the calling terminal and the second voice coding information of the called terminal includes:
acquiring the first voice coding information and the second voice coding information through the RTP server, and sending the first voice coding information and the second voice coding information to the transcoding server;
feeding back identification information to the RTP server through the transcoding server, wherein the identification information is used for uniquely identifying the corresponding relation between the first voice coding information and the second voice coding information;
the receiving the voice call information sent by the calling terminal or the called terminal includes:
receiving the voice call information through the RTP server;
the converting the voice call information into the voice call information supported by another terminal according to the first voice encoding information and the second voice encoding information includes:
sending the voice call information and the identification information to the transcoding server through the RTP server;
and converting the voice call information into the voice call information supported by the other terminal through the transcoding server according to the identification information.
5. The method of claim 4, further comprising:
after the call is finished, sending a call finishing instruction to the transcoding server through the RTP server, wherein the call finishing instruction comprises the identification information;
deleting the first voice coding information and the second voice coding information corresponding to the identification information through the transcoding server.
6. A voice call apparatus, comprising:
a receiving module, configured to receive a voice call request sent by a calling terminal, wherein the voice call request carries an identifier of a called terminal, the calling terminal is a terminal using a network telephone, and the called terminal is a terminal in a public switched telephone network;
an obtaining module, configured to obtain first voice coding information of the calling terminal and second voice coding information of the called terminal;
the receiving module is further configured to receive voice call information sent by the calling terminal or the called terminal;
a conversion module, configured to convert the voice call information into voice call information supported by another terminal according to the first voice coding information and the second voice coding information, where in a voice call process, a coding information update request sent by the calling terminal or the called terminal is received, so as to update, according to the updated voice coding information carried in the coding information update request, the voice coding information corresponding to the terminal that sent the coding information update request, where the calling terminal or the called terminal monitors call tone quality, and sends the coding information update request when the voice coding information corresponding to the current call tone quality obtained according to a correspondence between the call tone quality and the voice coding information is different from the currently used voice coding information;
and a sending module, configured to send the voice call information converted by the conversion module to the other terminal.
7. The apparatus of claim 6,
the obtaining module is further configured to extract the first voice coding information carried in the voice call request.
8. The apparatus of claim 6,
the obtaining module is further configured to obtain the second voice coding information from an operator corresponding to the called terminal according to the identifier of the called terminal in the voice call request.
9. The apparatus of claim 7, wherein the apparatus is used in a backend server, and the backend server comprises: a real-time transport protocol RTP module and a transcoding module;
the obtaining module is further configured to:
acquiring the first voice coding information and the second voice coding information through the RTP module, and sending the first voice coding information and the second voice coding information to the transcoding module;
feeding back identification information to the RTP module through the transcoding module, wherein the identification information is used for uniquely identifying the corresponding relation between the first voice coding information and the second voice coding information;
the receiving module is further configured to receive the voice call information through the RTP module;
the conversion module is further configured to:
sending the voice call information and the identification information to the transcoding module through the RTP module;
and converting the voice call information into the voice call information supported by the other terminal through the transcoding module according to the identification information.
10. The apparatus of claim 9,
the sending module is further configured to send a call ending instruction to the transcoding module through the RTP module after the call is ended, where the call ending instruction includes the identification information;
the apparatus further comprises:
a deleting module, configured to delete, through the transcoding module, the first voice coding information and the second voice coding information corresponding to the identification information.
11. A server, comprising a memory and a processor;
the memory stores at least one program executable by the processor to implement the voice call method according to any one of claims 1 to 5.
12. A computer-readable storage medium, in which a computer program is stored, the computer program being executable by a processor to implement the voice call method according to any one of claims 1 to 5.
CN201610539161.7A 2016-07-08 2016-07-08 Voice communication method and device Active CN106128468B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610539161.7A CN106128468B (en) 2016-07-08 2016-07-08 Voice communication method and device
PCT/CN2017/087317 WO2018006678A1 (en) 2016-07-08 2017-06-06 Voice call method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610539161.7A CN106128468B (en) 2016-07-08 2016-07-08 Voice communication method and device

Publications (2)

Publication Number Publication Date
CN106128468A CN106128468A (en) 2016-11-16
CN106128468B true CN106128468B (en) 2021-02-12

Family

ID=57283682

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610539161.7A Active CN106128468B (en) 2016-07-08 2016-07-08 Voice communication method and device

Country Status (2)

Country Link
CN (1) CN106128468B (en)
WO (1) WO2018006678A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106128468B (en) * 2016-07-08 2021-02-12 腾讯科技(深圳)有限公司 Voice communication method and device
CN108986828B (en) * 2018-08-31 2021-05-28 北京中兴高达通信技术有限公司 Call establishment method and device, storage medium and electronic device
CN113923065B (en) * 2021-09-06 2023-11-24 贵阳语玩科技有限公司 Cross-version communication method, system, medium and server based on chat room audio
CN114760273A (en) * 2022-04-14 2022-07-15 深圳震有科技股份有限公司 Voice forwarding method, system, server and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101547343A (en) * 2009-03-06 2009-09-30 深圳市融创天下科技发展有限公司 System and method for remote video monitoring
CN103414697A (en) * 2013-07-22 2013-11-27 中国联合网络通信集团有限公司 VOIP self-adaptation speech coding method and system and SIP server
CN103428284A (en) * 2013-08-07 2013-12-04 合肥迈腾信息科技有限公司 Cloud technology based on-board Internet phoning method
CN103916678A (en) * 2012-12-31 2014-07-09 中国移动通信集团广东有限公司 Multimedia data transcoding method, transcoding device and multimedia data play system
CN105491044A (en) * 2015-12-11 2016-04-13 中青冠岳科技(北京)有限公司 Instant voice messaging method and device based on mobile terminal

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6529602B1 (en) * 1997-08-19 2003-03-04 Walker Digital, Llc Method and apparatus for the secure storage of audio signals
CN1937663A (en) * 2006-09-30 2007-03-28 华为技术有限公司 Method, system and device for realizing variable voice telephone business
US20080310612A1 (en) * 2007-06-15 2008-12-18 Sony Ericsson Mobile Communications Ab System, method and device supporting delivery of device-specific data objects
JP2011166660A (en) * 2010-02-15 2011-08-25 Nec Access Technica Ltd Voice recording device, voice recording method, and voice recording program
CN103581129A (en) * 2012-07-30 2014-02-12 中兴通讯股份有限公司 Conversation processing method and device
CN104125138B (en) * 2013-04-28 2017-07-25 腾讯科技(深圳)有限公司 A kind of speech communication and device, system
CN105374359B (en) * 2014-08-29 2019-05-17 中国电信股份有限公司 The coding method and system of voice data
CN104580166B (en) * 2014-12-19 2018-08-31 大唐移动通信设备有限公司 A kind of method and apparatus based on the conversion of CSCF media coding formats
CN104994245A (en) * 2015-05-08 2015-10-21 小米科技有限责任公司 Conversation realization method and apparatus
CN106128468B (en) * 2016-07-08 2021-02-12 腾讯科技(深圳)有限公司 Voice communication method and device

Also Published As

Publication number Publication date
WO2018006678A1 (en) 2018-01-11
CN106128468A (en) 2016-11-16

Similar Documents

Publication Publication Date Title
US11196783B2 (en) Method, device, and system for facilitating group conference communication
CN106209592B (en) WeChat customer service system and customer service message interaction method thereof
CN106128468B (en) Voice communication method and device
CN110418098B (en) Method and device for starting video networking conference
CN106921613B (en) Method and system for signaling transmission
RU2012106659A (en) METHOD FOR TRANSFERRING A COMMUNICATION SESSION IN A TELECOMMUNICATION NETWORK OF THE FIRST CONNECTION TO THE SECOND CONNECTION
CN113099055A (en) Communication method, system, device, electronic equipment and storage medium
US9503583B2 (en) Peer-to-peer, internet protocol telephone system with proxy interface for configuration data
US8379629B2 (en) Data session handling
CN114979133B (en) Deployment method and device for converged communication cloud platform
CN104580247A (en) Information synchronization method and information synchronization device based on IMS multi-party calls
US8804936B2 (en) Shared media access for real time first and third party media control
US8983043B2 (en) Data communication
US11070665B2 (en) Voice over internet protocol processing method and related network device
CN111835674A (en) Communication method, communication device, first network element and communication system
CN107852577B (en) Supplementary service implementation method, terminal equipment and IMS server
CN111726762B (en) Method, device, equipment and storage medium for initiating MCPTT group call
WO2012052710A1 (en) Concurrent voice and data communication
US11178006B2 (en) Replacement of collaboration endpoints
WO2012052705A1 (en) Data communication
CN112637676B (en) Multimedia file processing method, system, communication device and readable storage medium
WO2016177235A1 (en) Method of interaction between smart terminal and multimedia terminal, and terminal, system and computer storage medium
GB2553725A (en) Data communication
CN106453265B (en) IP call scheduling method and system, IPPBX and server
CN116155867A (en) Communication method, system, electronic device and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant