WO2023197593A1 - Multimedia conference control method and device, and communication system - Google Patents

Multimedia conference control method and device, and communication system

Info

Publication number
WO2023197593A1
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
audio stream
audio
gateway
called
Prior art date
Application number
PCT/CN2022/131215
Other languages
English (en)
French (fr)
Inventor
廖涛
Original Assignee
华为云计算技术有限公司
Priority claimed from CN202210666906.1A (CN116962364A)
Application filed by 华为云计算技术有限公司
Publication of WO2023197593A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/10 Architectures or entities
    • H04L65/102 Gateways
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • the present application relates to the field of communication technology, and in particular to a multimedia conference control method and device, and a communication system.
  • Multimedia conferences refer to virtual conferences realized through communication technology, which can allow geographically dispersed individuals or groups to gather together and exchange information through graphics, sound and other methods.
  • When conducting multimedia conferences, some mobile phones, fixed-line terminals, etc. (hereinafter referred to as telephone terminals) often access the multimedia conference system through the telephone switching network.
  • The telephone switching network is the public switched telephone network (PSTN) or a private network based on a private branch exchange (PBX).
  • A call connection (for example, called call connection 1) is established between the telephone terminal and other terminals (for example, a conference terminal) in the multimedia conference system.
  • Call connection 1 includes the call connection between the telephone terminal and the telephone switching network and the call connection between the telephone switching network and the other terminal, and the media stream (e.g., audio stream) transmitted between the telephone terminal and the other terminal is forwarded through the telephone switching network.
  • If the telephone terminal connected to the multimedia conference system carries out other telephone services (such as answering new phone calls or making new calls), the telephone switching network puts call connection 1 between the telephone terminal and the other terminal on hold (that is, controls call connection 1 to be in the call hold state) and sends a called prompt tone (or call hold prompt tone) to the other terminal.
  • The other terminal plays the called prompt tone to alert the user of the other terminal.
  • the present application provides a multimedia conference control method and device, and a communication system, which helps to avoid the noise of the telephone terminal being called (such as the called prompt tone) from affecting the development of the multimedia conference.
  • the technical solutions of this application are as follows:
  • A multimedia conference control method includes: the target gateway receives an audio stream sent to the first terminal through the telephone switching system; the target gateway performs denoising processing on the audio stream, where the denoising processing removes from the audio stream the noise of the second terminal being called.
  • the noise when the second terminal is called may be a called prompt sound when the second terminal is called by a terminal other than the first terminal (for example, a third terminal).
  • the called prompt sound is noise for the first terminal.
  • After the target gateway receives the audio stream sent to the first terminal through the telephone switching system, the target gateway removes from the audio stream the noise of the second terminal being called. This prevents that noise from interfering with the first terminal and thus from affecting the progress of the multimedia conference.
  • Before the target gateway performs denoising processing on the audio stream, the method also includes: the target gateway determines that the second terminal is in a called state based on the characteristic parameters of the audio stream. For example, the target gateway determines that the second terminal is in a state of being called by a terminal other than the first terminal (for example, a third terminal) according to the characteristic parameters of the audio stream.
  • Because the target gateway can determine (that is, sense) from the characteristic parameters of the audio stream that the second terminal is in the called state, the target gateway can denoise the audio stream to remove the noise of the second terminal being called.
  • the characteristic parameters include at least one of the following: audio characteristics of the audio stream, and data characteristics of the data packets of the audio stream.
  • the characteristic parameters include audio characteristics of the audio stream, and the audio characteristics include voice segments contained in the audio stream.
  • The target gateway determining that the second terminal is in the called state based on the characteristic parameters of the audio stream includes: the target gateway compares the voice segment contained in the audio stream with a designated voice segment and thereby determines that the second terminal is in the called state, where the designated voice segment is used to describe that the second terminal is in the called state.
  • In one case, the target gateway compares the voice segment contained in the audio stream with the designated voice segment and determines that the voice segment contained in the audio stream includes the designated voice segment; the target gateway therefore determines that the second terminal is in the called state.
  • In another case, the target gateway compares the voice segment contained in the audio stream with the designated voice segment and determines that the similarity between the two is greater than a similarity threshold; the target gateway therefore determines that the second terminal is in the called state.
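The similarity-threshold comparison can be illustrated with a minimal sketch. The source does not say how voice segments are represented or which similarity measure is used; the sketch assumes each segment has already been reduced to a numeric feature vector and uses cosine similarity. The names `cosine_similarity` and `is_called_state` and the default threshold of 0.9 are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def is_called_state(stream_segments, designated_segment, threshold=0.9):
    """Return True if any segment of the audio stream is more similar to the
    designated voice segment than the similarity threshold (assumed measure)."""
    return any(
        cosine_similarity(segment, designated_segment) > threshold
        for segment in stream_segments
    )
```

A gateway following this sketch would call `is_called_state` on the segments extracted from each incoming audio stream and start denoising once it returns True.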
  • The method also includes: the target gateway sends the audio stream to an audio recognition device, which performs audio recognition on the audio stream to obtain the audio characteristics of the audio stream; the target gateway receives the audio characteristics of the audio stream sent by the audio recognition device.
  • Sending the audio stream to the audio recognition device allows the device to perform audio recognition on the audio stream and obtain its audio characteristics; the audio recognition device sending those audio characteristics to the target gateway allows the target gateway to obtain the audio characteristics of the audio stream.
  • The characteristic parameters include the data characteristics of the data packets of the audio stream.
  • The target gateway determining that the second terminal is in the called state according to the characteristic parameters of the audio stream includes: the target gateway determines that the second terminal is in the called state according to a first identifier contained in the data packets of the audio stream, where the first identifier is used to indicate that the second terminal is in the called state.
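A minimal sketch of the first-identifier check follows. The source does not specify where the first identifier is carried in a data packet; the sketch assumes a hypothetical flag bit in a per-packet `flags` field. `AudioPacket`, `FIRST_IDENTIFIER`, and both function names are illustrative assumptions, not part of any real protocol.

```python
from dataclasses import dataclass

# Hypothetical flag bit standing in for the "first identifier".
FIRST_IDENTIFIER = 0x01


@dataclass
class AudioPacket:
    payload: bytes
    flags: int = 0


def has_first_identifier(packet):
    """True if the packet carries the (assumed) first-identifier bit."""
    return bool(packet.flags & FIRST_IDENTIFIER)


def second_terminal_called(packets):
    """The gateway treats the second terminal as being in the called state
    as soon as any packet of the audio stream carries the first identifier."""
    return any(has_first_identifier(p) for p in packets)
```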
  • Before the target gateway performs denoising processing on the audio stream, the method also includes: the target gateway receives a target signaling message; the target gateway determines that the second terminal is in the called state based on the target signaling message containing designated signaling information, where the designated signaling information is used to indicate that the second terminal is in the called state.
  • Because the target gateway can determine (that is, sense) that the second terminal is in the called state based on the target signaling message containing the designated signaling information, it determines that the audio stream sent to the first terminal may contain the noise of the second terminal being called. The target gateway therefore performs denoising processing on the audio stream sent to the first terminal to remove the noise of the second terminal being called.
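The signaling-based check can be sketched as follows. The source does not name the signaling protocol or the form of the designated signaling information. As an assumption for illustration only, the sketch treats the signaling body as SDP text and treats the direction attributes `a=sendonly` and `a=inactive`, which are conventionally used in SDP offer/answer to put a media stream on hold, as the designated signaling information. The function name is hypothetical.

```python
# Assumed stand-ins for the "designated signaling information".
HOLD_ATTRIBUTES = {"a=sendonly", "a=inactive"}


def contains_designated_info(signaling_body):
    """Return True if any line of the signaling body is one of the
    assumed hold-indicating attributes."""
    return any(
        line.strip() in HOLD_ATTRIBUTES
        for line in signaling_body.splitlines()
    )
```

With this check, the gateway would mark the second terminal as being in the called state when a matching message arrives and clear the mark when a later message restores the normal direction.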
  • the denoising process includes: intercepting the audio stream. That is, the target gateway does not forward the audio stream to the first terminal.
  • Intercepting the audio stream prevents it from reaching the first terminal, so the first terminal does not play it and the noise of the second terminal being called in the audio stream does not interfere with the first terminal.
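Interception amounts to the gateway dropping packets instead of forwarding them. The sketch below assumes a hypothetical `send` callback that delivers one packet to the first terminal and a boolean that records whether the second terminal is currently in the called state; neither name comes from the source.

```python
def forward_audio(packets, second_terminal_called, send):
    """Forward packets to the first terminal unless the second terminal is
    in the called state, in which case every packet is intercepted.

    Returns the number of packets actually forwarded.
    """
    forwarded = 0
    for packet in packets:
        if second_terminal_called:
            continue  # intercept: do not forward to the first terminal
        send(packet)
        forwarded += 1
    return forwarded
```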
  • the denoising process includes: replacing data packets of the audio stream with silence packets, and sending the silence packets to the first terminal.
  • The silence packet meets either of the following conditions: it does not include audio data; or it includes audio data, but the audio data cannot trigger physical sound perception.
  • A silence packet is a data packet encapsulated according to the audio protocol and format, and the payload of the silence packet is empty.
  • An empty payload means that the silence packet does not include a payload, or that it includes a payload but the data in the payload is 0.
  • The silence packet plays without any sound and cannot induce physical sound perception.
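The two forms of empty payload can be sketched as below. The packet layout (`sequence`, `timestamp`, `payload`) is a hypothetical simplification, and the sketch assumes a codec in which all-zero payload bytes decode to silence (for example, linear PCM); other codecs may represent silence differently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AudioPacket:
    sequence: int
    timestamp: int
    payload: bytes


def to_silence_packet(packet, keep_length=True):
    """Build a silence packet that keeps the original header fields.

    keep_length=True  -> payload of the same size filled with zeros
    keep_length=False -> no payload at all
    """
    silent_payload = bytes(len(packet.payload)) if keep_length else b""
    return replace(packet, payload=silent_payload)
```

Keeping the sequence number and timestamp unchanged means the receiving terminal sees a continuous stream and simply plays nothing audible.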
  • Since the target gateway replaces the data packets of the audio stream with silence packets and sends the silence packets to the first terminal (that is, the target gateway does not send the audio stream to the first terminal), the audio stream does not reach the first terminal. The first terminal therefore does not play the audio stream, and the noise of the second terminal being called in the audio stream does not interfere with the first terminal.
  • the denoising process includes: adding a second identifier to the data packet of the audio stream and sending the data packet of the audio stream to the first terminal, where the second identifier is used to instruct the first terminal not to play the audio stream.
  • Since the target gateway adds the second identifier to the data packets of the audio stream and sends the data packets including the second identifier to the first terminal, the first terminal does not play the audio stream after receiving it, which prevents the noise of the second terminal being called in the audio stream from interfering with the first terminal.
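The second-identifier option splits the work between gateway and terminal: the gateway marks packets, and the first terminal skips playback of marked packets. The sketch below assumes a hypothetical flag bit `SECOND_IDENTIFIER` in a per-packet `flags` field; the source does not specify how the identifier is encoded.

```python
from dataclasses import dataclass

# Hypothetical flag bit standing in for the "second identifier".
SECOND_IDENTIFIER = 0x02


@dataclass
class AudioPacket:
    payload: bytes
    flags: int = 0


def mark_do_not_play(packet):
    """Gateway side: return a copy of the packet carrying the second identifier."""
    return AudioPacket(packet.payload, packet.flags | SECOND_IDENTIFIER)


def payloads_to_play(packets):
    """Terminal side: return only the payloads of packets that are not marked."""
    return [p.payload for p in packets if not (p.flags & SECOND_IDENTIFIER)]
```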
  • a second aspect provides a multimedia conference control device, including various modules for executing the method provided in the above-mentioned first aspect or any optional manner of the first aspect.
  • the modules can be implemented based on software, hardware, or a combination of software and hardware, and the modules can be arbitrarily combined or divided based on specific implementations.
  • In a third aspect, a multimedia conference control device is provided, including a memory and a processor; the memory is used to store a computer program; the processor is used to execute the computer program stored in the memory so that the control device performs the method provided in the above-mentioned first aspect or any optional manner of the first aspect.
  • In a fourth aspect, a communication system is provided, which includes a target gateway, a first terminal and a second terminal. The first terminal is communicatively connected to the target gateway, and the second terminal is communicatively connected to the target gateway through a telephone switching system. The target gateway includes a multimedia conference control device as provided in the above second or third aspect.
  • In a fifth aspect, a computer-readable storage medium is provided. A computer program is stored in the computer-readable storage medium; when the computer program is executed, the method provided in the above-mentioned first aspect or any optional manner of the first aspect is implemented.
  • In a sixth aspect, a computer program product is provided, which includes a program or code. When the program or code is executed, the method provided by the above-mentioned first aspect or any optional manner of the first aspect is implemented.
  • In a seventh aspect, a chip is provided, which includes programmable logic circuits and/or program instructions. When the chip runs, it is used to implement the method provided by the above-mentioned first aspect or any optional manner of the first aspect.
  • the communication system includes a target gateway, a first terminal, and a second terminal.
  • the first terminal is communicatively connected to the target gateway, and the second terminal communicates with the target gateway through a telephone switching system.
  • The target gateway performs denoising processing on the audio stream to remove the noise of the second terminal being called in the audio stream, which prevents that noise from interfering with the first terminal and thus from affecting the progress of the multimedia conference.
  • a call connection (for example, called call connection 1) is established between the second terminal and the first terminal.
  • On the basis that call connection 1 is established between the second terminal and the first terminal, the second terminal may, for various reasons, establish (or need to establish) call connection 2 with a third terminal, where the third terminal can be any terminal other than the first terminal. For example, the second terminal calls the third terminal, answers a phone call from the third terminal, or communicates with the third terminal.
  • In these cases the second terminal establishes call connection 2 with the third terminal, the telephone switching system puts call connection 1 on hold, and the telephone switching system sends the called prompt tone to the first terminal.
  • The called prompt tone is likely to interfere with the first terminal. Because it is generated as a result of the second terminal establishing the new call connection 2, the called prompt tone is noise for the first terminal and can be called the noise of the second terminal being called.
  • Since the target gateway removes the noise of the second terminal being called from the audio stream sent by the telephone switching system to the first terminal, it prevents that noise from interfering with the first terminal.
  • Figure 1 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • Figure 2 is a schematic diagram of another application scenario provided by the embodiment of the present application.
  • Figure 3 is a schematic diagram of yet another application scenario provided by the embodiment of the present application.
  • Figure 4 is a flow chart of a multimedia conference control method provided by an embodiment of the present application.
  • Figure 5 is a flow chart of another multimedia conference control method provided by an embodiment of the present application.
  • Figure 6 is a flow chart of a second terminal accessing a multimedia conference system provided by an embodiment of the present application.
  • Figure 7 is a flow chart of a second terminal in a called state provided by an embodiment of the present application.
  • Figure 8 is a flow chart of another second terminal in a called state provided by an embodiment of the present application.
  • Figure 9 is a flow chart of a second terminal canceling the called state provided by an embodiment of the present application.
  • Figure 10 is a schematic structural diagram of a multimedia conference control device provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of another multimedia conference control device provided by an embodiment of the present application.
  • the telephone switching network is PSTN or a private network based on PBX.
  • PSTN is a telecommunications network that provides telephone services to public users.
  • PSTN includes access systems, telephone switches, relays, etc.
  • PSTN is also called plain old telephone service (POTS).
  • PBX is a computer-based digital telephone exchange that can be connected to the public switched telephone network and is usually used by businesses.
  • the telephone switching network is also called a telephone switching system, a telephone access system, etc.
  • The telephone terminal communicates with the media gateway in the multimedia conference system through the Internet protocol multimedia subsystem (IMS) and the access gateway in turn.
  • The media gateway can encode and decode the audio stream from the telephone terminal or the IMS and send the audio stream to other terminals (such as conference terminals) in the multimedia conference system, so that those terminals can play the audio stream from the telephone terminal or from the IMS.
  • a call connection (for example, called call connection 1) is established between the telephone terminal and other terminals (for example, a conference terminal) in the multimedia conference system.
  • Call connection 1 includes the call connection between the telephone terminal and the telephone switching network and the call connection between the telephone switching network and the other terminal, and the media stream (e.g., audio stream) transmitted between the telephone terminal and the other terminal is forwarded through the telephone switching network.
  • 4G 4th generation mobile communication technology
  • 5G 5th generation mobile communication technology
  • If a telephone terminal connected to the multimedia conference system carries out other telephone services, such as answering phone calls from terminals outside the multimedia conference system or making calls to terminals outside the multimedia conference system, the telephone switching network puts call connection 1 between the telephone terminal and other terminals in the multimedia conference system on hold (that is, controls call connection 1 to be in the call hold state) and sends the called prompt tone to the other terminal. For example, the called prompt tone is "The user you dialed is currently on a call".
  • the other terminal will play the called prompt tone to alert the user of the other terminal.
  • the called prompt tone can easily cause interference to other terminals, affecting the development of multimedia conferences.
  • Call hold is a type of service that allows an established call connection (such as the aforementioned call connection 1) to be maintained: the transmission of the media stream (such as an audio stream) between the calling terminal and the called terminal is stopped, but the session resources are not released and the call connection is not removed.
  • the call connection can be restored (or reactivated) when the call hold ends or based on other needs.
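The call hold semantics described above (media stops, resources stay allocated, the connection can later be restored) can be modeled as a small state machine. This is an illustrative sketch only; the class, method, and attribute names are hypothetical and do not correspond to any real telephony API.

```python
from enum import Enum


class CallState(Enum):
    ACTIVE = "active"
    HELD = "held"
    RELEASED = "released"


class CallConnection:
    def __init__(self, name):
        self.name = name
        self.state = CallState.ACTIVE
        self.session_resources_allocated = True

    def hold(self):
        """Stop media transmission but keep the session resources."""
        if self.state is CallState.ACTIVE:
            self.state = CallState.HELD

    def resume(self):
        """Restore (reactivate) a held call connection."""
        if self.state is CallState.HELD:
            self.state = CallState.ACTIVE

    def release(self):
        """Remove the call connection and free its session resources."""
        self.state = CallState.RELEASED
        self.session_resources_allocated = False

    def can_transmit_media(self):
        return self.state is CallState.ACTIVE
```

The difference between `hold` and `release` is the point of the model: only `release` frees the session resources, so a held connection can return to the active state without a new call setup.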
  • The user of the telephone terminal can operate the mute key of the telephone terminal or control the local muting of the telephone terminal based on other muting methods provided by the multimedia conference system.
  • After the telephone terminal is muted locally, the telephone terminal does not send media streams to the multimedia conference system, but the call connection (for example, call connection 1) between the telephone terminal and other terminals in the multimedia conference system still exists.
  • When there is a new telephone call, the user of the telephone terminal may reason that the telephone terminal is already muted in the multimedia conference system and answer the new telephone call.
  • After the telephone terminal answers the new telephone call, the telephone terminal establishes a call connection (for example, call connection 2) with the calling party of the new telephone call.
  • The telephone switching network by default considers that the telephone terminal has two call connections. Since the telephone terminal has answered a new telephone call (that is, call connection 2 is established), the telephone switching network puts call connection 1 on hold and cyclically sends the called prompt tone (or call hold prompt tone) to other terminals in the multimedia conference system. Those terminals play the called prompt tone in a loop, seriously interfering with the progress of the multimedia conference.
  • Embodiments of the present application provide a multimedia conference control method, device and communication system.
  • the communication system includes a target gateway, a first terminal and a second terminal.
  • the first terminal is communicatively connected to the target gateway
  • the second terminal is communicatively connected to the target gateway through a telephone switching system.
  • The target gateway is a media gateway in a multimedia conference system, the first terminal is a conference terminal in the multimedia conference system, and the second terminal is a telephone terminal connected to the multimedia conference system.
  • After the target gateway receives the audio stream sent to the first terminal through the telephone switching system, the target gateway performs denoising processing on the audio stream to remove the noise of the second terminal being called in the audio stream; for example, the noise of the second terminal being called is the called prompt tone of the second terminal.
  • Since the target gateway removes the noise of the second terminal being called from the audio stream sent to the first terminal, it prevents that noise from interfering with the first terminal and thus from affecting the progress of the multimedia conference.
  • FIG. 1 shows a schematic diagram of an application scenario provided by an embodiment of the present application.
  • the application scenario provides a communication system, which includes a target gateway, a first terminal and a second terminal.
  • the first terminal is communicatively connected with the target gateway
  • the second terminal is communicatively connected with the target gateway through the telephone switching system.
  • the communication system includes a multimedia conference system and a telephone switching system, and both the target gateway and the first terminal can be located in the multimedia conference system.
  • the target gateway may receive the audio stream sent to the first terminal through the telephone switching system, and perform denoising processing on the audio stream to remove the noise of the second terminal being called in the audio stream.
  • the target gateway first determines that the second terminal is in a called state, and then removes the noise of the second terminal being called from the audio stream sent to the first terminal.
  • The "second terminal is in the called state" mentioned in this application refers to the state in which the second terminal has established a call connection with a terminal other than the first terminal and that call connection is activated. For example, "the second terminal is in the called state" may refer to a state in which the second terminal is called by a terminal other than the first terminal (for example, a third terminal, which may be any terminal other than the first terminal), or a state in which the second terminal is calling a terminal other than the first terminal (for example, a third terminal, which may be any terminal other than the first terminal).
  • A call connection (for example, called call connection 1) is established between the second terminal and the first terminal in the multimedia conference system.
  • The target gateway can determine whether the second terminal is in the called state (for example, whether the second terminal is being called by the third terminal) by determining whether call connection 1 between the second terminal and the first terminal is in the call hold state. For example, if the target gateway determines that call connection 1 between the second terminal and the first terminal is in the call hold state, the target gateway determines that the second terminal is in the called state; if the target gateway determines that call connection 1 is not in the call hold state, the target gateway determines that the second terminal is not in the called state.
  • the audio stream sent to the first terminal may be an audio stream sent from the second terminal to the first terminal, or may be an audio stream sent from the telephone switching system to the first terminal.
  • The telephone switching system can send the called prompt tone to the first terminal, in which case the audio stream sent to the first terminal may be the called prompt tone sent by the telephone switching system to the first terminal.
  • The communication system may include at least one conference terminal and at least one telephone terminal; the at least one conference terminal is communicatively connected with the target gateway, and the at least one conference terminal and the target gateway are both located in the multimedia conference system; the at least one telephone terminal is communicatively connected to the target gateway through the telephone switching system and can access the multimedia conference system through the telephone switching system.
  • When the at least one telephone terminal is a plurality of telephone terminals, the plurality of telephone terminals may be communicatively connected to the target gateway through one telephone switching system, or may be communicatively connected to the target gateway through at least two telephone switching systems.
  • the conference terminal refers to a terminal that accesses the multimedia conference system through a conference application (or conference client).
  • the conference terminal can be a mobile phone, a netbook, a laptop, a tablet, etc.
  • the telephone terminal refers to the terminal that accesses the multimedia conference system through the telephone switching system.
  • the telephone terminal can be a mobile phone, a fixed-line terminal, etc.
  • the telephone switching system can be PSTN or a private network based on PBX.
  • the telephone switching network is also called telephone switching system, telephone access system, etc.
  • the second terminal may be any one of the at least one telephone terminal, and the number of the first terminals may be one or more.
  • the at least one conference terminal may be the first terminal.
  • Multimedia conference systems usually include media gateways, signaling gateways, media servers, conference terminals and other equipment.
  • the media server can be a selective forwarding unit (SFU).
  • The target gateway described in this application may be a media gateway, or a gateway that integrates the functions of a media gateway, a signaling gateway, and a media server. For example, the target gateway is a media gateway that integrates the functions of both a signaling gateway and a media server.
  • the telephone switching system is communicatively connected to the target gateway through an access gateway, and the access gateway is used for the telephone switching system to access the target gateway, thereby allowing the telephone switching system to access the multimedia conference system.
  • the access gateway may be a PSTN access gateway or a PBX, which is not limited in this embodiment of the present application.
  • Figure 2 shows a schematic diagram of another application scenario provided by the embodiment of this application.
  • This application scenario is illustrated by taking as an example a media gateway that integrates the functions of a signaling gateway and a media server (that is, the target gateway is a media gateway that integrates the functions of both a signaling gateway and a media server).
  • The communication system provided by this application scenario includes a multimedia conference system, a telephone switching system 101, an access gateway 102 and at least one telephone terminal 103 (Figure 2 takes one telephone terminal 103 as an example).
  • The multimedia conference system includes a media gateway 104 and at least one conference terminal 105 (Figure 2 takes two conference terminals 105 as an example).
  • the at least one conference terminal 105 is communicatively connected with the media gateway 104.
  • the telephone terminal 103, the telephone switching system 101, and the access gateway 102 are connected in sequence.
  • the access gateway 102 is communicatively connected with the media gateway 104.
  • the access gateway 102 is used for the telephone switching system 101 to access the media gateway 104, so that the telephone terminal 103 communicatively connected to the telephone switching system 101 can access the multimedia conference system. The access gateway 102 is also used to forward media streams (e.g., audio streams) between the telephone switching system 101 and the media gateway 104.
  • the first terminal may be the conference terminal 105, and the second terminal may be the telephone terminal 103.
  • the communication system shown in FIG. 2 includes two first terminals.
  • the media gateway 104 can receive the audio stream sent to the conference terminal 105 through the telephone switching system 101 and the access gateway 102, and perform denoising processing on the audio stream to remove the noise of the telephone terminal 103 being called in the audio stream.
  • the media gateway 104 first determines that the telephone terminal 103 is in a called state (for example, determines that the telephone terminal 103 is in a state of being called by a terminal other than the conference terminal 105), and then performs denoising processing on the audio stream sent to the conference terminal 105 to remove the noise of the telephone terminal 103 being called in the audio stream.
  • the multimedia conference system also includes a conference management device 106.
  • the conference management device 106 is used to manage media resources (such as conference numbers) used in multimedia conferences, to control the access of telephone terminals, conference terminals, etc. to the multimedia conference, and to control the routing of media streams, the scheduling instructions for media streams, etc.
  • the conference management device 106 is communicatively connected to the media gateway 104, and the conference management device 106 can send detection indication information to the media gateway 104 (for example, the detection indication information includes the phone number and detection identification of the telephone terminal 103) to instruct the media gateway 104 to perform denoising processing on the media stream (e.g., audio stream), so as to remove the noise of the telephone terminal 103 being called in the audio stream.
  • the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state.
  • the call connection between the telephone terminal 103 and the conference terminal 105 may be in the call hold state because the telephone terminal 103 is carrying out another telephone service.
  • for example, the telephone terminal 103 receives a telephone call from another terminal, and the telephone terminal 103 is in the state of being called by this other terminal.
  • Figure 2 only shows the connection relationship between the conference management device 106 and the media gateway 104.
  • the conference management device 106 can also be connected to the conference terminal 105, the telephone terminal 103, etc.; which devices or terminals the conference management device 106 is specifically connected to can be set according to actual needs, which is not limited in the embodiments of this application.
  • the telephone switching system 101 can trigger the access gateway 102 to send a notification message to the media gateway 104.
  • the media gateway 104 determines, based on the notification message, that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state, thereby determining that the telephone terminal 103 is in the called state.
  • the notification message may be session initiation protocol (SIP) signaling, private protocol signaling, or a media data packet carrying a special identifier or special information.
  • the media gateway 104 determines that the telephone terminal 103 is in a called state based on characteristic parameters of the audio stream that is related to the telephone terminal 103 and sent to the conference terminal 105. For example, when the telephone terminal 103 is in a called state (for example, the telephone terminal 103 is in a state of being called by a terminal other than the conference terminal 105) and the call connection between the telephone terminal 103 and the conference terminal 105 is in a call hold state, the telephone switching system 101 can send the called prompt tone of the telephone terminal 103 (or the call hold prompt tone of the call connection between the telephone terminal 103 and the conference terminal 105) to the multimedia conference system; that is, the telephone switching system 101 can send an audio stream to the multimedia conference system. The media gateway 104 can determine, according to the characteristic parameters of that audio stream, that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state, thereby determining that the telephone terminal 103 is in the called state.
  • the characteristic parameters of the audio stream include at least one of the following: audio characteristics of the audio stream, and data characteristics of the data packets of the audio stream.
  • the media gateway 104 determines that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state according to the data characteristics of the data packet of the audio stream, thereby determining that the telephone terminal 103 is in the called state.
  • the media gateway 104 determines that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state based on the first identifier contained in the data packet of the audio stream, thereby determining that the telephone terminal 103 is in the called state.
  • the media gateway 104 determines that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state according to the audio characteristics of the audio stream, thereby determining that the telephone terminal 103 is in the called state. For example, when the audio stream includes the specified voice segment, the media gateway 104 determines that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state, thereby determining that the telephone terminal 103 is in the called state; or, when the similarity between a voice segment included in the audio stream and the specified voice segment is greater than the similarity threshold, the media gateway 104 determines that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state, thereby determining that the telephone terminal 103 is in the called state.
  • the communication system also includes an audio recognition device 107.
  • the audio recognition device 107 is communicatively connected to the media gateway 104.
  • the media gateway 104 can send, to the audio recognition device 107, the audio stream that is related to the telephone terminal 103 and destined for the conference terminal 105.
  • the audio recognition device 107 can perform audio recognition on the audio stream sent by the media gateway 104 to obtain the audio characteristics of the audio stream, and send those audio characteristics to the media gateway 104, so that the media gateway 104 can determine, based on the audio characteristics of the audio stream, that the telephone terminal 103 is in a called state.
  • the media gateway 104 decodes the audio stream into a raw audio bitstream file, usually a pulse code modulation (PCM) file, and sends the PCM file of the audio stream to the audio recognition device 107; the audio recognition device 107 performs audio recognition on the audio stream based on its PCM file.
  • the audio recognition device 107 can be an automatic speech recognition (ASR) device or another audio recognition device.
  • the media gateway 104 integrates the functions of the signaling gateway and the media server (that is, the media gateway 104 includes the functions of the media gateway, the signaling gateway and the media server).
  • the media gateway 104 includes a signaling module, a media module, an audio processing module, a storage module, etc.
  • the signaling module is used for signaling interaction between the media gateway 104 and the conference management device 106, the access gateway 102, etc. For example, the signaling module is used for the media gateway 104 to receive the scheduling signaling sent by the conference management device 106, to report the access status of the conference terminal 105, the telephone terminal 103, etc., and to exchange SIP signaling or private protocol signaling with the access gateway 102.
  • the media module is used for media interaction between the media gateway 104 and the conference terminal 105, the access gateway 102, etc. For example, based on the real-time transport protocol (RTP), the media module is used for the media gateway 104 to receive the audio stream sent by the access gateway 102 and send that audio stream to the conference terminal 105, and to receive the media stream sent by any conference terminal 105 and send that media stream to the access gateway 102 and the other conference terminals 105.
  • the audio processing module is used by the media gateway 104 to decode the audio stream sent to the conference terminal 105 into a PCM file, send the PCM file of the audio stream to the audio recognition device 107, receive the audio characteristics sent by the audio recognition device 107, and determine, according to those audio characteristics, that the call connection between the telephone terminal 103 and the conference terminal 105 is in the call hold state.
  • the storage module is used to store the specified audio characteristics (such as the specified voice segment or the similarity threshold); the specified audio characteristics can be stored in the storage module in the form of text.
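  • The forwarding behavior described for the media module can be sketched as a small routing function. This is only an illustration under stated assumptions, not the application's implementation: the function name `forward_targets` and the string peer identifiers are hypothetical, and the actual RTP transport is omitted.

```python
from typing import Iterable


def forward_targets(source: str, access_gateway: str,
                    conference_terminals: Iterable[str]) -> list[str]:
    """Return the peers that should receive a media stream arriving from `source`."""
    terminals = list(conference_terminals)
    if source == access_gateway:
        # Audio coming from the telephone side is sent to the conference terminals.
        return terminals
    # Media from one conference terminal is sent to the access gateway
    # and to every other conference terminal.
    return [access_gateway] + [t for t in terminals if t != source]
```

  • With one access gateway and two conference terminals, a stream from the access gateway is routed to both terminals, while a stream from one terminal is routed to the access gateway and the other terminal.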
  • the audio recognition device 107 may include an audio transceiver module and an audio recognition module.
  • the audio transceiver module is used for the audio recognition device 107 to receive the audio stream sent by the media gateway 104 and provide the audio stream to the audio recognition module for analysis and identification.
  • the audio recognition module is used to analyze and identify the received audio stream and obtain the audio characteristics of the audio stream.
  • the access gateway 102 may include a signaling module and a media module.
  • the signaling module is used for signaling interaction between the access gateway 102 and the telephone switching system 101, the media gateway 104, etc.
  • the media module is used for media interaction between the access gateway 102 and the telephone switching system 101, the media gateway 104, etc.; for example, this media interaction is based on RTP.
  • the division of the functional modules of the media gateway 104, audio recognition device 107, access gateway 102, etc. in this application is only exemplary.
  • the media gateway 104, audio recognition device 107, and access gateway 102 may also include other functional modules, which is not limited in the embodiments of this application.
  • Figure 2 takes the media gateway integrating the functions of the signaling gateway and the media server as an example.
  • at least two of the signaling gateway, the media gateway, and the media server are deployed independently.
  • FIG. 3 shows a schematic diagram of yet another application scenario provided by the embodiment of the present application.
  • This application scenario takes the signaling gateway, media gateway, and media server as three independent devices as an example.
  • the multimedia conference system also includes a signaling gateway 108 and a media server 109.
  • the signaling gateway 108 is connected to the access gateway 102 and the conference management device 106 respectively, and is used for signaling interaction between the access gateway 102 and the conference management device 106; the media server 109 is connected to the media gateway 104 and the conference terminal 105 respectively, and is used for media interaction between the media gateway 104 and the conference terminal 105.
  • the media gateway 104 may include a media module and an audio processing module, but not a signaling module. Signaling-related processing functions may be implemented by the signaling gateway 108, which is not limited in the embodiments of this application.
  • the application scenarios shown in Figures 1 to 3 are only examples and are not intended to limit the technical solution of the present application.
  • the number of conference terminals and telephone terminals can be configured as needed, and the application scenario may also include other devices, or fewer devices than those shown in Figures 1 to 3; the embodiments of the present application do not limit this.
  • FIG. 4 shows a flow chart of a multimedia conference control method provided by an embodiment of the present application.
  • the multimedia conference control method is applied to the target gateway.
  • the target gateway is the media gateway in Figure 2 or Figure 3.
  • the multimedia conference control method includes the following S401 to S402.
  • S401: the target gateway receives the audio stream A sent to the first terminal through the telephone switching system.
  • the target gateway is communicatively connected to the first terminal, and the target gateway is communicatively connected to the second terminal of the telephone switching system.
  • the target gateway can receive audio stream A sent to the first terminal through the telephone switching system.
  • Audio stream A is an audio stream related to the second terminal. Audio stream A may carry the identifier of the second terminal.
  • the audio stream A carries the stream number of the second terminal.
  • the stream number of the second terminal is the stream number assigned to the second terminal, for example, the stream number assigned by the conference management device to the second terminal; alternatively, it is the stream number determined through signaling negotiation among the target gateway, the second terminal, the telephone switching system, the access gateway, etc.
  • for example, the target gateway is the media gateway 104, the first terminal is the conference terminal 105, and the second terminal is the telephone terminal 103. The target gateway (i.e., the media gateway 104) receives, through the access gateway 102, the audio stream A sent to the conference terminal 105.
  • the audio stream A is related to the telephone terminal 103, and reaches the target gateway (i.e., the media gateway 104) through at least forwarding by the access gateway 102.
  • audio stream A may be an audio stream sent by the second terminal to the first terminal, or may be an audio stream sent by the telephone switching system to the first terminal. That is, the audio stream A may come from the second terminal or the telephone switching system. In other words, the audio stream A may be generated by the second terminal or the telephone switching system.
  • for example, a call connection 1 is established between the second terminal and the first terminal. The second terminal can collect an audio stream (for example, the voice of the user of the second terminal speaking, or sounds in the environment of the second terminal) and send the audio stream to the first terminal through the call connection 1. In this case, the audio stream A can be the audio stream sent by the second terminal to the first terminal.
  • as another example, a call connection 1 is established between the second terminal and the first terminal, and the second terminal is called by a terminal other than the first terminal (for example, a third terminal), so that the second terminal is in a called state. The telephone switching system holds the call connection 1 (that is, it controls the call connection 1 to be in the call hold state, which can also be called the deactivated state), and the telephone switching system can send the called prompt tone of the second terminal (or the call hold prompt tone of the call connection 1) to the first terminal; that is, the telephone switching system can send an audio stream to the first terminal. In this case, the audio stream A can be the audio stream sent by the telephone switching system to the first terminal.
  • when audio stream A is the audio stream sent by the telephone switching system to the first terminal while the second terminal is in the called state and the call connection 1 is in the call hold state, the data packet of audio stream A received by the target gateway may carry a first identifier, and the first identifier is used to indicate that the second terminal is in a called state.
  • the first identifier is used to indicate that the call connection 1 is in the call hold state, thereby indicating that the second terminal is in the called state.
  • the first identifier may be carried by the telephone switching system in the data packet of audio stream A, or may be added by the access gateway in the received data packet of audio stream A.
  • for example, the called prompt tone of the second terminal may be "the user you dialed is currently on a call", and the first identifier may be "holdflag", where holdflag is used to indicate that the call connection 1 is in the call hold state, thereby indicating that the second terminal is in the called state.
  • S402: the target gateway performs denoising processing on the audio stream A.
  • the denoising processing is to remove the noise of the second terminal being called in the audio stream A.
  • the audio stream A includes the called prompt tone of the second terminal.
  • the call connection 1 between the second terminal and the first terminal is in the call hold state.
  • the second terminal does not conduct a multimedia conference in the multimedia conference system (that is, there is no media transmission between the second terminal and the first terminal in the multimedia conference system), but the first terminal may still conduct a multimedia conference in the multimedia conference system.
  • Audio stream A reaches the first terminal and is played by the first terminal.
  • the called prompt sound of the second terminal in audio stream A is likely to interfere with the first terminal. Therefore, the called prompt sound can be called noise.
  • audio stream A is the called prompt sound of the second terminal, and audio stream A may be called noise for the first terminal.
  • the target gateway can perform denoising processing on the audio stream A to remove the noise of the second terminal being called in the audio stream A (that is, the called prompt tone of the second terminal, also called the call hold prompt tone of the call connection 1), which prevents that noise from interfering with the first terminal.
  • the target gateway performs denoising on audio stream A, including the following three possible implementation methods.
  • the first implementation method the target gateway intercepts audio stream A.
  • the target gateway does not forward audio stream A to the first terminal. For example, the target gateway discards the packets of audio stream A.
  • the target gateway intercepts audio stream A to prevent audio stream A from reaching the first terminal, thereby preventing the first terminal from playing audio stream A, and thereby preventing the noise of the second terminal being called in audio stream A from interfering with the first terminal.
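  • The first implementation method can be sketched as a filter that drops the packets of audio stream A instead of forwarding them. This is a minimal sketch under stated assumptions: the `AudioPacket` type, its `stream_id` field (standing in for the stream number of the second terminal), and the `held_streams` set are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Set


@dataclass
class AudioPacket:
    stream_id: str   # hypothetical: stream number identifying the sending terminal
    payload: bytes


def intercept(packets: Iterable[AudioPacket],
              held_streams: Set[str]) -> Iterator[AudioPacket]:
    """Yield only the packets that should still be forwarded to the first terminal."""
    for pkt in packets:
        if pkt.stream_id in held_streams:
            continue  # discard: this packet belongs to an intercepted audio stream
        yield pkt
```

  • Packets of streams not listed in `held_streams` pass through unchanged, so other participants' audio is unaffected.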
  • the second implementation method the target gateway replaces the data packet of audio stream A with a silence packet and sends the silence packet to the first terminal.
  • for example, the target gateway is the media gateway 104 and the first terminal is the conference terminal 105; the target gateway (i.e., the media gateway 104) sends the silence packet to the first terminal (i.e., the conference terminal 105) through the media server 109.
  • the silence packet satisfies either of the following conditions: it does not include audio data; or it includes audio data, but that audio data cannot trigger physical sound perception.
  • a silence packet is a data packet encapsulated according to the audio protocol and format, and the payload of the silence packet is empty.
  • an empty payload means either that the silence packet does not include a payload, or that the silence packet includes a payload in which the data is 0.
  • a silence packet produces no sound when played and cannot induce physical sound perception.
  • since the target gateway replaces the data packets of audio stream A with silence packets and sends the silence packets to the first terminal, the target gateway does not send audio stream A to the first terminal. This prevents audio stream A from reaching the first terminal and being played by it, and thereby prevents the noise of the second terminal being called in audio stream A from interfering with the first terminal.
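  • The second implementation method can be sketched as follows. The sketch assumes a simplified packet with hypothetical `sequence` and `timestamp` header fields and a linear PCM payload, for which all-zero bytes are silent; other codecs may encode silence differently. Both variants of an "empty" payload are shown: a payload whose data is 0, and no payload at all.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AudioPacket:
    sequence: int    # hypothetical header field, kept so the stream stays continuous
    timestamp: int   # hypothetical header field
    payload: bytes   # assumed to be linear PCM samples


def to_silence(pkt: AudioPacket, keep_length: bool = True) -> AudioPacket:
    """Return a copy of `pkt` with the same header fields and a silent payload."""
    # bytes(n) creates n zero bytes; b"" models a packet without a payload.
    payload = bytes(len(pkt.payload)) if keep_length else b""
    return replace(pkt, payload=payload)
```

  • Keeping the header fields unchanged is a design choice of this sketch, intended to let the receiving terminal see an uninterrupted stream.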
  • the third implementation method the target gateway adds a second identifier to the data packet of audio stream A, and sends the data packet of audio stream A containing the second identifier to the first terminal.
  • the second identifier is used to instruct the first terminal not to play audio stream A.
  • because the second identifier instructs the first terminal not to play audio stream A, the first terminal does not play audio stream A, which prevents the noise of the second terminal being called in audio stream A from interfering with the first terminal.
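  • The third implementation method involves both the gateway and the terminal, which can be sketched as below. The identifier value `"noplay"`, the `flags` field, and both function names are hypothetical; the text does not specify how the second identifier is encoded.

```python
from dataclasses import dataclass, field
from typing import Set

SECOND_IDENTIFIER = "noplay"  # hypothetical value for the second identifier


@dataclass
class AudioPacket:
    payload: bytes
    flags: Set[str] = field(default_factory=set)  # hypothetical identifier container


def mark_do_not_play(pkt: AudioPacket) -> AudioPacket:
    """Gateway side: add the second identifier before forwarding the packet."""
    pkt.flags.add(SECOND_IDENTIFIER)
    return pkt


def terminal_should_play(pkt: AudioPacket) -> bool:
    """Terminal side: play the packet only if the second identifier is absent."""
    return SECOND_IDENTIFIER not in pkt.flags
```

  • Unlike the first two methods, this one relies on the first terminal recognizing and honoring the identifier.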
  • after the target gateway receives the audio stream sent to the first terminal through the telephone switching system, the target gateway performs denoising processing on the audio stream to remove the noise of the second terminal being called in the audio stream. Therefore, the noise of the second terminal being called can be prevented from interfering with the first terminal, and thus from affecting the progress of the multimedia conference.
  • the target gateway may determine that the second terminal is in a called state.
  • the target gateway determines that the second terminal is in the called state, which may include the following two optional embodiments.
  • step S403a is also included.
  • the target gateway determines that the second terminal is in the called state based on the characteristic parameters of audio stream A.
  • the characteristic parameters of the audio stream A include at least one of the following: audio characteristics of the audio stream A, and data characteristics of the data packets of the audio stream A.
  • when the characteristic parameters of audio stream A include the data characteristics of the data packets of audio stream A, the target gateway determines that the second terminal is in the called state according to those data characteristics.
  • the target gateway determines that the second terminal is in the called state based on the data packet of the audio stream A containing the first identifier, and the first identifier is used to indicate that the second terminal is in the called state.
  • the first identifier is used to indicate that the call connection 1 between the second terminal and the first terminal is in the call hold state, thereby indicating that the second terminal is in the called state.
  • the target gateway can determine whether the data packet of audio stream A contains the first identifier. If the data packet of audio stream A contains the first identifier, the target gateway determines that the call connection 1 is in the call hold state, thereby determining that the second terminal is in the called state; if the data packet of audio stream A does not contain the first identifier, the target gateway determines that the call connection 1 is not in the call hold state, thereby determining that the second terminal is not in the called state.
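  • The check for the first identifier can be sketched as below. The value "holdflag" is the example given in this application; representing a packet as a dictionary with an `identifiers` field is an assumption made only for illustration.

```python
FIRST_IDENTIFIER = "holdflag"  # example value given in the text


def second_terminal_is_called(packet: dict) -> bool:
    """Return True if the packet of audio stream A carries the first identifier."""
    # "identifiers" is a hypothetical field listing identifiers attached to the packet.
    return FIRST_IDENTIFIER in packet.get("identifiers", ())
```

  • A packet without the field is treated the same as a packet without the identifier, that is, the call connection is considered not to be on hold.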
  • the characteristic parameters of audio stream A include audio characteristics of audio stream A, which audio characteristics include voice segments included in audio stream A.
  • the target gateway determines that the second terminal is in a called state based on the voice segments included in audio stream A.
  • the target gateway compares the voice segment contained in the audio stream A with the specified voice segment, and determines that the second terminal is in the called state based on the comparison result.
  • the target gateway compares the voice segment contained in audio stream A with the specified voice segment to determine whether the voice segment contained in audio stream A includes the specified voice segment. If it does, the target gateway determines that the second terminal is in the called state; if it does not, the target gateway determines that the second terminal is not in the called state.
  • alternatively, the target gateway compares the voice segment contained in audio stream A with the specified voice segment to determine whether the similarity between them is greater than the similarity threshold. If the similarity is greater than the similarity threshold, the target gateway determines that the second terminal is in the called state; if the similarity is not greater than the similarity threshold, the target gateway determines that the second terminal is not in the called state.
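  • Both comparisons can be sketched on recognized text, assuming the audio recognition device returns the voice segment as a string. The application does not specify a similarity measure; this sketch substitutes the ratio from Python's standard `difflib.SequenceMatcher`, and the default threshold of 0.8 is an arbitrary illustrative value.

```python
from difflib import SequenceMatcher


def is_called_by_voice(recognized: str, specified: str,
                       threshold: float = 0.8) -> bool:
    """Decide the called state from a recognized voice segment (as text)."""
    a = recognized.strip().lower()
    b = specified.strip().lower()
    if b in a:
        return True  # the recognized segment includes the specified segment
    # Otherwise compare similarity against the (assumed) threshold.
    return SequenceMatcher(None, a, b).ratio() > threshold
```

  • The similarity branch tolerates small recognition errors, such as a misspelled word, that would defeat an exact containment check.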
  • the specified voice segment is used to describe that the second terminal is in the called state.
  • for example, the specified voice segment is used to describe that the call connection 1 between the second terminal and the first terminal is in the call hold state, thereby describing that the second terminal is in the called state.
  • for example, the specified voice segment is "The user you are calling is currently on a call."
  • the target gateway before the target gateway determines that the second terminal is in the called state based on the audio characteristics of audio stream A, the target gateway obtains the audio characteristics of audio stream A. For example, the target gateway sends audio stream A to the audio recognition device, and receives the audio characteristics of audio stream A sent by the audio recognition device. The audio recognition device is used to perform audio recognition on audio stream A to obtain the audio characteristics of audio stream A.
  • the target gateway decodes audio stream A to obtain the raw audio bitstream file of audio stream A, such as a PCM file, and then sends the PCM file of audio stream A to the audio recognition device; the audio recognition device performs audio recognition on the PCM file to obtain the audio characteristics of audio stream A.
  • the audio recognition device performs audio recognition on the PCM file of audio stream A through an audio recognition model. That is, the audio recognition device can input the PCM file of audio stream A into the audio recognition model, and the audio recognition model performs computation on the PCM file to obtain and output the audio characteristics of audio stream A.
  • the target gateway receives the target signaling message.
  • the target signaling message is a signaling message related to the second terminal, and the target signaling message may carry the identity of the second terminal.
  • the target signaling message carries the phone number of the second terminal.
  • the target gateway, the access gateway, the telephone switching system, and the second terminal are connected in sequence, and the target gateway receives the target signaling message sent by the access gateway.
  • for example, the target gateway is the media gateway 104 and the second terminal is the telephone terminal 103; the media gateway 104 receives the target signaling message sent by the access gateway 102, and the target signaling message carries the phone number of the telephone terminal 103.
  • the target signaling message may be a SIP message or a signaling message based on a private protocol, which is not limited in this embodiment of the present application.
  • the target signaling message may be a negotiation message.
  • a call connection 1 is established between the second terminal and the first terminal in the multimedia conference system.
  • if the second terminal is engaged in another activity, for example if the second terminal is in the called state (that is, the second terminal is in the state of being called by another terminal), the call on the call connection 1 needs to be held, so the second terminal sends a first negotiation message to the telephone switching system to negotiate with the telephone switching system to hold the call connection 1.
  • the telephone switching system may send a second negotiation message to the access gateway according to the first negotiation message to negotiate with the access gateway to hold the call connection 1.
  • the access gateway may send a target signaling message to the target gateway according to the second negotiation message to negotiate with the target gateway to hold the call connection 1.
  • the target gateway can receive the target signaling message sent by the access gateway.
  • the first negotiation message, the second negotiation message and the target signaling message all carry designated signaling information and the identifier of the second terminal, to indicate that the second terminal is in a called state.
  • for example, the designated signaling information indicates that the calling party still has media to send, such as a voice prompt played to the calling party, while the call connection between the calling party and the called party is on call hold, to tell the calling party that the called party is in the called state (in other words, that the call connection between the calling party and the called party is in the call hold state).
  • here, the call connection between the second terminal and the first terminal is in the call hold state, and the second terminal is in the called state.
  • the first negotiation message, the second negotiation message, and the target signaling message may be the same signaling message, or they may be three different signaling messages. It can be understood that if they are the same signaling message and that signaling message comes from the second terminal, then after the telephone switching system and the access gateway each receive the signaling message, they can perform relevant processing according to the signaling message and forward it.
  • the target gateway determines that the second terminal is in the called state according to the target signaling message containing designated signaling information, and the designated signaling information is used to indicate that the second terminal is in the called state.
  • the target gateway determines whether the target signaling message contains the designated signaling information; if the target signaling message contains the designated signaling information, the target gateway determines that the second terminal is in the called state; if the target signaling message does not contain the designated signaling information, the target gateway determines that the second terminal is not in the called state.
  • the designated signaling information is used to indicate that the second terminal is in a called state.
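The presence check described above can be sketched as follows. This is only an illustration: the source does not say what the designated signaling information looks like on the wire, so the `SignalingMessage` type, its field names, and the `DESIGNATED_INFO` value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder value; the source does not specify the actual
# content of the designated signaling information.
DESIGNATED_INFO = "hold-with-prompt"


@dataclass
class SignalingMessage:
    terminal_id: str                        # identifier of the second terminal
    designated_info: Optional[str] = None   # None when the field is absent


def is_in_called_state(msg: SignalingMessage) -> bool:
    """Return True only if the message carries the designated signaling information."""
    return msg.designated_info == DESIGNATED_INFO
```

A message that carries only the terminal identifier yields `False`, which corresponds to the "not in the called state" branch.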
  • the embodiment shown in Figure 5 takes the access gateway notifying the target gateway that the second terminal is in the called state through the target signaling message as an example.
  • the access gateway may also notify the target gateway that the second terminal is in the called state through other methods.
  • the access gateway uses an interface callback or a publish-subscribe method to notify the target gateway that the second terminal is in the called state. That is, the access gateway can call the interface that communicates with the target gateway to notify the target gateway that the second terminal is in the called state, or, in the case where the target gateway subscribes to relevant notifications from the access gateway, the access gateway notifies the target gateway that the second terminal is in the called state, which is not limited in the embodiment of the present application.
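As a rough illustration of the publish-subscribe alternative, the sketch below lets a target gateway subscribe to called-state notifications from an access gateway. The class names, the topic string, and the callback signature are assumptions made for the example; the source does not define any such interface.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set

# Callback signature (terminal_id, is_called); hypothetical.
Callback = Callable[[str, bool], None]


class AccessGatewayNotifier:
    """Minimal in-process publish-subscribe hub on the access gateway side."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, terminal_id: str, is_called: bool) -> None:
        # Deliver the notification to every subscriber of the topic.
        for callback in self._subscribers[topic]:
            callback(terminal_id, is_called)


class TargetGateway:
    """Remembers which terminals are currently in the called state."""

    def __init__(self) -> None:
        self.called_terminals: Set[str] = set()

    def on_called_state(self, terminal_id: str, is_called: bool) -> None:
        if is_called:
            self.called_terminals.add(terminal_id)
        else:
            self.called_terminals.discard(terminal_id)
```

The interface-callback variant corresponds to the access gateway invoking `on_called_state` directly instead of going through `publish`.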
  • when call connection 1 is established between the second terminal and the first terminal in the multimedia conference system and call connection 1 is in the active state, if the second terminal, for any of various possible reasons, establishes call connection 2 with an external terminal (such as a third terminal), then the second terminal is in the called state (for example, the state of being called by the third terminal), and call connection 1 between the second terminal and the first terminal is put on hold (or deactivated). When the second terminal and the third terminal disconnect call connection 2, or call connection 2 is put on hold, the second terminal can cancel the called state (for example, cancel the state of being called by the third terminal), and call connection 1 can be reactivated at this time so that the second terminal and the first terminal can perform media transmission through call connection 1.
  • when the second terminal cancels the called state, the second terminal sends a third negotiation message to the telephone switching system to negotiate with the telephone switching system to cancel the call hold state of call connection 1 between the second terminal and the first terminal; after the telephone switching system receives the third negotiation message, it sends a fourth negotiation message to the access gateway according to the third negotiation message, to negotiate with the access gateway to cancel the call hold state of call connection 1; after the access gateway receives the fourth negotiation message, it sends a fifth negotiation message to the target gateway according to the fourth negotiation message, to negotiate with the target gateway to cancel the call hold state of call connection 1.
  • the target gateway determines according to the fifth negotiation message that the second terminal cancels the call hold state of the call connection 1 between the second terminal and the first terminal, thereby determining that the second terminal cancels the called state.
  • the third negotiation message, the fourth negotiation message and the fifth negotiation message all carry the identifier of the second terminal and do not carry the designated signaling information, to indicate that the second terminal cancels the call hold state of call connection 1 between the second terminal and the first terminal, thereby indicating that the second terminal cancels the called state.
  • the second terminal can collect an audio stream (for example, called audio stream B) and send audio stream B to the first terminal through call connection 1.
  • the target gateway can determine that the second terminal cancels the called state (or determines that the second terminal is not in the called state) based on the characteristic parameters of audio stream B.
  • the target gateway can forward audio stream B to the first terminal.
  • the characteristic parameters of audio stream B include at least one of the following: audio characteristics of audio stream B and data characteristics of the data packets of audio stream B.
  • the target gateway may determine that the second terminal cancels the called state based on the fact that the data packets of audio stream B do not contain the first identifier; alternatively, the target gateway can determine that the second terminal cancels the called state based on the fact that the voice fragments contained in audio stream B do not include the specified voice fragment; or, the target gateway can determine that the second terminal cancels the called state if the similarity between the voice fragments contained in audio stream B and the specified voice fragment is not greater than the similarity threshold.
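Two of the checks on audio stream B, the data-packet check and the similarity check, can be sketched as below. The packet layout, the use of feature vectors, the cosine similarity measure, and the default threshold of 0.8 are all assumptions for illustration; the source does not specify how the first identifier is encoded or how similarity is computed.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class AudioPacket:
    payload: bytes
    first_identifier: Optional[str] = None  # hypothetical per-packet marker


def no_first_identifier(packets: Sequence[AudioPacket]) -> bool:
    """Data-characteristic check: no packet carries the first identifier."""
    return all(p.first_identifier is None for p in packets)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def below_similarity_threshold(
    fragment_features: Sequence[float],
    prompt_features: Sequence[float],
    threshold: float = 0.8,  # assumed value
) -> bool:
    """Audio-characteristic check: similarity is not greater than the threshold."""
    return cosine_similarity(fragment_features, prompt_features) <= threshold
```

Either check returning `True` would, under these assumptions, let the gateway conclude that the second terminal has cancelled the called state.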
  • the multimedia conference system also includes a conference management device.
  • the conference management device is communicatively connected with the target gateway (such as a media gateway).
  • the conference management device can send detection indication information to the target gateway, to instruct the target gateway to detect the audio stream that is sent to the first terminal and is related to the second terminal.
  • the target gateway can determine whether the second terminal is in a called state based on the characteristic parameters of the audio stream related to the second terminal.
  • the detection indication information includes the identification of the second terminal and the detection identification to instruct the target gateway to detect the audio stream related to the second terminal.
  • the detection indication information can also instruct the target gateway in other ways to detect the audio stream related to the second terminal, which is not limited here.
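One possible way for the target gateway to act on the detection indication information is sketched below. Modelling the detection identification as a boolean flag, and the names `DetectionIndication` and `StreamMonitor`, are assumptions for the example only.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class DetectionIndication:
    terminal_id: str     # identification of the second terminal
    detect: bool = True  # detection identification, modelled as a flag (assumption)


class StreamMonitor:
    """Tracks which terminals' audio streams the target gateway should inspect."""

    def __init__(self) -> None:
        self._watched: Set[str] = set()

    def handle(self, indication: DetectionIndication) -> None:
        if indication.detect:
            self._watched.add(indication.terminal_id)
        else:
            self._watched.discard(indication.terminal_id)

    def should_inspect(self, terminal_id: str) -> bool:
        return terminal_id in self._watched
```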
  • The following takes the target gateway being a media gateway as an example and describes the technical solution of this application in conjunction with the interaction between the different devices in Figure 2.
  • a call connection 1 is established between the second terminal and the first terminal (such as the conference terminal 105) in the multimedia conference system.
  • When call connection 1 is active, the second terminal and the first terminal perform media transmission through call connection 1.
  • When call connection 1 is in the active state, if the second terminal carries out another telephone service (such as answering a phone call from a third terminal), the second terminal is in the called state (that is, in the state of being called by the third terminal).
  • the second terminal establishes a call connection 2 with the third terminal, and the telephone switching system puts the call connection 1 on hold so that the call connection 1 is in a call hold state.
  • the target gateway (i.e., the media gateway 104) can perform denoising processing on the media stream sent to the first terminal to remove the noise that results from the second terminal being called.
  • the call connection 2 between the second terminal and the third terminal is disconnected or the call is held.
  • the second terminal cancels the called state and can reactivate call connection 1 between the second terminal and the first terminal, so that the second terminal and the first terminal can conduct media transmission through call connection 1. Therefore, the technical solution of the embodiment of the present application involves the stage in which the second terminal accesses the multimedia conference system, the stage in which the second terminal is in the called state (in other words, the stage in which call connection 1 between the second terminal and the first terminal is in the call hold state), and the stage in which the second terminal cancels the called state (or the stage in which call connection 1 between the second terminal and the first terminal is in the active state).
  • FIG. 6 shows a flow chart of a second terminal accessing a multimedia conference system according to an embodiment of the present application.
  • the process of the second terminal accessing the multimedia conference system includes the following steps S601 to S614.
  • the conference management device instructs the target gateway to connect the second terminal to the multimedia conference system.
  • the conference management device sends access instruction information to the target gateway to instruct the target gateway to access the second terminal to the multimedia conference system.
  • the access indication information may include an identity of the second terminal.
  • the access indication information may include a phone number of the second terminal or other identification information used to indicate the second terminal.
  • the conference management device sends access indication information to the target gateway through SIP signaling or private protocol signaling.
  • the target gateway sends call request 1 to the access gateway according to the instruction of the conference management device.
  • the target gateway determines to access the second terminal to the multimedia conference system according to the instruction of the conference management device. Therefore, the target gateway sends call request 1 to the access gateway to request the access gateway to call the second terminal.
  • the call request 1 includes the identity of the second terminal, and the call request 1 may be SIP signaling or private protocol signaling.
  • the access gateway sends call ring response 1 corresponding to call request 1 to the target gateway.
  • After the access gateway receives call request 1, the access gateway determines to call the second terminal according to call request 1, and sends call ring response 1 corresponding to call request 1 to the target gateway to inform the target gateway that the access gateway is about to call the second terminal and that the target gateway should wait for a subsequent response.
  • the call ring response 1 may include the identification of the second terminal, for example, include the phone number of the second terminal.
  • the call ring response 1 may also include the signaling content of the call ring. For example, the signaling content of the call ring is "180".
  • Call Ring Response 1 can be SIP signaling or proprietary protocol signaling.
  • the access gateway sends call request 2 to the telephone switching system according to call request 1.
  • After the access gateway determines to call the second terminal, the access gateway sends call request 2 to the telephone switching system according to call request 1 to request the telephone switching system to call the second terminal.
  • the call request 2 includes the identification of the second terminal, for example, the phone number of the second terminal.
  • Call request 2 may be SIP signaling or signaling of a private protocol.
  • the telephone switching system sends the call ring response 2 corresponding to the call request 2 to the access gateway.
  • After the telephone switching system receives call request 2, the telephone switching system determines to call the second terminal according to call request 2, and sends call ring response 2 corresponding to call request 2 to the access gateway to inform the access gateway that the telephone switching system is about to call the second terminal and that the access gateway should wait for a subsequent response.
  • the call ring response 2 may include the identification of the second terminal.
  • the call ring response 2 may also include the signaling content of the call ring. For example, the signaling content of the call ring is "180".
  • Call Ring Response 2 may be SIP signaling or proprietary protocol signaling.
  • the telephone switching system sends call request 3 to the second terminal according to call request 2.
  • After the telephone switching system determines to call the second terminal, the telephone switching system sends call request 3 to the second terminal according to call request 2 to call the second terminal.
  • the call request 3 includes the identification of the second terminal, for example, the phone number of the second terminal.
  • Call request 3 may be SIP signaling or proprietary protocol signaling.
  • the second terminal sends the call ring response 3 corresponding to the call request 3 to the telephone switching system.
  • After the second terminal receives call request 3, the second terminal sends call ring response 3 corresponding to call request 3 to the telephone switching system to inform the telephone switching system to wait for a subsequent response.
  • the second terminal may also ring according to the call request 3 to prompt the user of the second terminal to answer the phone call.
  • the call ring response 3 includes the identification of the second terminal, for example, the phone number of the second terminal.
  • the call ringing response 3 may also include the signaling content of the call ringing, for example, the signaling content of the call ringing is "180".
  • the call ring response 3 may be SIP signaling or proprietary protocol signaling.
  • the second terminal sends the call connection response 3 corresponding to the call request 3 to the telephone switching system.
  • the second terminal may send call connection response 3 corresponding to call request 3 to the telephone switching system to inform the telephone switching system that the second terminal has answered the telephone call from the telephone switching system.
  • the call connection response 3 may include the identification of the second terminal, and may also include the signaling content of the call connection. For example, the signaling content of the call connection is "100".
  • the call connection response 3 may be SIP signaling or signaling of a proprietary protocol.
  • the telephone switching system sends the connection confirmation response 3 corresponding to the call connection response 3 to the second terminal.
  • After the telephone switching system receives call connection response 3, the telephone switching system sends connection confirmation response 3 corresponding to call connection response 3 to the second terminal to inform the second terminal that the telephone switching system has received call connection response 3.
  • the connection confirmation response 3 may be SIP signaling or signaling of a private protocol.
  • the telephone switching system sends the call connection response 2 corresponding to the call request 2 to the access gateway.
  • After the telephone switching system receives call connection response 3, the telephone switching system sends call connection response 2 corresponding to call request 2 to the access gateway according to call connection response 3, to inform the access gateway that the telephone switching system has answered the telephone call from the access gateway.
  • the call connection response 2 may be SIP signaling or signaling of a proprietary protocol.
  • the access gateway sends the connection confirmation response 2 corresponding to the call connection response 2 to the telephone switching system.
  • After the access gateway receives call connection response 2, the access gateway sends connection confirmation response 2 corresponding to call connection response 2 to the telephone switching system to inform the telephone switching system that the access gateway has received call connection response 2.
  • the connection confirmation response 2 may be SIP signaling or signaling of a private protocol.
  • the access gateway sends the call connection response 1 corresponding to the call request 1 to the target gateway.
  • After the access gateway receives call connection response 2, the access gateway sends call connection response 1 corresponding to call request 1 to the target gateway according to call connection response 2, to inform the target gateway that the access gateway has answered the telephone call from the target gateway.
  • the call connection response 1 may be SIP signaling or signaling of a proprietary protocol.
  • the target gateway sends the connection confirmation response 1 corresponding to the call connection response 1 to the access gateway.
  • After the target gateway receives call connection response 1, the target gateway sends connection confirmation response 1 corresponding to call connection response 1 to the access gateway to inform the access gateway that the target gateway has received call connection response 1.
  • the connection confirmation response 1 may be SIP signaling or signaling of a private protocol.
  • the call connection 1 is successfully established between the second terminal and the first terminal, and the second terminal successfully accesses the multimedia conference system.
  • the call connection 1 includes a call connection 11 between the telephone switching system and the second terminal, a call connection 12 between the access gateway and the telephone switching system, a call connection 13 between the target gateway and the access gateway, and a call connection 10 between the first terminal and the target gateway.
  • the call connection 10 between the first terminal and the target gateway is established between the first terminal and the target gateway when the first terminal accesses the multimedia conference system.
  • the target gateway notifies the conference management device that the second terminal has successfully accessed the multimedia conference system.
  • the target gateway sends the access result of the second terminal to the conference management device, so as to notify the conference management device that the second terminal has successfully accessed the multimedia conference system.
  • the access result of the second terminal may be "access successful”.
  • the second terminal and the first terminal in the multimedia conference system can transmit the audio stream through the call connection 1. That is, the second terminal can transmit the audio stream to the first terminal through the call connection 1, and the first terminal can also transmit the audio stream to the second terminal through the call connection 1.
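The access procedure above is a hop-by-hop exchange, which the following sketch lists as data so the ordering is easy to inspect. The message names follow the description; the mapping of list positions to step numbers S601 to S614 is an assumption based on the order in which the steps are described, and the `Hop` type is introduced only for illustration.

```python
from typing import List, NamedTuple


class Hop(NamedTuple):
    sender: str
    receiver: str
    message: str


CM = "conference management device"
TG = "target gateway"
AG = "access gateway"
TS = "telephone switching system"
T2 = "second terminal"


def access_flow() -> List[Hop]:
    """Messages exchanged while the second terminal joins the conference."""
    return [
        Hop(CM, TG, "access indication information"),
        Hop(TG, AG, "call request 1"),
        Hop(AG, TG, "call ring response 1"),
        Hop(AG, TS, "call request 2"),
        Hop(TS, AG, "call ring response 2"),
        Hop(TS, T2, "call request 3"),
        Hop(T2, TS, "call ring response 3"),
        Hop(T2, TS, "call connection response 3"),
        Hop(TS, T2, "connection confirmation response 3"),
        Hop(TS, AG, "call connection response 2"),
        Hop(AG, TS, "connection confirmation response 2"),
        Hop(AG, TG, "call connection response 1"),
        Hop(TG, AG, "connection confirmation response 1"),
        Hop(TG, CM, "access result"),
    ]


if __name__ == "__main__":
    # Assumed numbering: the first hop is S601, the last is S614.
    for step, hop in enumerate(access_flow(), start=601):
        print(f"S{step}: {hop.sender} -> {hop.receiver}: {hop.message}")
```

Each request travels toward the second terminal one hop at a time, and each connection response is acknowledged by the hop that receives it.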
  • FIG. 7 shows a flow chart in which the second terminal is in a called state according to an embodiment of the present application.
  • Figure 7 mainly introduces the process of the second terminal entering the called state and the processing process of the audio stream related to the second terminal after the second terminal enters the called state.
  • Figure 7 is described taking as an example the target gateway determining, based on a signaling message, that the second terminal is in the called state.
  • the process in which the second terminal is in the called state includes the following steps S701 to S715.
  • the second terminal sends a renegotiation request 1 to the telephone switching system.
  • a call connection 1 is established between the second terminal and the first terminal in the multimedia conference system.
  • When call connection 1 is active, if the second terminal establishes call connection 2 with a terminal other than the first terminal because of another telephone service (for example, the second terminal answers a phone call from a third terminal), the second terminal is in the called state, and the second terminal can put call connection 1 on hold.
  • the second terminal sends a renegotiation request 1 to the telephone switching system to negotiate with the telephone switching system to put call connection 1 on hold.
  • the renegotiation request 1 may be SIP signaling or private protocol signaling.
  • the renegotiation request 1 may include the identity of the second terminal and may also include designated signaling information.
  • the telephone switching system sends a renegotiation request 2 to the access gateway based on the renegotiation request 1.
  • After the telephone switching system receives renegotiation request 1, the telephone switching system determines to hold call connection 1 between the second terminal and the first terminal according to renegotiation request 1, and sends renegotiation request 2 to the access gateway according to renegotiation request 1 to negotiate with the access gateway to put call connection 1 on hold.
  • renegotiation request 2 is SIP signaling or private protocol signaling, and renegotiation request 2 and renegotiation request 1 are the same signaling.
  • the access gateway sends a renegotiation request 3 to the target gateway based on the renegotiation request 2.
  • After the access gateway receives renegotiation request 2, the access gateway determines to hold call connection 1 between the second terminal and the first terminal based on renegotiation request 2, and sends renegotiation request 3 to the target gateway based on renegotiation request 2 to negotiate with the target gateway to put call connection 1 on hold.
  • renegotiation request 3 is SIP signaling or private protocol signaling, and renegotiation request 3 and renegotiation request 2 are the same signaling.
  • the target gateway sends the renegotiation response 3 corresponding to the renegotiation request 3 to the access gateway.
  • After the target gateway receives renegotiation request 3, the target gateway determines to hold call connection 1 between the second terminal and the first terminal according to renegotiation request 3, and sends renegotiation response 3 corresponding to renegotiation request 3 to the access gateway.
  • When call connection 1 between the second terminal and the first terminal is in the call hold state (that is, when the second terminal is in the called state), the first terminal only receives media streams and does not send media streams.
  • the renegotiation response 3 may be SIP signaling or private protocol signaling.
  • the access gateway sends the renegotiation response 2 corresponding to the renegotiation request 2 to the telephone switching system.
  • After the access gateway receives renegotiation response 3, the access gateway sends renegotiation response 2 corresponding to renegotiation request 2 to the telephone switching system according to renegotiation response 3.
  • renegotiation response 2 is SIP signaling or private protocol signaling, and renegotiation response 2 and renegotiation response 3 may be the same signaling.
  • the telephone switching system sends the renegotiation response 1 corresponding to the renegotiation request 1 to the second terminal.
  • After the telephone switching system receives renegotiation response 2, the telephone switching system sends renegotiation response 1 corresponding to renegotiation request 1 to the second terminal according to renegotiation response 2.
  • renegotiation response 1 is SIP signaling or private protocol signaling, and renegotiation response 1 and renegotiation response 2 may be the same response.
  • the second terminal sends the renegotiation confirmation 1 corresponding to the renegotiation response 1 to the telephone switching system.
  • the second terminal may send the renegotiation confirmation 1 corresponding to the renegotiation response 1 to the telephone switching system to inform the telephone switching system that the second terminal has received the renegotiation response 1.
  • the two-way call connection 11 between the second terminal and the telephone switching system is adjusted to a one-way call connection 11 from the second terminal to the telephone switching system.
  • the second terminal can send the audio stream to the telephone switching system through the one-way call connection 11, but the telephone switching system does not send the audio stream to the second terminal through the one-way call connection 11.
  • the renegotiation confirmation 1 may be SIP signaling or private protocol signaling.
  • the telephone switching system sends the renegotiation confirmation 2 corresponding to the renegotiation response 2 to the access gateway.
  • the telephone switching system may send renegotiation confirmation 2 corresponding to renegotiation response 2 to the access gateway according to renegotiation confirmation 1, to inform the access gateway that the telephone switching system has received renegotiation response 2.
  • After the telephone switching system sends renegotiation confirmation 2 to the access gateway, the two-way call connection 12 between the telephone switching system and the access gateway is adjusted to a one-way call connection 12 from the telephone switching system to the access gateway.
  • the telephone switching system can send the audio stream to the access gateway through the one-way call connection 12, but the access gateway does not send the audio stream to the telephone switching system through the one-way call connection 12.
  • renegotiation confirmation 2 is SIP signaling or private protocol signaling, and renegotiation confirmation 2 and renegotiation confirmation 1 may be the same signaling.
  • the access gateway sends the renegotiation confirmation 3 corresponding to the renegotiation response 3 to the target gateway.
  • the access gateway may send renegotiation confirmation 3 corresponding to renegotiation response 3 to the target gateway according to renegotiation confirmation 2, to inform the target gateway that the access gateway has received renegotiation response 3.
  • the two-way call connection 13 between the access gateway and the target gateway is adjusted to a one-way call connection 13 from the access gateway to the target gateway.
  • the access gateway can use this one-way call connection 13 to send the audio stream to the target gateway, but the target gateway does not send the audio stream to the access gateway through the one-way call connection 13.
  • renegotiation confirmation 3 is SIP signaling or private protocol signaling, and renegotiation confirmation 3 and renegotiation confirmation 2 may be the same signaling.
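The effect of the renegotiation confirmations on the three legs of call connection 1 can be sketched as a small state table. The `Direction` enum and the helper names are hypothetical; the sketch only captures that, after hold, each of legs 11, 12 and 13 carries audio solely in the direction of the target gateway.

```python
from enum import Enum
from typing import Dict


class Direction(Enum):
    TWO_WAY = "two-way"
    TOWARD_TARGET_GATEWAY = "one-way toward the target gateway"


def new_call_connection_1() -> Dict[int, Direction]:
    """Legs 11, 12 and 13 start out as two-way connections."""
    return {11: Direction.TWO_WAY, 12: Direction.TWO_WAY, 13: Direction.TWO_WAY}


def confirm_hold(legs: Dict[int, Direction], leg_id: int) -> None:
    """Apply a renegotiation confirmation: the leg becomes one-way."""
    legs[leg_id] = Direction.TOWARD_TARGET_GATEWAY


def may_send(legs: Dict[int, Direction], leg_id: int, toward_target_gateway: bool) -> bool:
    """Is audio allowed on this leg in the given direction?"""
    return legs[leg_id] is Direction.TWO_WAY or toward_target_gateway
```

Under this model, renegotiation confirmations 1, 2 and 3 correspond to calling `confirm_hold` on legs 11, 12 and 13 respectively.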
  • the target gateway determines that the second terminal is in the called state according to the renegotiation request 3.
  • the target gateway may determine that the second terminal is in the called state according to the specified signaling information carried in the renegotiation request 3.
  • the designated signaling information indicates that the first terminal is prompted by voice when the call connection between the second terminal and the first terminal is in the call hold state.
  • the designated signaling information is used to indicate that the call connection between the second terminal and the first terminal is in the call hold state, thereby indicating that the second terminal is in the called state.
  • the target gateway determines that the second terminal is in the called state based on the designated signaling information.
  • the target gateway sends status information 1 to the conference management device.
  • the status information 1 indicates that the second terminal is in the called state.
  • the target gateway can send status information 1 to the conference management device through SIP signaling or private protocol signaling.
  • the status information 1 may include the identity of the second terminal and the called identity to indicate that the second terminal is in the called state.
  • the conference management device controls the first terminal to display that the second terminal is in a called state.
  • the conference management device may determine that the second terminal is in the called state according to the status information 1, and then the conference management device may send control indication information to the first terminal to indicate that the second terminal is in the called state.
  • the first terminal may display information or an identification indicating that the second terminal is in a called state on the conference interface of the first terminal according to the control instruction information.
  • the telephone switching system sends audio stream 1 to the access gateway.
  • the telephone switching system When the second terminal is in the called state, the telephone switching system generates the called prompt tone and sends the audio stream 1 to the access gateway based on the called prompt tone.
  • the called prompt sound is used to remind the second terminal that it is in a called state.
  • the access gateway forwards audio stream 1 to the target gateway.
  • the target gateway performs denoising processing on the audio stream 1 to remove the noise of the second terminal being called in the audio stream 1.
  • the target gateway intercepts audio stream 1, or the target gateway replaces the data packets of audio stream 1 with silence packets and sends the silence packets to the first terminal, or the target gateway adds a second identification to the data packets of audio stream 1 and sends the data packets of audio stream 1 to the first terminal, where the second identification is used to instruct the first terminal not to play audio stream 1.
  • the target gateway can prevent audio stream 1 from reaching the first terminal, or, even if audio stream 1 reaches the first terminal, it can prevent the first terminal from playing audio stream 1, so audio stream 1 is prevented from interfering with the first terminal.
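The three denoising options for audio stream 1 can be sketched as a single function with a selectable mode. The `Packet` layout, the `DO_NOT_PLAY` marker, and the use of zero bytes as silence are assumptions for illustration; real silence payloads depend on the codec, and the source does not define how the second identification is encoded.

```python
from dataclasses import dataclass, replace
from enum import Enum, auto
from typing import List, Optional


class Suppression(Enum):
    INTERCEPT = auto()  # drop the stream entirely
    SILENCE = auto()    # replace each packet with a silence packet
    MARK = auto()       # tag each packet so the receiver does not play it


@dataclass(frozen=True)
class Packet:
    payload: bytes
    second_identifier: Optional[str] = None


DO_NOT_PLAY = "do-not-play"  # hypothetical value of the second identification


def suppress(packets: List[Packet], mode: Suppression) -> List[Packet]:
    """Return the packets the target gateway would forward to the first terminal."""
    if mode is Suppression.INTERCEPT:
        return []
    if mode is Suppression.SILENCE:
        # Zero-filled payload of the same length stands in for silence (assumption).
        return [replace(p, payload=bytes(len(p.payload))) for p in packets]
    return [replace(p, second_identifier=DO_NOT_PLAY) for p in packets]
```

In the `MARK` mode the payload is left untouched, so suppression relies on the first terminal honouring the identifier.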
  • S701 to S709 describe the process of the second terminal entering the called state, and S713 to S715 describe the processing of the audio stream related to the second terminal after the second terminal enters the call hold state.
  • the renegotiation signaling used for the call hold negotiation of the telephone terminal (for example, the second terminal) is terminated at the access gateway. That is, after the access gateway receives the renegotiation signaling used for the call hold negotiation of the telephone terminal, the access gateway does not send renegotiation signaling to the media gateway (for example, the target gateway). The called state of the telephone terminal is therefore not passed to the media gateway, and the media gateway cannot sense it. When the telephone terminal is in the called state, the media gateway still forwards the called prompt tone of the telephone terminal, which interferes with other terminals in the multimedia conference system.
  • after the access gateway receives the renegotiation signaling used for the call hold negotiation of the telephone terminal, the access gateway sends renegotiation signaling to the media gateway to negotiate with it, so that the media gateway can sense the called state of the telephone terminal. When the telephone terminal is in the called state, the media gateway performs denoising processing on the audio streams related to the telephone terminal that are sent to other terminals, removing the called prompt tone of the telephone terminal from those audio streams. This prevents the called prompt tone from interfering with other terminals in the multimedia conference system and precisely suppresses the unnecessary interfering audio.
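  • the difference between terminating the renegotiation signaling at the access gateway and relaying it to the media gateway can be illustrated with a small sketch. The classes `AccessGateway` and `MediaGateway`, the `relay_renegotiation` switch and the boolean `hold` argument are hypothetical simplifications; real signaling would be SIP or private protocol messages.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class MediaGateway:
    called_terminals: Set[str] = field(default_factory=set)

    def on_renegotiation(self, terminal_id: str, hold: bool) -> None:
        # The media gateway learns the called state only if signaling reaches it.
        if hold:
            self.called_terminals.add(terminal_id)
        else:
            self.called_terminals.discard(terminal_id)

    def is_called(self, terminal_id: str) -> bool:
        return terminal_id in self.called_terminals


@dataclass
class AccessGateway:
    media_gateway: MediaGateway
    relay_renegotiation: bool = True  # False models signaling terminated at the access gateway

    def on_renegotiation(self, terminal_id: str, hold: bool) -> None:
        if self.relay_renegotiation:
            self.media_gateway.on_renegotiation(terminal_id, hold)
```

  • with `relay_renegotiation=False` the media gateway never records the called state and would keep forwarding the prompt tone; with the default it can start and stop denoising as the state changes.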
  • FIG. 8 shows another flow chart in which the second terminal is in a called state according to an embodiment of the present application.
  • Figure 8 mainly introduces the process of the second terminal entering the called state and the processing process of the audio stream related to the second terminal after the second terminal enters the called state.
  • Figure 8 takes as an example the case in which the target gateway determines, based on the characteristic parameters of the audio stream, that the second terminal is in the called state.
  • the process in which the second terminal is in the called state includes the following steps S801 to S817.
  • the second terminal sends a renegotiation request 1 to the telephone switching system.
  • the telephone switching system sends a renegotiation request 2 to the access gateway based on the renegotiation request 1.
  • the implementation process of S801 to S802 can refer to the implementation process of S701 to S702, which will not be described again here.
  • the access gateway sends the renegotiation response 2 corresponding to the renegotiation request 2 to the telephone switching system.
  • the telephone switching system sends the renegotiation response 1 corresponding to the renegotiation request 1 to the second terminal.
  • the second terminal sends the renegotiation confirmation 1 corresponding to the renegotiation response 1 to the telephone switching system.
  • the two-way call connection 11 between the second terminal and the telephone switching system is adjusted to a one-way call connection 11 from the second terminal to the telephone switching system.
  • the telephone switching system sends the renegotiation confirmation 2 corresponding to the renegotiation response 2 to the access gateway.
  • the two-way call connection 12 between the telephone switching system and the access gateway is adjusted to a one-way call connection 12 from the telephone switching system to the access gateway.
  • the implementation process of S803 to S806 can refer to the implementation process of S705 to S708, which will not be described again here.
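  • the adjustment of call connections 11 and 12 from two-way to one-way can be modeled as a small state change. The sketch below is illustrative only; `CallConnection`, `Direction`, `confirm_hold` and `may_send` are hypothetical names, and the endpoint strings simply mirror the entities named in the text.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    TWO_WAY = "two-way"
    ONE_WAY = "one-way"


@dataclass
class CallConnection:
    number: int   # e.g. 11 or 12, following the numbering in the text
    source: str   # endpoint that may still send after the adjustment
    sink: str     # endpoint that only receives after the adjustment
    direction: Direction = Direction.TWO_WAY

    def confirm_hold(self) -> None:
        """Apply the adjustment made once the renegotiation confirmation is sent."""
        self.direction = Direction.ONE_WAY

    def may_send(self, sender: str) -> bool:
        """Return True if the given endpoint may send media on this connection."""
        if sender not in (self.source, self.sink):
            return False
        return self.direction is Direction.TWO_WAY or sender == self.source
```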
  • the conference management device sends detection indication information to the target gateway.
  • the detection indication information is used to instruct detection of the audio stream related to the second terminal.
  • the detection indication information includes the identification of the second terminal and the detection identification to instruct the target gateway to detect the audio stream related to the second terminal.
  • the identifier of the second terminal may be a phone number of the second terminal or other identification information used to indicate the second terminal, which is not limited in this embodiment of the present application.
  • the conference management device sends detection indication information to the target gateway through SIP signaling or private protocol signaling.
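  • one way to picture the detection indication information is as a small record that the target gateway stores and later consults when an audio stream arrives. The field names and the `DetectionRegistry` class below are hypothetical; in particular, modeling the detection identification as a boolean is an assumption.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class DetectionIndication:
    terminal_id: str  # identifier of the second terminal, e.g. its phone number
    detect: bool      # hypothetical encoding of the "detection identification"


class DetectionRegistry:
    """Remembers which terminals' audio streams the gateway was told to detect."""

    def __init__(self) -> None:
        self._watched: Set[str] = set()

    def apply(self, indication: DetectionIndication) -> None:
        if indication.detect:
            self._watched.add(indication.terminal_id)
        else:
            self._watched.discard(indication.terminal_id)

    def needs_detection(self, stream_terminal_id: str) -> bool:
        return stream_terminal_id in self._watched
```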
  • the telephone switching system sends audio stream 1 to the access gateway.
  • when the second terminal is in the called state, the telephone switching system generates the called prompt tone and sends audio stream 1, which carries the called prompt tone, to the access gateway.
  • the called prompt tone is used to indicate that the second terminal is in the called state.
  • the access gateway forwards audio stream 1 to the target gateway.
  • the target gateway decodes audio stream 1 into a PCM file.
  • after the target gateway receives audio stream 1, the target gateway determines that audio stream 1 is related to the second terminal. Because the conference management device instructed the target gateway in S807 to detect the audio stream related to the second terminal, the target gateway determines that audio stream 1 needs to be detected and decodes audio stream 1 into a PCM file.
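  • the decoding step can be illustrated as follows. The text does not state which codec audio stream 1 uses; this sketch assumes G.711 mu-law payloads purely for illustration, and the function names are hypothetical. It expands each payload byte to a signed 16-bit sample and returns little-endian PCM bytes that could be written to a file.

```python
import struct
from typing import Iterable, List


def ulaw_to_linear(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit PCM sample."""
    u = ~byte & 0xFF                      # mu-law bytes are stored bit-inverted
    magnitude = (((u & 0x0F) << 3) + 0x84) << ((u >> 4) & 0x07)
    magnitude -= 0x84                     # remove the encoder bias
    return -magnitude if u & 0x80 else magnitude


def decode_stream_to_pcm(payloads: Iterable[bytes]) -> bytes:
    """Concatenate packet payloads and return little-endian 16-bit PCM bytes."""
    samples: List[int] = [ulaw_to_linear(b) for payload in payloads for b in payload]
    return struct.pack(f"<{len(samples)}h", *samples)
```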
  • the target gateway sends the PCM file of audio stream 1 to the audio recognition device.
  • the audio recognition device performs audio recognition based on the PCM file of audio stream 1, and obtains the audio characteristics of audio stream 1.
  • the audio recognition device sends the audio characteristics of audio stream 1 to the target gateway.
  • the target gateway determines that the second terminal is in the called state based on the audio characteristics of audio stream 1.
  • the target gateway compares the voice segment contained in audio stream 1 with the specified voice segment and determines, based on the comparison result, whether the second terminal is in the called state. For example, the target gateway checks whether the voice segment contained in audio stream 1 includes the specified voice segment. If it does, the target gateway determines that the second terminal is in the called state; if it does not, the target gateway determines that the second terminal is not in the called state.
  • as another example, the target gateway compares the voice segment contained in audio stream 1 with the specified voice segment to determine whether their similarity is greater than the similarity threshold. If the similarity is greater than the similarity threshold, the target gateway determines that the second terminal is in the called state; if the similarity is not greater than the similarity threshold, the target gateway determines that the second terminal is not in the called state.
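  • the threshold comparison can be sketched as below. The text does not specify how voice segments are represented or how similarity is measured; this sketch assumes both segments are sequences of PCM samples and uses cosine similarity over a sliding window, which is only one possible choice. The function names and the default threshold of 0.9 are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length sample windows (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_similarity(stream: Sequence[float], reference: Sequence[float]) -> float:
    """Slide the reference over the stream and return the highest window similarity."""
    n = len(reference)
    if n == 0 or len(stream) < n:
        return 0.0
    return max(
        cosine_similarity(stream[i:i + n], reference)
        for i in range(len(stream) - n + 1)
    )


def matches_prompt(stream: Sequence[float], reference: Sequence[float],
                   threshold: float = 0.9) -> bool:
    """Called state is inferred only when the similarity is strictly greater than the threshold."""
    return best_similarity(stream, reference) > threshold
```

  • the containment check described first corresponds to the special case in which some window of the stream equals the specified segment exactly.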
  • the target gateway sends status information 1 to the conference management device.
  • the status information 1 indicates that the second terminal is in the called state.
  • the conference management device controls the first terminal to display that the second terminal is in the called state.
  • the implementation process of S815 to S816 can refer to the implementation process of S711 to S712, which will not be described again here.
  • the target gateway performs denoising processing on the audio stream 1 to remove the noise of the second terminal being called in the audio stream 1.
  • the implementation process of S817 can refer to the implementation process of S715, and will not be repeated here.
  • S801 to S806 describe the process of the second terminal entering the called state, and S808 to S817 describe the processing of the audio stream related to the second terminal after the second terminal enters the called state.
  • the media gateway (for example, the target gateway) cannot sense the called state of the telephone terminal. Therefore, when the telephone terminal is in the called state, the media gateway still forwards the called prompt tone of the telephone terminal, which interferes with other terminals in the multimedia conference system.
  • the media gateway can detect the audio stream related to the telephone terminal to determine that the telephone terminal is in the called state. When the telephone terminal is in the called state, the media gateway performs denoising processing on the audio streams related to the telephone terminal that are sent to other terminals, removing the called prompt tone of the telephone terminal from those audio streams. This prevents the called prompt tone from interfering with other terminals in the multimedia conference system and precisely suppresses the unnecessary interfering audio.
  • FIG. 9 shows a flow chart for a second terminal to cancel the called state provided by an embodiment of the present application.
  • Figure 9 mainly introduces the process of the second terminal canceling the called state and the processing process of the audio stream related to the second terminal after the second terminal cancels the called state.
  • the process for the second terminal to cancel the called state includes the following steps S901 to S917.
  • the second terminal sends a renegotiation request 4 to the telephone switching system.
  • a call connection 1 is established between the second terminal and the first terminal in the multimedia conference system.
  • while call connection 1 is active, if the second terminal establishes call connection 2 with another terminal (for example, a third terminal) because of another telephone service, the second terminal is in the called state and call connection 1 between the second terminal and the first terminal is placed on hold. If the second terminal ends the other telephone service, so that call connection 2 between the second terminal and the third terminal is disconnected or placed on hold, the second terminal can cancel the called state, and call connection 1 between the second terminal and the first terminal can then be reactivated (that is, the call hold state of call connection 1 is canceled).
  • the second terminal when the second terminal cancels the called state, the second terminal sends a renegotiation request 4 to the telephone switching system to negotiate with the telephone switching system to cancel the call hold state of the call connection 1 between the second terminal and the first terminal.
  • the renegotiation request 4 may include the identity of the second terminal, and the renegotiation request 4 does not include designated signaling information.
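  • the text does not say what the designated signaling information is. As one illustration, SIP call hold is commonly signaled with the SDP direction attributes `a=sendonly` or `a=inactive` in the renegotiation offer; the sketch below assumes that convention, and the function name and attribute set are hypothetical choices for this example rather than the encoding used by the embodiment.

```python
# Assumption: hold-style SDP direction attributes play the role of the
# "designated signaling information"; the embodiment may use something else.
HOLD_ATTRIBUTES = frozenset({"a=sendonly", "a=inactive"})


def has_designated_info(sdp_body: str) -> bool:
    """Return True if any SDP line is one of the hold-style direction attributes."""
    return any(line.strip() in HOLD_ATTRIBUTES for line in sdp_body.splitlines())
```

  • under this assumption, a request for which `has_designated_info` returns False corresponds to canceling the called state, as with renegotiation request 4.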
  • the telephone switching system sends a renegotiation request 5 to the access gateway based on the renegotiation request 4.
  • after the telephone switching system receives renegotiation request 4, it determines, according to renegotiation request 4, to cancel the call hold state of call connection 1 between the second terminal and the first terminal, and sends renegotiation request 5 to the access gateway to negotiate with the access gateway to cancel that call hold state.
  • renegotiation request 5 is SIP signaling or private protocol signaling, and renegotiation request 5 and renegotiation request 4 may be the same signaling.
  • the access gateway sends a renegotiation request 6 to the target gateway based on the renegotiation request 5.
  • after the access gateway receives renegotiation request 5, it determines, according to renegotiation request 5, to cancel the call hold state of call connection 1 between the second terminal and the first terminal, and sends renegotiation request 6 to the target gateway to negotiate with the target gateway to cancel that call hold state.
  • renegotiation request 6 is SIP signaling or private protocol signaling, and renegotiation request 6 and renegotiation request 5 may be the same signaling.
  • the target gateway sends the renegotiation response 6 corresponding to the renegotiation request 6 to the access gateway.
  • after the target gateway receives renegotiation request 6, it determines, according to renegotiation request 6, to cancel the call hold state of call connection 1 between the second terminal and the first terminal, and thereby determines that the second terminal cancels the called state. The target gateway then sends renegotiation response 6 corresponding to renegotiation request 6 to the access gateway.
  • the access gateway sends the renegotiation response 5 corresponding to the renegotiation request 5 to the telephone switching system.
  • after the access gateway receives renegotiation response 6, the access gateway sends renegotiation response 5 corresponding to renegotiation request 5 to the telephone switching system according to renegotiation response 6.
  • renegotiation response 5 is SIP signaling or private protocol signaling, and renegotiation response 6 and renegotiation response 5 may be the same signaling.
  • the telephone switching system sends the renegotiation response 4 corresponding to the renegotiation request 4 to the second terminal.
  • after the telephone switching system receives renegotiation response 5, the telephone switching system sends renegotiation response 4 corresponding to renegotiation request 4 to the second terminal according to renegotiation response 5.
  • renegotiation response 4 is SIP signaling or private protocol signaling, and renegotiation response 5 and renegotiation response 4 may be the same signaling.
  • the second terminal sends the renegotiation confirmation 4 corresponding to the renegotiation response 4 to the telephone switching system.
  • the second terminal may send the renegotiation confirmation 4 corresponding to the renegotiation response 4 to the telephone switching system to inform the telephone switching system that the second terminal has received the renegotiation response 4.
  • the renegotiation confirmation 4 may be SIP signaling or signaling of a private protocol.
  • the telephone switching system sends the renegotiation confirmation 5 corresponding to the renegotiation response 5 to the access gateway.
  • the telephone switching system may send renegotiation confirmation 5 corresponding to renegotiation response 5 to the access gateway according to renegotiation confirmation 4, to inform the access gateway that the telephone switching system has received renegotiation response 5.
  • after the telephone switching system sends renegotiation confirmation 5 to the access gateway, the one-way call connection 12 between the telephone switching system and the access gateway is adjusted to a two-way call connection 12.
  • renegotiation confirmation 5 is SIP signaling or private protocol signaling, and renegotiation confirmation 5 and renegotiation confirmation 4 may be the same signaling.
  • the access gateway sends the renegotiation confirmation 6 corresponding to the renegotiation response 6 to the target gateway.
  • the access gateway can send renegotiation confirmation 6 corresponding to renegotiation response 6 to the target gateway according to renegotiation confirmation 5, to inform the target gateway that the access gateway has received renegotiation response 6.
  • the one-way call connection 13 between the access gateway and the target gateway is adjusted to a two-way call connection 13.
  • renegotiation confirmation 6 is SIP signaling or private protocol signaling, and renegotiation confirmation 6 and renegotiation confirmation 5 may be the same signaling.
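  • the ordering of the nine messages in S901 to S909 follows a regular pattern: requests travel hop by hop toward the target gateway, responses travel back, and confirmations follow the request path again. The sketch below only reproduces that ordering; `HOPS` and `renegotiation_sequence` are hypothetical names, and no real signaling is exchanged.

```python
from typing import List, Tuple

# (sender of the request, receiver of the request, message number on that hop)
HOPS = [
    ("second terminal", "telephone switching system", 4),
    ("telephone switching system", "access gateway", 5),
    ("access gateway", "target gateway", 6),
]


def renegotiation_sequence() -> List[Tuple[str, str, str]]:
    """Return (sender, receiver, message) tuples in the order of S901 to S909."""
    steps: List[Tuple[str, str, str]] = []
    for sender, receiver, n in HOPS:            # requests travel toward the target gateway
        steps.append((sender, receiver, f"renegotiation request {n}"))
    for sender, receiver, n in reversed(HOPS):  # responses travel back hop by hop
        steps.append((receiver, sender, f"renegotiation response {n}"))
    for sender, receiver, n in HOPS:            # confirmations follow the request path
        steps.append((sender, receiver, f"renegotiation confirmation {n}"))
    return steps
```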
  • the target gateway determines that the second terminal cancels the called state according to the renegotiation request 6.
  • the target gateway sends status information 2 to the conference management device.
  • the status information 2 indicates that the second terminal has canceled the called state.
  • the target gateway can send status information 2 to the conference management device through SIP signaling or private protocol signaling.
  • the status information 2 includes the identifier of the second terminal and does not include the called identifier, so as to indicate that the second terminal has canceled the called state.
  • the conference management device controls the first terminal to display that the second terminal is not in a called state.
  • the conference management device may determine that the second terminal cancels the called state according to status information 2, and then the conference management device sends control indication information to the first terminal to indicate that the second terminal is not in the called state.
  • the first terminal may display, in its conference interface, an identifier or information indicating that the second terminal is not in the called state, according to the control indication information.
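  • how the conference management device might interpret status information can be sketched as follows, assuming the status information is a simple key-value record in which the presence of a `called` key stands for the called identifier. The keys, the `ConferenceRoster` class and the label strings are hypothetical.

```python
from typing import Dict


def is_called(status: Dict[str, str]) -> bool:
    """A status message marks the called state only if it carries the called identifier."""
    return "called" in status


class ConferenceRoster:
    """Tracks what the first terminal's conference interface should show."""

    def __init__(self) -> None:
        self._called: Dict[str, bool] = {}

    def on_status(self, status: Dict[str, str]) -> str:
        terminal_id = status["terminal_id"]
        self._called[terminal_id] = is_called(status)
        return self.label(terminal_id)

    def label(self, terminal_id: str) -> str:
        state = "being called" if self._called.get(terminal_id, False) else "not being called"
        return f"{terminal_id}: {state}"
```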
  • the second terminal sends audio stream 2 to the telephone switching system.
  • the second terminal can collect an audio stream (referred to, for example, as audio stream 2) and send audio stream 2 to the telephone switching system.
  • the telephone switching system sends audio stream 2 to the access gateway.
  • the access gateway forwards audio stream 2 to the target gateway.
  • the target gateway forwards audio stream 2 to the first terminal.
  • the audio stream 2 does not include the noise of the second terminal being called.
  • the audio stream 2 will not interfere with the first terminal, so the target gateway forwards the audio stream 2 to the first terminal.
  • the first terminal plays audio stream 2.
  • audio stream 1 and the aforementioned audio stream A may be the same audio stream, and audio stream 2 and the aforementioned audio stream B may be the same audio stream.
  • alternatively, audio stream 1 and audio stream A may not be the same audio stream, and audio stream 2 and the aforementioned audio stream B may not be the same audio stream; this is not limited in the embodiment of the present application.
  • FIG. 10 shows a schematic structural diagram of a multimedia conference control device 1000 provided by an embodiment of the present application.
  • the control device 1000 may be a target gateway or a functional component in the target gateway, and the target gateway may be a media gateway.
  • the control device 1000 includes: a receiving module 1010 and a processing module 1020.
  • the receiving module 1010 is used to receive the audio stream sent to the first terminal through the telephone switching system; the processing module 1020 is used to perform denoising processing on the audio stream, where the denoising processing removes the noise of the second terminal being called from the audio stream.
  • the functional implementation of the receiving module 1010 may refer to the above-mentioned implementation process of S401, and the functional implementation of the processing module 1020 may refer to the above-mentioned implementation process of S402.
  • the processing module 1020 is also configured to determine that the second terminal is in a called state according to the characteristic parameters of the audio stream.
  • the function implementation of the processing module 1020 may also refer to the implementation process of S403a mentioned above.
  • the characteristic parameters include at least one of the following: audio characteristics of the audio stream, and data characteristics of the data packets of the audio stream.
  • the characteristic parameters include audio characteristics of the audio stream, and the audio characteristics include voice segments contained in the audio stream.
  • the processing module 1020 is configured to compare the voice segment contained in the audio stream with a specified voice segment and to determine, based on the comparison, that the second terminal is in the called state, where the specified voice segment is used to describe that the second terminal is in the called state.
  • the control device 1000 also includes a sending module 1030, used to send the audio stream to the audio recognition device, where the audio recognition device is used to perform audio recognition on the audio stream to obtain the audio characteristics of the audio stream; the receiving module 1010 is also used to receive the audio characteristics of the audio stream sent by the audio recognition device.
  • for the functional implementation of the sending module 1030 and the receiving module 1010, refer to the relevant description in S403a above.
  • the characteristic parameters include data characteristics of the data packet of the audio stream.
  • the processing module 1020 is configured to determine that the second terminal is in the called state according to the first identifier contained in the data packet of the audio stream, where the first identifier is used to indicate that the second terminal is in the called state.
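  • detection based on the data characteristics of the packets can be pictured as a check for a marker in each packet. The text does not define how the first identifier is carried; the sketch below assumes a single flag bit, and `CALLED_FLAG`, `StreamPacket` and `terminal_is_called` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable

CALLED_FLAG = 0x01  # hypothetical bit used as the "first identifier"


@dataclass(frozen=True)
class StreamPacket:
    flags: int
    payload: bytes


def terminal_is_called(packets: Iterable[StreamPacket]) -> bool:
    """Return True if any packet of the audio stream carries the first identifier."""
    return any(p.flags & CALLED_FLAG for p in packets)
```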
  • the receiving module 1010 is also configured to receive the target signaling message; the processing module 1020 is also configured to determine that the second terminal is in the called state according to the target signaling message containing designated signaling information, where the designated signaling information is used to indicate that the second terminal is in the called state.
  • the functional implementation of the receiving module 1010 may also refer to the relevant description in the above S403b, and the functional implementation of the processing module 1020 may also refer to the relevant description in the above S404b.
  • the denoising process includes: intercepting the audio stream. That is, the target gateway does not forward the audio stream.
  • the target gateway intercepts the audio stream to denoise it, which prevents the audio stream from reaching the first terminal and therefore from being played there, so that the noise of the second terminal being called in the audio stream does not affect the first terminal.
  • the denoising process includes: replacing data packets of the audio stream with silence packets, and sending the silence packets to the first terminal.
  • the target gateway replaces the data packets of the audio stream with silence packets and sends the silence packets to the first terminal. This denoises the audio stream and keeps the original audio from reaching the first terminal, so the first terminal does not play it and the noise of the second terminal being called in the audio stream does not affect the first terminal.
  • the denoising process includes: adding a second identifier to the data packet of the audio stream and sending the data packet of the audio stream to the first terminal, where the second identifier is used to instruct the first terminal not to play the audio stream.
  • the target gateway adds the second identifier to the data packets of the audio stream and sends the data packets carrying the second identifier to the first terminal. After the first terminal receives the audio stream, the first terminal does not play it, so the noise of the second terminal being called in the audio stream does not affect the first terminal, which achieves the denoising processing of the audio stream.
  • the processing module performs denoising processing on the audio stream to remove the noise of the second terminal being called from the audio stream. This prevents that noise from interfering with the first terminal and from disrupting the multimedia conference.
  • the multimedia conference control device provided by the embodiment of the present application can also be implemented using an application-specific integrated circuit (ASIC) or a programmable logic device (PLD).
  • the above-mentioned PLD can be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
  • the multimedia conference control method provided in the above method embodiment can also be implemented through software.
  • each module in the multimedia conference control device can also be a software module.
  • FIG 11 shows a schematic structural diagram of another multimedia conference control device 1100 provided by an embodiment of the present application.
  • the control device 1100 may be a target gateway or a functional component in the target gateway.
  • the target gateway may be a media gateway.
  • the control device 1100 includes a processor 1102, a memory 1104, a communication interface 1106 and a bus 1108.
  • the processor 1102, the memory 1104 and the communication interface 1106 are communicatively connected to each other through the bus 1108.
  • the connection between the processor 1102, the memory 1104 and the communication interface 1106 shown in Figure 11 is only exemplary; the processor 1102, the memory 1104 and the communication interface 1106 can also be communicatively connected to each other in ways other than the bus 1108.
  • the memory 1104 can be used to store a computer program 11042, and the computer program 11042 can include instructions and data.
  • the memory 1104 can be any of various types of storage media, such as random access memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), flash memory, optical memory, and registers.
  • the memory 1104 may include a hard disk and/or main memory.
  • the processor 1102 may be a general-purpose processor, that is, a processor that performs specific steps and/or operations by reading and executing a computer program (for example, computer program 11042) stored in a memory (for example, memory 1104). A general-purpose processor may use data stored in the memory (for example, memory 1104) while performing those steps and/or operations.
  • the stored computer program can, for example, be executed to implement the related functions of the aforementioned processing module 1020 .
  • a general-purpose processor may be, for example but not limited to, a central processing unit (CPU).
  • the processor 1102 may also be a special-purpose processor.
  • the special-purpose processor may be a processor specially designed to perform specific steps and/or operations.
  • the special-purpose processor may be, for example but not limited to, a digital signal processor (DSP), an ASIC, or an FPGA.
  • the processor 1102 may also be a combination of multiple processors, such as a multi-core processor.
  • the processor 1102 may include at least one circuit to execute all or part of the steps of the multimedia conference control method provided in the above embodiments.
  • the communication interface 1106 may include an input/output (I/O) interface, a physical interface, a logical interface, and other interfaces for interconnecting components within the control device 1100, as well as interfaces for interconnecting the control device 1100 with other devices (such as terminal devices, servers, and gateways).
  • the physical interface may be a gigabit Ethernet (GE) interface, which may be used to interconnect the control device 1100 with other devices.
  • the logical interface may be an internal interface of the control device 1100, which may be used to interconnect components within the control device 1100. It is easy to understand that the communication interface 1106 can be used for communication between the control device 1100 and other devices. For example, the communication interface 1106 is used to send and receive signaling and audio streams between the control device 1100 and other devices, and can implement the related functions of the aforementioned receiving module 1010 and sending module 1030.
  • the bus 1108 may be any type of communication bus used to interconnect the processor 1102, the memory 1104, and the communication interface 1106, such as a system bus.
  • the above-mentioned devices may be arranged on separate chips, or at least part or all of them may be arranged on the same chip. Whether each device is independently installed on different chips or integrated on one or more chips often depends on the needs of product design.
  • the embodiments of this application do not limit the specific implementation forms of the above devices.
  • the control device 1100 shown in FIG. 11 is only exemplary. During the implementation process, the control device 1100 may also include other components, which are not listed here.
  • the control device 1100 shown in FIG. 11 can control a multimedia conference by executing all or part of the steps of the multimedia conference control method provided in the above embodiments.
  • the embodiment of the present application provides a communication system.
  • the communication system includes a target gateway, a first terminal and a second terminal.
  • the first terminal is communicatively connected to the target gateway.
  • the second terminal is communicatively connected to the target gateway through a telephone switching system.
  • the target gateway may include a multimedia conference control device as shown in Figure 10 or Figure 11.
  • the communication system is as shown in any one of Figures 1 to 3.
  • the target gateway may be a media gateway.
  • Embodiments of the present application provide a computer-readable storage medium.
  • a computer program is stored in the computer-readable storage medium. When the computer program is executed (for example, by a target gateway or by one or more processors), all or part of the steps of the method provided by the above method embodiments are implemented.
  • Embodiments of the present application provide a computer program product.
  • the computer program product includes a program or code.
  • when the program or code is executed (for example, by a target gateway or by one or more processors), all or part of the steps of the method provided in the above method embodiments are implemented.
  • Embodiments of the present application provide a chip that includes programmable logic circuits and/or program instructions. When the chip is run, it is used to implement all or part of the steps of the method provided in the above method embodiments.
  • the above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented in software, they may be implemented in whole or in part in the form of a computer program product including one or more computer instructions.
  • when the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are generated in whole or in part.
  • the computer may be a general purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (such as coaxial cable, optical fiber, or digital subscriber line) or wireless (such as infrared, radio, or microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server, data center, or the like that includes one or more available media integrated therein.
  • the available media may be magnetic media (e.g., floppy disks, hard disks, tapes), optical media, or semiconductor media (e.g., solid state drives), etc.
  • the term “at least one” in this application refers to one or more.
  • the term “plurality” refers to two or more.
  • the term “at least two” refers to two or more.
  • the symbol “/” means “or”; for example, A/B means A or B.
  • the term “and/or” in this application only describes an association between related objects and covers three cases; for example, A and/or B can mean: A exists alone, A and B exist simultaneously, or B exists alone.
  • words such as “first”, “second” and “third” are used to distinguish the same or similar items with basically the same functions and effects. Those skilled in the art can understand that words such as “first”, “second” and “third” do not limit the number and execution order.
  • the disclosed devices can be implemented in other configurations.
  • the device embodiments described above are only illustrative.
  • the division of modules is only a logical function division. In actual implementation, there may be other division methods.
  • multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • Modules described as separate components may or may not be physically separate.
  • Components described as modules may or may not be physical modules, and may be located in one place or distributed across multiple devices (such as terminal devices and gateways). Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Telephonic Communication Services (AREA)

Abstract

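This application provides a control method and apparatus for a multimedia conference, and a communication system, and belongs to the field of communications technologies. The method includes: a target gateway receives, through a telephone switching system, an audio stream destined for a first terminal, and the target gateway performs denoising on the audio stream, where the denoising removes from the audio stream the noise of a second terminal being called. Because the target gateway removes this noise from the audio stream destined for the first terminal, the noise of the second terminal being called does not disturb the first terminal and therefore does not affect the multimedia conference.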

Description

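Control method and apparatus for a multimedia conference, and communication system
This application claims priority to Chinese patent application No. 202210395237.9, filed on April 14, 2022 and entitled “Method, system and device for suppressing voice interference in a conference”, and to Chinese patent application No. 202210666906.1, filed on June 13, 2022 and entitled “Control method and apparatus for a multimedia conference, and communication system”, both of which are incorporated herein by reference in their entirety.
Technical Field
This application relates to the field of communications technologies, and in particular to a control method and apparatus for a multimedia conference, and a communication system.
Background
A multimedia conference (for example, an audio conference or a video conference) is a virtual conference implemented by means of communications technologies. It brings geographically dispersed individuals or groups together to exchange information through graphics, sound, and other means. During a multimedia conference, mobile phones, fixed-line terminals, and the like (hereinafter referred to as telephone terminals) often access the multimedia conference system through a telephone switching network, for example, a public switched telephone network (PSTN) or a private network built on a private branch exchange (PBX).
After a telephone terminal accesses the multimedia conference system, a call connection (referred to as call connection 1) is established between the telephone terminal and another terminal in the multimedia conference system (for example, a conference terminal). Call connection 1 includes the call connection between the telephone terminal and the telephone switching network and the call connection between the telephone switching network and the other terminal, and media streams (for example, audio streams) transmitted between the telephone terminal and the other terminal are forwarded by the telephone switching network. If the telephone terminal then carries out another telephone service (for example, answers a new incoming call or places a new call), the telephone switching network places call connection 1 on hold (call hold), that is, puts call connection 1 in the call hold state, and sends a called prompt tone (also called a call hold prompt tone) to the other terminal in the multimedia conference system, for example, “您所拨打的用户正在通话中” (“the user you have dialed is on another call”). The other terminal plays the called prompt tone to notify its user.
Summary
This application provides a control method and apparatus for a multimedia conference, and a communication system, which help prevent the noise of a telephone terminal being called (for example, a called prompt tone) from affecting the multimedia conference. The technical solutions of this application are as follows: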
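According to a first aspect, a control method for a multimedia conference is provided. The method includes: a target gateway receives, through a telephone switching system, an audio stream destined for a first terminal; and the target gateway performs denoising on the audio stream, where the denoising removes from the audio stream the noise of a second terminal being called. The noise of the second terminal being called may be the called prompt tone generated when the second terminal is called by a terminal other than the first terminal (for example, a third terminal); for the first terminal, this called prompt tone is noise.
In this solution, after receiving the audio stream destined for the first terminal through the telephone switching system, the target gateway removes from it the noise of the second terminal being called. This prevents that noise from disturbing the first terminal and thus from affecting the multimedia conference.
Optionally, before the target gateway performs denoising on the audio stream, the method further includes: the target gateway determines, based on a characteristic parameter of the audio stream, that the second terminal is in the called state. For example, the target gateway determines, based on the characteristic parameter of the audio stream, that the second terminal is being called by a terminal other than the first terminal (for example, a third terminal).
Because the target gateway can determine (that is, sense) from the characteristic parameter of the audio stream that the second terminal is in the called state, it can perform denoising on the audio stream to remove the noise of the second terminal being called.
Optionally, the characteristic parameter includes at least one of the following: an audio feature of the audio stream, or a data feature of the data packets of the audio stream.
Optionally, the characteristic parameter includes the audio feature of the audio stream, and the audio feature includes a speech segment contained in the audio stream. In this case, determining that the second terminal is in the called state based on the characteristic parameter includes: the target gateway compares the speech segment contained in the audio stream with a specified speech segment and determines that the second terminal is in the called state, where the specified speech segment describes the second terminal being in the called state. For example, the target gateway determines from the comparison that the speech segment contained in the audio stream includes the specified speech segment, and therefore determines that the second terminal is in the called state. As another example, the target gateway determines from the comparison that the similarity between the speech segment contained in the audio stream and the specified speech segment is greater than a similarity threshold, and therefore determines that the second terminal is in the called state.
Optionally, the method further includes: the target gateway sends the audio stream to an audio recognition device, which performs audio recognition on the audio stream to obtain its audio feature; and the target gateway receives the audio feature of the audio stream from the audio recognition device.
Sending the audio stream to the audio recognition device allows that device to perform audio recognition and obtain the audio feature, and the audio recognition device returning the audio feature allows the target gateway to obtain it.
Optionally, the characteristic parameter includes the data feature of the data packets of the audio stream. In this case, determining that the second terminal is in the called state based on the characteristic parameter includes: the target gateway determines that the second terminal is in the called state based on the data packets of the audio stream containing a first identifier, where the first identifier indicates that the second terminal is in the called state.
Optionally, before the target gateway performs denoising on the audio stream, the method further includes: the target gateway receives a target signaling message; and the target gateway determines that the second terminal is in the called state based on the target signaling message containing specified signaling information, where the specified signaling information indicates that the second terminal is in the called state.
Because the target gateway can sense from the specified signaling information in the target signaling message that the second terminal is in the called state, it determines that the audio stream destined for the first terminal may contain the noise of the second terminal being called, and performs denoising on that audio stream to remove the noise.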
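Optionally, the denoising includes intercepting the audio stream. That is, the target gateway does not forward the audio stream to the first terminal.
Intercepting the audio stream prevents it from reaching the first terminal, so the first terminal does not play it and the noise of the second terminal being called in the audio stream does not disturb the first terminal.
Optionally, the denoising includes replacing the data packets of the audio stream with silence packets and sending the silence packets to the first terminal.
A silence packet satisfies either of the following: it contains no audio data, or it contains audio data that cannot produce a physically perceptible sound. For example, a silence packet is a data packet encapsulated according to an audio protocol and format whose payload is empty. An empty payload may mean that the silence packet has no payload, or that it has a payload whose data is all zeros; when played, the silence packet produces no sound and cannot be physically perceived.
Because the target gateway replaces the data packets of the audio stream with silence packets and sends the silence packets to the first terminal, it does not send the audio stream itself to the first terminal. The audio stream therefore does not reach the first terminal and is not played, and the noise of the second terminal being called in the audio stream does not disturb the first terminal.
Optionally, the denoising includes adding a second identifier to the data packets of the audio stream and sending those data packets to the first terminal, where the second identifier instructs the first terminal not to play the audio stream.
Because the data packets of the audio stream sent to the first terminal carry the second identifier, the first terminal does not play the audio stream after receiving it, and the noise of the second terminal being called in the audio stream does not disturb the first terminal.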
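According to a second aspect, a control apparatus for a multimedia conference is provided, including modules configured to perform the method provided in the first aspect or any optional manner of the first aspect. The modules may be implemented in software, hardware, or a combination of software and hardware, and may be combined or divided in any way depending on the specific implementation.
According to a third aspect, a control apparatus for a multimedia conference is provided, including a memory and a processor. The memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory so that the control apparatus performs the method provided in the first aspect or any optional manner of the first aspect.
According to a fourth aspect, a communication system is provided. The communication system includes a target gateway, a first terminal, and a second terminal. The first terminal is communicatively connected to the target gateway, the second terminal is communicatively connected to the target gateway through a telephone switching system, and the target gateway includes the control apparatus for a multimedia conference provided in the second aspect or the third aspect.
According to a fifth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores a computer program that, when executed, implements the method provided in the first aspect or any optional manner of the first aspect.
According to a sixth aspect, a computer program product is provided. The computer program product includes a program or code that, when executed, implements the method provided in the first aspect or any optional manner of the first aspect.
According to a seventh aspect, a chip is provided. The chip includes programmable logic circuits and/or program instructions and, when running, implements the method provided in the first aspect or any optional manner of the first aspect.
The beneficial effects of the technical solutions provided in this application are as follows:
In the control method and apparatus for a multimedia conference and the communication system provided in this application, the communication system includes a target gateway, a first terminal, and a second terminal. The first terminal is communicatively connected to the target gateway, and the second terminal is communicatively connected to the target gateway through a telephone switching system. After receiving, through the telephone switching system, an audio stream destined for the first terminal, the target gateway performs denoising on the audio stream to remove the noise of the second terminal being called, which prevents that noise from disturbing the first terminal and from affecting the multimedia conference. For example, a call connection (referred to as call connection 1) is established between the second terminal and the first terminal. If, on top of call connection 1, the second terminal establishes (or needs to establish) call connection 2 with a third terminal for any reason, where the third terminal may be any terminal other than the first terminal (for example, the second terminal calls the third terminal, answers a call from the third terminal, or talks with the third terminal), the telephone switching system places call connection 1 on hold and sends a called prompt tone to the first terminal. This called prompt tone tends to disturb the first terminal, and it arises only because the second terminal has established the new call connection 2; for the first terminal it is therefore noise, and it may be referred to as the noise of the second terminal being called. Because the target gateway removes this noise from the audio stream that the telephone switching system sends to the first terminal, the noise does not disturb the first terminal.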
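Brief Description of the Drawings
FIG. 1 is a schematic diagram of an application scenario according to an embodiment of this application;
FIG. 2 is a schematic diagram of another application scenario according to an embodiment of this application;
FIG. 3 is a schematic diagram of yet another application scenario according to an embodiment of this application;
FIG. 4 is a flowchart of a control method for a multimedia conference according to an embodiment of this application;
FIG. 5 is a flowchart of another control method for a multimedia conference according to an embodiment of this application;
FIG. 6 is a flowchart of a second terminal accessing a multimedia conference system according to an embodiment of this application;
FIG. 7 is a flowchart of a second terminal being in the called state according to an embodiment of this application;
FIG. 8 is another flowchart of a second terminal being in the called state according to an embodiment of this application;
FIG. 9 is a flowchart of a second terminal leaving the called state according to an embodiment of this application;
FIG. 10 is a schematic structural diagram of a control apparatus for a multimedia conference according to an embodiment of this application;
FIG. 11 is a schematic structural diagram of another control apparatus for a multimedia conference according to an embodiment of this application.
Detailed Description
The embodiments of this application are described in further detail below with reference to the accompanying drawings.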
在开展多媒体会议时,经常会有一些手机、固话终端等电话终端通过电话交换网络接入多媒体会议系统。例如,电话交换网络为PSTN或基于PBX构建的私有网络。PSTN是为公共用户提供电话业务的电信网络,PSTN中包括接入系统、电话交换机以及中继等,PSTN也称为普通老式电话业务(plain old telephone service,POTS)。PBX是基于计算机的数字电话交换机,PBX可以接入公共电话交换网,通常供企业使用。在一些实施场景中,电话交换网络也称为电话交换系统、电话接入系统等。示例的,电话终端依次通过互联网协议多媒体子系统(internet protocol multimedia subsystem,IMS)、接入网关与多媒体会议系统中的媒体网关通信连接,媒体网关可以对来自该电话终端或该IMS的音频流进行编解码,并向该多媒体会议系统中的其他终端(例如会议终端)发送该音频流,使得该多媒体会议系统中的其他终端能够播放来自该电话终端或来自该IMS的音频流。
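After a telephone terminal accesses the multimedia conference system, a call connection (referred to as call connection 1) is established between the telephone terminal and another terminal in the multimedia conference system (for example, a conference terminal). Call connection 1 includes the call connection between the telephone terminal and the telephone switching network and the call connection between the telephone switching network and the other terminal, and media streams (for example, audio streams) transmitted between the telephone terminal and the other terminal are forwarded by the telephone switching network. With the spread of fourth generation (4G) and fifth generation (5G) mobile communication networks, more and more mobile phones have VoLTE IP calling enabled and support multi-party calls. If a telephone terminal that has accessed the multimedia conference system carries out another telephone service, for example, answers a call from a terminal outside the multimedia conference system or places a call to such a terminal, the telephone switching network places call connection 1 between the telephone terminal and the other terminal on hold (that is, puts call connection 1 in the call hold state) and sends a called prompt tone to the other terminal, for example, “您所拨打的用户正在通话中” (“the user you have dialed is on another call”). The other terminal plays the called prompt tone to notify its user. However, the called prompt tone tends to disturb the other terminal and affect the multimedia conference. Call hold is a class of services that allows an established call connection (for example, call connection 1) to be held: the transmission of media streams (for example, audio streams) between the calling terminal and the called terminal stops, but the session resources are not released and the call connection is not torn down. When the call hold ends, or as otherwise required, the call connection can be resumed (also referred to as reactivated).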
示例的,电话终端接入多媒体会议系统之后,该电话终端的用户可以操作该电话终端的静音键或者基于该多媒体会议系统提供的其他静音方式,控制该电话终端本地静音。该电话终端本地静音之后,该电话终端不向该多媒体会议系统中发送媒体流,但是该电话终端与该多媒体会议系统中的其他终端之间的通话连接(例如通话连接1)仍然存在。本地静音的电话终端在接收到新的电话呼叫时,该电话终端的用户第一反应觉得该电话终端在多媒体会议系统中静音了,可能就接听了新的电话呼叫。该电话终端接听新的电话呼叫后,该电话终端与该新的电话呼叫的呼叫方建立通话连接(例如通话连接2)。按照现有机制,电话交换网络默认该电话终端存在两路通话连接,既然该电话终端接听了新的电话呼叫(即建立了通话连接2),电话交换网络会对通话连接1进行呼叫保持,并循环向该多媒体会议系统中的其他终端发送被呼叫提示音(或称为呼叫保持提示音),该多媒体会议系统中的其他终端会循环播放该被呼叫提示音,严重干扰多媒体会议的开展。
本申请实施例提供一种多媒体会议的控制方法、装置及通信系统。该通信系统包括目标网关、第一终端和第二终端,第一终端与目标网关通信连接,第二终端通过电话交换系统与目标网关通信连接,例如目标网关是多媒体会议系统中的媒体网关,第一终端是该多媒体会议系统中的会议终端,第二终端是接入该多媒体会议系统的电话终端。目标网关通过电话交换系统接收发往第一终端的音频流之后,目标网关对该音频流进行去噪处理,以去除该音频流中第二终端被呼叫的噪音,例如第二终端被呼叫的噪音为第二终端的被呼叫提示音。由于目标网关去除了发往第一终端的音频流中第二终端被呼叫的噪音,因此可以避免第二终端被呼叫的噪音干扰第一终端,从而避免第二终端被呼叫的噪音影响多媒体会议的开展。
下面结合附图对本申请的技术方案进行介绍。首先介绍本申请的应用场景。
请参考图1,其示出了本申请实施例提供的一种应用场景的示意图。该应用场景提供一种通信系统,该通信系统包括目标网关、第一终端和第二终端。第一终端与目标网关通信连接,第二终端通过电话交换系统与目标网关通信连接。例如,该通信系统中包括多媒体会议系统和电话交换系统,目标网关和第一终端都可以位于该多媒体会议系统中。
目标网关可以通过电话交换系统接收发往第一终端的音频流,并对该音频流进行去噪处理,以去除该音频流中第二终端被呼叫的噪音。可选的,目标网关首先确定第二终端处于被呼叫状态,然后目标网关去除发往第一终端的音频流中第二终端被呼叫的噪音。本申请所述的“第二终端处于被呼叫状态”指的是第二终端与第一终端之外的终端建立通话连接,并且该通话连接被激活的状态,例如,“第二终端处于被呼叫状态”可以指第二终端处于被第一终端之外的终端(例如第三终端,第三终端可以是除第一终端之外的任一终端)呼叫的状态,或者第二终端处于呼叫第一终端之外的终端(例如第三终端,第三终端可以是除第一终端之外的任一终端)的状态。示例的,第二终端接入多媒体会议系统之后,第二终端与该多媒体会议系统中的第一终端之间建立一路通话连接(例如称为通话连接1),在通话连接1处于 激活状态时,若第二终端基于各种可能的原因与第三终端建立通话连接2,第二终端处于被呼叫状态(第二终端处于被第三终端呼叫的状态),则电话交换系统会对通话连接1进行呼叫保持(也即,控制通话连接1处于呼叫保持状态)。因此,在本申请中,目标网关可以通过判断第二终端与第一终端之间的通话连接1是否处于呼叫保持状态,来判断第二终端是否处于被呼叫状态(例如判断第二终端是否处于被第三终端呼叫的状态);例如,如果目标网关确定第二终端与第一终端之间的通话连接1处于呼叫保持状态,则目标网关确定第二终端处于被呼叫状态;如果目标网关确定第二终端与第一终端之间的通话连接1未处于呼叫保持状态,则目标网关确定第二终端未处于被呼叫状态。其中,发往第一终端的音频流可以是第二终端发往第一终端的音频流,也可以是电话交换系统发往第一终端的音频流。例如,在第二终端处于被呼叫状态时,第二终端与第一终端之间的通话连接处于呼叫保持状态,电话交换系统可以向第一终端发送被呼叫提示音,则发往第一终端的音频流可以是电话交换系统发往第一终端的该被呼叫提示音。
在本申请实施例中,通信系统可以包括至少一个会议终端和至少一个电话终端;该至少一个会议终端与目标网关通信连接,并且该至少一个会议终端和该目标网关都位于多媒体会议系统中;该至少一个电话终端通过电话交换系统与目标网关通信连接,该至少一个电话终端可以通过电话交换系统接入多媒体会议系统。其中,当该至少一个电话终端为多个电话终端时,该多个电话终端可以通过一个电话交换系统与目标网关通信连接,也可以通过至少两个电话交换系统与目标网关通信连接。其中,会议终端指的是通过会议应用(或称为会议客户端)接入该多媒体会议系统的终端,会议终端可以是手机、上网本、笔记本电脑、平板电脑等。电话终端指的是通过电话交换系统接入该多媒体会议系统的终端,电话终端可以是手机、固话终端等。电话交换系统可以是PSTN或基于PBX构建的私有网络,电话交换网络也称为电话交换系统、电话接入系统等。第二终端可以是该至少一个电话终端中的任意一个,第一终端的数量可以是一个或多个,例如该至少一个会议终端均为第一终端。
多媒体会议系统中通常包括媒体网关、信令网关、媒体服务器、会议终端等设备,媒体服务器可以是选择性转发单元(selective forwarding unit,SFU)。本申请所述的目标网关可以是媒体网关,或者,是集成有媒体网关、信令网关和媒体服务器这三者的功能的网关,例如,目标网关是集成有信令网关和媒体服务器这两者的功能的媒体网关。可选的实施例中,电话交换系统通过接入网关与目标网关通信连接,该接入网关用于电话交换系统接入目标网关,从而使得该电话交换系统接入多媒体会议系统。根据电话交换系统的类型的不同,该接入网关可以是PSTN接入网关或PBX,本申请实施例对此不做限定。
作为本申请的一个示例,请参考图2,其示出了本申请实施例提供的另一种应用场景的示意图。该应用场景以媒体网关集成有信令网关和媒体服务器的功能(也即,目标网关是集成有信令网关和媒体服务器这两者的功能的媒体网关)为例说明。如图2所示,该应用场景提供的通信系统包括多媒体会议系统、电话交换系统101、接入网关102和至少一个电话终端103(图2以一个电话终端103为例说明)。该多媒体会议系统中包括媒体网关104和至少一个会议终端105(图2以两个会议终端105为例说明),该至少一个会议终端105与媒体网关104通信连接。电话终端103、电话交换系统101、接入网关102依次连接,接入网关102与媒体网关104通信连接,接入网关102用于电话交换系统101接入媒体网关104,从而使得与该电话交换系统101通信连接的电话终端103接入多媒体会议系统,接入网关102 用于在电话交换系统101与媒体网关104之间进行媒体流(例如音频流)转发。其中,第一终端可以是会议终端105,第二终端可以是电话终端103,图2所示的通信系统包括两个第一终端。媒体网关104可以通过电话交换系统101和接入网关102接收发往会议终端105的音频流,并对该音频流进行去噪处理,以去除该音频流中电话终端103被呼叫的噪音。可选的,媒体网关104先确定电话终端103处于被呼叫状态(例如确定电话终端103处于被除会议终端105之外的终端呼叫的状态),之后媒体网关104对发往会议终端105的音频流进行去噪处理,以去除该音频流中电话终端103被呼叫的噪音。
可选的实施例中,多媒体会议系统还包括会议管理设备106,会议管理设备106用于管理多媒体会议所用的媒体资源(例如会议号等),控制电话终端、会议终端等接入多媒体会议,控制媒体流的路由,进行媒体流的调度指示等。例如,会议管理设备106与媒体网关104通信连接,会议管理设备106可以向媒体网关104发送检测指示信息(例如该检测指示信息包括电话终端103的电话号码和检测标识),以指示媒体网关104对发往会议终端105的与电话终端103相关的媒体流(例如音频流)进行检测,从而媒体网关104在电话终端103处于被呼叫状态时,对发往会议终端105的与电话终端103相关的音频流进行去噪处理,以去除该音频流中电话终端103被呼叫的噪音。其中,电话终端103处于被呼叫状态时,电话终端103与会议终端105之间的通话连接处于呼叫保持状态。换句话说,电话终端103与会议终端105之间的通话连接处于呼叫保持状态,可能是由于电话终端103开展了其他电话业务,例如,电话终端103接收了其他终端的电话呼叫,电话终端103处于被该其他终端呼叫的状态。需要说明的是,图2为了简洁,仅示出了会议管理设备106与媒体网关104的连接关系,会议管理设备106还可以与会议终端105、电话终端103等连接,会议管理设备106具体与哪些设备或终端连接可以根据实际需要设置,本申请实施例对此不做限定。
在一个可选实施例中,电话终端103处于被呼叫状态(例如电话终端103处于被会议终端105之外的终端呼叫的状态)时,电话终端103与会议终端105之间的通话连接处于呼叫保持状态,电话交换系统101可以触发接入网关102向媒体网关104发送通知消息,媒体网关104根据该通知消息确定电话终端103与会议终端之间的通话连接处于呼叫保持状态,从而确定电话终端103处于被呼叫状态。其中,该通知消息可以是会话初始协议(session initiation protocol,SIP)信令、私有协议信令或携带特殊标识或特殊信息的媒体数据包。
在另一个可选实施例中,媒体网关104根据发往会议终端105的与电话终端103相关的音频流的特征参数确定电话终端103处于被呼叫状态。例如,电话终端103处于被呼叫状态(例如电话终端103处于被会议终端105之外的终端呼叫的状态)时,电话终端103与会议终端105之间的通话连接处于呼叫保持状态时,电话交换系统101可以向多媒体会议系统发送电话终端103的被呼叫提示音(或者说是对电话终端103与会议终端105之间的通话连接进行呼叫保持的呼叫保持提示音),也即,电话交换系统101可以向多媒体会议系统发送音频流;媒体网关104可以根据该音频流的特征参数确定电话终端103与会议终端105之间的通话连接处于呼叫保持状态,从而确定电话终端103处于被呼叫状态。示例的,该音频流的特征参数包括以下至少一种:该音频流的音频特征、该音频流的数据包的数据特征。作为一个示例,媒体网关104根据该音频流的数据包的数据特征确定电话终端103与会议终端105之间的通话连接处于呼叫保持状态,从而确定电话终端103处于被呼叫状态。例如,媒体网关104根据该音频流的数据包包含第一标识确定电话终端103与会议终端105之间的通话连 接处于呼叫保持状态,从而确定电话终端103处于被呼叫状态。作为另一个示例,媒体网关104根据该音频流的音频特征确定电话终端103与会议终端105之间的通话连接处于呼叫保持状态,从而确定电话终端103处于被呼叫状态。例如,媒体网关104根据该音频流包括指定语音片段确定电话终端103与会议终端105之间的通话连接处于呼叫保持状态,从而确定电话终端103处于被呼叫状态;或者,媒体网关104根据该音频流包括的语音片段与指定语音片段的相似度大于相似度阈值确定电话终端103与会议终端105之间的通话连接处于呼叫保持状态,从而确定电话终端103处于被呼叫状态。
可选的实施例中,所述通信系统还包括音频识别设备107,音频识别设备107与媒体网关104通信连接,媒体网关104可以将发往会议终端105的与电话终端103相关的音频流发送给音频识别设备107,音频识别设备107可以对媒体网关104发送的音频流进行音频识别得到该音频流的音频特征,并将该音频流的音频特征发送给媒体网关104,使得媒体网关104根据该音频流的音频特征确定电话终端103处于被呼叫状态。例如,媒体网关104将该音频流解码成音频裸码流文件,通常是脉冲编码调制(pulse code modulation,PCM)文件,并将该音频流的PCM文件发送给音频识别设备107,音频识别设备107根据该音频流的PCM文件对该音频流进行音频识别。其中,音频识别设备107可以是自动语音识别(automatic speech recognition,ASR)设备,也可以是其他的音频识别设备。
在图2所示的通信系统中,媒体网关104集成有信令网关和媒体服务器的功能(也即,媒体网关104包括媒体网关的功能、信令网关的功能和媒体服务器的功能)。可选的,媒体网关104包括信令模块、媒体模块、音频处理模块、存储模块等,信令模块用于媒体网关104与会议管理设备106、接入网关102等进行信令交互,例如,信令模块用于媒体网关104接收会议管理设备106发送的调度信令,并向会议管理设备106上报会议终端105、电话终端103等的接入情况,以及,信令模块还用于媒体网关104与接入网关102进行SIP信令或者私有协议信令的交互;媒体模块用于媒体网关104与会议终端105、接入网关102等进行媒体交互,例如,媒体模块用于媒体网关104基于实时传输协议(real-time transport protocol,RTP)接收接入网关102发送的音频流,并基于RTP向会议终端105发送该音频流,以及,基于RTP接收任一会议终端105发送的媒体流,并基于RTP向接入网关102以及其他会议终端105发送该媒体流;音频处理模块用于媒体网关104将发往会议终端105的音频流解码成PCM文件,并向音频识别设备107发送该音频流的PCM文件,以及,接收音频识别设备107发送的音频特征,根据音频识别设备107发送的音频特征确定电话终端103与会议终端105之间的通话连接处于呼叫保持状态;存储模块用于存储指定音频特征(例如指定语音片段或相似度阈值),该指定音频特征可以以文本的方式存储在存储模块中。其中,音频识别设备107可以包括音频收发模块和音频识别模块,音频收发模块用于音频识别设备107接收媒体网关104发送的音频流,并将该音频流提供给音频识别模块进行分析识别,音频识别模块用于对接收到的音频流进行分析识别,得到该音频流的音频特征。接入网关102可以包括信令模块和媒体模块,该信令模块用于接入网关102与电话交换系统101、媒体网关104等进行信令交互,该媒体模块用于接入网关102与电话交换系统101、媒体网关104等进行媒体交互,例如,该媒体模块用于接入网关102与电话交换系统101、媒体网关104等基于RTP进行媒体交互。本申请关于媒体网关104、音频识别设备107、接入网关102等的功能模块的划分仅仅是示例性的,媒体网关104、音频识别设备107、接入 网关102还可能包括其他功能模块,本申请实施例对此不做限定。
图2以媒体网关集成有信令网关和媒体服务器的功能为例说明。在一些实施例中,信令网关、媒体网关、媒体服务器中的至少两个独立部署。例如,请参考图3,其示出了本申请实施例提供的再一种应用场景的示意图。该应用场景以信令网关、媒体网关、媒体服务器是三台独立的设备为例说明。与图2所示应用场景不同的是,图3所示的应用场景中,多媒体会议系统中还包括信令网关108和媒体服务器109,信令网关108与接入网关102、会议管理设备106分别连接,信令网关108用于在接入网关102和会议管理设备106之间进行信令交互,媒体服务器109与媒体网关104、会议终端105分别连接,媒体服务器109用于在媒体网关104与会议终端105之间进行媒体交互。对于图3所示的应用场景,媒体网关104可以包括媒体模块和音频处理模块,而不包括信令模块,信令相关的处理功能可以由信令网关108实现,本申请实施例对此不做限定。
图1至图3所示的应用场景仅用于举例,并非用于限制本申请技术方案。在实现过程中,可以根据需要配置会议终端、电话终端的数量,并且该应用场景还可能包括其他的设备,或者该应用场景包括比图1至图3所示更少的设备,本申请实施例对此不做限定。
以上是对本申请的应用场景的介绍,下面介绍本申请的方法实施例。
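Refer to FIG. 4, which shows a flowchart of a control method for a multimedia conference according to an embodiment of this application. The method is applied to a target gateway, for example, the media gateway in FIG. 2 or FIG. 3. As shown in FIG. 4, the method includes S401 and S402.
S401. The target gateway receives, through the telephone switching system, audio stream A destined for the first terminal.
In this embodiment of this application, the target gateway is communicatively connected to the first terminal, and is communicatively connected to the second terminal through the telephone switching system. The target gateway can receive, through the telephone switching system, audio stream A destined for the first terminal. Audio stream A is an audio stream related to the second terminal and may carry an identifier of the second terminal, for example, the stream number of the second terminal. The stream number of the second terminal is a stream number assigned to the second terminal, for example, by the conference management device, or a stream number determined through signaling negotiation between the target gateway and the second terminal, the telephone switching system, the access gateway, and so on. For example, as shown in FIG. 2 or FIG. 3, the target gateway is media gateway 104, the first terminal is conference terminal 105, and the second terminal is telephone terminal 103. The target gateway (media gateway 104) receives, through telephone switching system 101 and access gateway 102, audio stream A destined for conference terminal 105. Audio stream A is related to telephone terminal 103 and reaches the target gateway after being forwarded at least by access gateway 102.
In this embodiment, audio stream A may be an audio stream sent by the second terminal to the first terminal, or an audio stream sent by the telephone switching system to the first terminal. In other words, audio stream A may originate from, and be generated by, either the second terminal or the telephone switching system. In one example, after the second terminal accesses the multimedia conference system, call connection 1 is established between the second terminal and the first terminal. While call connection 1 is active, the second terminal can capture audio (for example, the voice of its user or the sound in its environment) and send the audio stream to the first terminal over call connection 1; in this scenario, audio stream A may be the audio stream sent by the second terminal to the first terminal. In another example, while call connection 1 is active, the second terminal establishes call connection 2 with another terminal (for example, a third terminal) because it carries out another telephone service. The second terminal is then in the called state (for example, being called by the third terminal), and the telephone switching system places call connection 1 on hold (that is, puts call connection 1 in the call hold state, which may also be called the deactivated state). The telephone switching system may send the called prompt tone of the second terminal (also called the call hold prompt tone of call connection 1) to the first terminal, that is, it may send an audio stream to the first terminal; in this scenario, audio stream A may be the audio stream sent by the telephone switching system to the first terminal.
In an optional embodiment, audio stream A is the audio stream that the telephone switching system sends to the first terminal while the second terminal is in the called state and call connection 1 is in the call hold state. The data packets of audio stream A received by the target gateway may carry a first identifier, which indicates that the second terminal is in the called state. For example, the first identifier indicates that call connection 1 is in the call hold state and thereby indicates that the second terminal is in the called state. The first identifier may be carried in the data packets of audio stream A by the telephone switching system, or added by the access gateway to the data packets of audio stream A that it receives. For example, the called prompt tone of the second terminal (also called the call hold prompt tone of call connection 1) may be “您所拨打的用户正在通话中” (“the user you have dialed is on another call”), and the first identifier may be “holdflag”, which indicates that call connection 1 is in the call hold state and thereby that the second terminal is in the called state.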
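S402. The target gateway performs denoising on audio stream A, where the denoising removes from audio stream A the noise of the second terminal being called.
In an optional embodiment, audio stream A includes the called prompt tone of the second terminal. While the second terminal is in the called state, call connection 1 between the second terminal and the first terminal is in the call hold state, and the second terminal does not take part in the multimedia conference (that is, no media is transmitted between the second terminal and the first terminal in the multimedia conference system). The first terminal, however, may still be taking part in the multimedia conference. If audio stream A reached the first terminal and were played, the called prompt tone of the second terminal in audio stream A would tend to disturb the first terminal; for the first terminal, the called prompt tone can therefore be regarded as noise. For example, if audio stream A is the called prompt tone of the second terminal, audio stream A as a whole is noise for the first terminal.
In this embodiment, the target gateway can perform denoising on audio stream A to remove the noise of the second terminal being called (that is, the called prompt tone of the second terminal, also called the call hold prompt tone of call connection 1), so that this noise does not disturb the first terminal.
In an optional embodiment, the denoising performed by the target gateway on audio stream A has the following three possible implementations.
First implementation: the target gateway intercepts audio stream A.
That is, the target gateway does not forward audio stream A to the first terminal. For example, the target gateway discards the data packets of audio stream A. Intercepting audio stream A prevents it from reaching the first terminal, so the first terminal does not play it and the noise of the second terminal being called in audio stream A does not disturb the first terminal.
Second implementation: the target gateway replaces the data packets of audio stream A with silence packets and sends the silence packets to the first terminal. For example, as shown in FIG. 3, the target gateway is media gateway 104 and the first terminal is conference terminal 105, and the target gateway (media gateway 104) sends the silence packets to the first terminal (conference terminal 105) through media server 109.
A silence packet satisfies either of the following: it contains no audio data, or it contains audio data that cannot produce a physically perceptible sound. For example, a silence packet is a data packet encapsulated according to an audio protocol and format whose payload is empty. An empty payload may mean that the silence packet has no payload, or that it has a payload whose data is all zeros; when played, the silence packet produces no sound and cannot be physically perceived.
Because the target gateway replaces the data packets of audio stream A with silence packets and sends the silence packets to the first terminal, it does not send audio stream A itself to the first terminal. Audio stream A therefore does not reach the first terminal and is not played, and the noise of the second terminal being called in audio stream A does not disturb the first terminal.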
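As a rough illustration of the second implementation, the following Python sketch replaces the payload of a packet with zero bytes while leaving its header untouched. It assumes, purely for illustration, that audio stream A is carried in RTP version 2 packets with the 12-byte fixed header plus optional CSRC entries (the embodiments mention RTP but do not prescribe this layout). The function name `to_silence_packet` and the `fill` parameter are hypothetical. Which byte value actually decodes to silence depends on the codec, so `fill` is left configurable.

```python
import struct

RTP_FIXED_HEADER_LEN = 12  # fixed RTP header, without CSRC entries


def to_silence_packet(rtp_packet: bytes, fill: int = 0x00) -> bytes:
    """Return a copy of rtp_packet whose payload bytes are all `fill`.

    Sketch only: header extensions and padding are not handled.
    """
    if len(rtp_packet) < RTP_FIXED_HEADER_LEN:
        raise ValueError("packet shorter than the fixed RTP header")
    first_byte = rtp_packet[0]
    if first_byte >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    if first_byte & 0x10:
        raise ValueError("header extensions are not handled in this sketch")
    csrc_count = first_byte & 0x0F
    header_len = RTP_FIXED_HEADER_LEN + 4 * csrc_count
    if len(rtp_packet) < header_len:
        raise ValueError("packet shorter than its declared header")
    payload_len = len(rtp_packet) - header_len
    # Keep sequence number, timestamp and SSRC so the receiver's jitter
    # buffer still sees a continuous stream; only the audio data changes.
    return rtp_packet[:header_len] + bytes([fill]) * payload_len


# Hypothetical packet: V=2, payload type 0, seq 1, timestamp 160, SSRC 0x1234.
sample_header = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234)
sample_packet = sample_header + b"\x11" * 160
silenced = to_silence_packet(sample_packet)
```

Keeping the header intact is a design choice of this sketch: the first terminal continues to receive a well-formed stream and simply plays nothing.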
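Third implementation: the target gateway adds a second identifier to the data packets of audio stream A and sends the data packets of audio stream A that contain the second identifier to the first terminal, where the second identifier instructs the first terminal not to play audio stream A.
Because the data packets of audio stream A received by the first terminal contain the second identifier, which instructs the first terminal not to play audio stream A, the first terminal does not play audio stream A, and the noise of the second terminal being called in audio stream A does not disturb the first terminal.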
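The three implementations can be compared side by side in a small sketch. The packet representation (a dictionary with a `payload` field), the `do_not_play` key standing in for the second identifier, and the names `DenoiseMode` and `denoise` are all assumptions made for illustration; the embodiments do not specify how the second identifier is encoded.

```python
from enum import Enum
from typing import Optional


class DenoiseMode(Enum):
    INTERCEPT = "intercept"  # first implementation: drop the packet
    SILENCE = "silence"      # second implementation: send a silence packet
    MARK = "mark"            # third implementation: add the second identifier


def denoise(packet: dict, mode: DenoiseMode) -> Optional[dict]:
    """Return the packet to send to the first terminal, or None to send nothing."""
    if mode is DenoiseMode.INTERCEPT:
        return None
    if mode is DenoiseMode.SILENCE:
        # Same length, all-zero payload.
        return {**packet, "payload": bytes(len(packet["payload"]))}
    if mode is DenoiseMode.MARK:
        # Hypothetical encoding of the second identifier.
        return {**packet, "do_not_play": True}
    raise ValueError(f"unknown mode: {mode}")


prompt_packet = {"stream": "A", "payload": b"\x01\x02\x03"}
```

With interception nothing is sent; with the other two modes the first terminal still receives packets, but either their content is silent or the terminal is told not to play them.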
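In summary, in the control method for a multimedia conference provided in this embodiment of this application, after receiving, through the telephone switching system, the audio stream destined for the first terminal, the target gateway performs denoising on the audio stream to remove the noise of the second terminal being called. This prevents that noise from disturbing the first terminal and thus from affecting the multimedia conference.
Before performing denoising on audio stream A, the target gateway may determine that the second terminal is in the called state. In this embodiment of this application, this determination may be made according to either of the following two optional embodiments.
In one optional embodiment, still referring to FIG. 4, the method further includes the following step S403a before S402.
S403a. The target gateway determines, based on a characteristic parameter of audio stream A, that the second terminal is in the called state.
The characteristic parameter of audio stream A includes at least one of the following: an audio feature of audio stream A, or a data feature of the data packets of audio stream A.
In one example, the characteristic parameter of audio stream A includes the data feature of the data packets of audio stream A, and the target gateway determines from this data feature that the second terminal is in the called state. Optionally, the target gateway determines that the second terminal is in the called state based on the data packets of audio stream A containing a first identifier, where the first identifier indicates that the second terminal is in the called state. For example, the first identifier indicates that call connection 1 between the second terminal and the first terminal is in the call hold state and thereby indicates that the second terminal is in the called state. The target gateway checks whether the data packets of audio stream A contain the first identifier. If they do, the target gateway determines that call connection 1 is in the call hold state and therefore that the second terminal is in the called state; if they do not, the target gateway determines that call connection 1 is not in the call hold state and therefore that the second terminal is not in the called state.
In another example, the characteristic parameter of audio stream A includes the audio feature of audio stream A, the audio feature includes a speech segment contained in audio stream A, and the target gateway determines from that speech segment that the second terminal is in the called state. Optionally, the target gateway compares the speech segment contained in audio stream A with a specified speech segment and makes the determination based on the comparison result. For example, the target gateway checks whether the speech segment contained in audio stream A includes the specified speech segment; if it does, the target gateway determines that the second terminal is in the called state, and if it does not, the target gateway determines that the second terminal is not in the called state. Alternatively, the target gateway checks whether the similarity between the speech segment contained in audio stream A and the specified speech segment is greater than a similarity threshold; if it is, the target gateway determines that the second terminal is in the called state, and otherwise the target gateway determines that the second terminal is not in the called state. The specified speech segment describes the second terminal being in the called state; for example, it describes call connection 1 between the second terminal and the first terminal being in the call hold state and thereby the second terminal being in the called state. For example, the specified speech segment is “您所拨打的用户正在通话中” (“the user you have dialed is on another call”).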
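The two comparison rules just described (containment, or similarity above a threshold) can be sketched as follows. This sketch assumes that the audio feature is a text transcript of the speech segment, and it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; the embodiments do not say how similarity is computed. The threshold value of 0.8 and the function name `is_called_prompt` are likewise illustrative assumptions.

```python
from difflib import SequenceMatcher

# Specified speech segment taken from the example in the description.
SPECIFIED_SEGMENT = "您所拨打的用户正在通话中"


def is_called_prompt(transcript: str,
                     specified: str = SPECIFIED_SEGMENT,
                     threshold: float = 0.8) -> bool:
    """Return True if the transcript indicates the called prompt tone."""
    text = "".join(transcript.split())   # drop whitespace
    target = "".join(specified.split())
    # Rule 1: the speech segment includes the specified speech segment.
    if target in text:
        return True
    # Rule 2: the similarity is greater than the similarity threshold.
    similarity = SequenceMatcher(None, text, target).ratio()
    return similarity > threshold
```

For instance, `is_called_prompt("您所拨打的用户正在通话中,请稍后再拨")` is accepted by the containment rule, while a transcript with one character dropped by the recognizer can still be accepted by the similarity rule.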
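In an optional embodiment, before determining from the audio feature of audio stream A that the second terminal is in the called state, the target gateway obtains the audio feature of audio stream A. For example, the target gateway sends audio stream A to an audio recognition device and receives the audio feature of audio stream A from it, where the audio recognition device performs audio recognition on audio stream A to obtain the audio feature. Optionally, the target gateway decodes audio stream A into a raw audio stream file, for example, a PCM file, and sends the PCM file of audio stream A to the audio recognition device, which performs audio recognition based on the PCM file to obtain the audio feature of audio stream A. For example, the audio recognition device uses an audio recognition model: it feeds the PCM file of audio stream A into the model, and the model processes the PCM file, obtains the audio feature of audio stream A, and outputs it.
In another optional embodiment, referring to FIG. 5, the method further includes the following steps S403b and S404b before S402.
S403b. The target gateway receives a target signaling message.
The target signaling message is a signaling message related to the second terminal and may carry an identifier of the second terminal, for example, the telephone number of the second terminal. For example, the target gateway, the access gateway, the telephone switching system, and the second terminal are connected in sequence, and the target gateway receives the target signaling message from the access gateway. As shown in FIG. 2 or FIG. 3, the target gateway is media gateway 104 and the second terminal is telephone terminal 103; media gateway 104 receives from access gateway 102 the target signaling message, which carries the telephone number of telephone terminal 103. The target signaling message may be a SIP message or a signaling message based on a proprietary protocol, which is not limited in this embodiment of this application.
In this embodiment, the target signaling message may be a negotiation message. For example, after the second terminal accesses the multimedia conference system, call connection 1 is established between the second terminal and the first terminal in the multimedia conference system. While call connection 1 is active, if the second terminal establishes (or needs to establish) call connection 2 with another terminal because it carries out another telephone service, the second terminal is in the called state (that is, being called by the other terminal) and needs to place call connection 1 on hold. The second terminal therefore sends a first negotiation message to the telephone switching system to negotiate the call hold of call connection 1. After receiving the first negotiation message, the telephone switching system sends, based on it, a second negotiation message to the access gateway to negotiate the call hold of call connection 1 with the access gateway. After receiving the second negotiation message, the access gateway sends, based on it, the target signaling message to the target gateway to negotiate the call hold of call connection 1 with the target gateway. The target gateway receives the target signaling message from the access gateway. The first negotiation message, the second negotiation message, and the target signaling message all carry the specified signaling information and the identifier of the second terminal, to indicate that the second terminal is in the called state.
For example, the first negotiation message, the second negotiation message, and the target signaling message are all SIP messages and all carry the specified signaling information “a=sendonly”. “a=sendonly” is signaling information used in the call hold flow given in Request for Comments (RFC) 5359. It indicates that, while the call connection between the calling party and the called party is on hold, the called party still sends media, for example, a voice prompt telling the calling party that the called party is in the called state (or that the call connection between the calling party and the called party is in the call hold state). The specified signaling information “a=sendonly” carried in the target signaling message can therefore indicate that the call connection between the second terminal and the first terminal is in the call hold state and thereby that the second terminal is in the called state. In this embodiment, the first negotiation message, the second negotiation message, and the target signaling message may be one and the same signaling message or three different signaling messages. If they are the same signaling message, that message originates from the second terminal; on receiving it, the telephone switching system and the access gateway can each perform the related processing and then forward it.
S404b. The target gateway determines that the second terminal is in the called state based on the target signaling message containing the specified signaling information, where the specified signaling information indicates that the second terminal is in the called state.
Optionally, the target gateway checks whether the target signaling message contains the specified signaling information. If the target signaling message contains the specified signaling information, the target gateway determines that the second terminal is in the called state; if the target signaling message does not contain the specified signaling information, the target gateway determines that the second terminal is not in the called state. The specified signaling information indicates that the second terminal is in the called state; for example, the specified signaling information is a=sendonly, which indicates that call connection 1 between the second terminal and the first terminal is in the call hold state and thereby that the second terminal is in the called state.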
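The check in S404b can be sketched for the SIP case as follows. The sketch scans the SDP body of a message for a direction attribute and treats `a=sendonly` as the specified signaling information, as in the example above. The helper names and the sample message (including the `example.invalid` address and port number) are hypothetical; when no direction attribute is present, the sketch falls back to `sendrecv`, which is the SDP default.

```python
DIRECTIONS = {"sendrecv", "sendonly", "recvonly", "inactive"}


def sdp_direction(message: str) -> str:
    """Return the last media direction attribute found in the message."""
    direction = "sendrecv"  # SDP default when no attribute is present
    for raw_line in message.splitlines():
        line = raw_line.strip()
        if line.startswith("a=") and line[2:] in DIRECTIONS:
            direction = line[2:]
    return direction


def indicates_called_state(message: str) -> bool:
    """True if the message carries the specified signaling information."""
    return sdp_direction(message) == "sendonly"


# Hypothetical target signaling message, for illustration only.
hold_message = (
    "INVITE sip:conference@example.invalid SIP/2.0\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
    "a=sendonly\r\n"
)
resume_message = hold_message.replace("a=sendonly\r\n", "")
```

A production gateway would parse the message with a SIP stack and evaluate the attribute per media section; the line scan here only illustrates the decision rule.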
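The embodiment shown in FIG. 5 uses the example in which the access gateway notifies the target gateway, by means of the target signaling message, that the second terminal is in the called state. The access gateway may also notify the target gateway in other ways, for example, through an interface callback or a publish/subscribe mechanism. That is, the access gateway may invoke an interface for communicating with the target gateway to notify it that the second terminal is in the called state, or, if the target gateway has subscribed to the relevant notifications from the access gateway, the access gateway notifies the target gateway that the second terminal is in the called state. This is not limited in this embodiment of this application.
In this embodiment of this application, call connection 1 is established between the second terminal and the first terminal in the multimedia conference system. While call connection 1 is active, if the second terminal establishes call connection 2 with a terminal other than the first terminal (for example, a third terminal) for any reason, the second terminal is in the called state (for example, being called by the third terminal), and call connection 1 between the second terminal and the first terminal is placed on hold (that is, deactivated). When the second terminal tears down call connection 2 with the third terminal, or call connection 2 is placed on hold, the second terminal can leave the called state (for example, it is no longer being called by the third terminal). Call connection 1 can then be reactivated so that the second terminal and the first terminal can transmit media over call connection 1.
In one optional embodiment, when the second terminal leaves the called state, it sends a third negotiation message to the telephone switching system to negotiate cancelling the call hold state of call connection 1 between the second terminal and the first terminal. After receiving the third negotiation message, the telephone switching system sends, based on it, a fourth negotiation message to the access gateway to negotiate cancelling the call hold state of call connection 1 with the access gateway. After receiving the fourth negotiation message, the access gateway sends, based on it, a fifth negotiation message to the target gateway to negotiate cancelling the call hold state of call connection 1 with the target gateway. The target gateway determines from the fifth negotiation message that the call hold state of call connection 1 is cancelled and therefore that the second terminal has left the called state. The third, fourth, and fifth negotiation messages all carry the identifier of the second terminal and do not carry the specified signaling information, which indicates that the call hold state of call connection 1 is cancelled and thereby that the second terminal has left the called state. After determining that the second terminal has left the called state, the target gateway forwards to the first terminal any audio stream related to the second terminal that it receives through the telephone switching system and that is destined for the first terminal.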
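Entering and leaving the called state can be viewed as a small per-terminal state machine at the target gateway. The sketch below assumes SIP-style messages in which the presence of the line `a=sendonly` is the specified signaling information and its absence signals that the call hold state is cancelled; it applies the first denoising implementation (interception) while a terminal is in the called state. The class name `CalledStateTracker`, its methods, and the telephone number used afterwards are hypothetical.

```python
from typing import Optional, Set

SPECIFIED_SIGNALING_INFO = "a=sendonly"


class CalledStateTracker:
    """Tracks which telephone terminals are currently in the called state."""

    def __init__(self) -> None:
        self._called: Set[str] = set()

    def on_negotiation(self, terminal_id: str, message: str) -> None:
        """Update the state from a negotiation message for terminal_id."""
        lines = [line.strip() for line in message.splitlines()]
        if SPECIFIED_SIGNALING_INFO in lines:
            self._called.add(terminal_id)       # call connection 1 on hold
        else:
            self._called.discard(terminal_id)   # call hold state cancelled

    def is_called(self, terminal_id: str) -> bool:
        return terminal_id in self._called

    def forward(self, terminal_id: str, packet: bytes) -> Optional[bytes]:
        """Return the packet to forward to the first terminal, or None."""
        if self.is_called(terminal_id):
            return None  # intercept the called prompt tone
        return packet
```

A usage example: create `tracker = CalledStateTracker()`, call `tracker.on_negotiation("13800000000", "v=0\r\na=sendonly\r\n")` when the hold is negotiated, and `tracker.forward("13800000000", packet)` returns `None` until a later negotiation message without `a=sendonly` arrives.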
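In another optional embodiment, after the second terminal leaves the called state, it is no longer in the called state and can capture an audio stream (referred to as audio stream B) and send it to the first terminal over call connection 1. After receiving audio stream B, the target gateway can determine from a characteristic parameter of audio stream B that the second terminal has left the called state (that is, is not in the called state), and can forward audio stream B to the first terminal. For example, the characteristic parameter of audio stream B includes at least one of the following: an audio feature of audio stream B, or a data feature of the data packets of audio stream B. The target gateway can determine that the second terminal has left the called state based on the data packets of audio stream B not containing the first identifier, or based on the speech segment contained in audio stream B not including the specified speech segment, or based on the similarity between the speech segment contained in audio stream B and the specified speech segment not being greater than the similarity threshold.
In this embodiment of this application, the multimedia conference system further includes a conference management device, which is communicatively connected to the target gateway (for example, a media gateway). After the second terminal accesses the multimedia conference system, the conference management device can send detection indication information to the target gateway to instruct it to inspect audio streams that are related to the second terminal and destined for the first terminal. By inspecting these audio streams, the target gateway can determine from their characteristic parameters whether the second terminal is in the called state. For example, the detection indication information includes the identifier of the second terminal and a detection identifier, to instruct the target gateway to inspect audio streams related to the second terminal. The detection indication information may also instruct the target gateway in other ways, which is not limited here.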
为了便于理解本申请的技术方案,下面以目标网关是媒体网关为例,结合图2中不同设备之间的交互描述本申请的技术方案。
本申请实施例中,第二终端(例如电话终端103)接入多媒体会议系统之后,第二终端与该多媒体会议系统中的第一终端(例如会议终端105)之间建立通话连接1,在通话连接1处于激活状态时,第二终端与第一终端通过通话连接1进行媒体传输。在通话连接1处于激活状态时,若第二终端开展其他电话业务(例如接听第三终端的电话呼叫),第二终端处于被呼叫状态(也即是处于被第三终端呼叫的状态),第二终端与第三终端建立通话连接2,电话交换系统会对通话连接1进行呼叫保持,使通话连接1处于呼叫保持状态。在第二终端处于被呼叫状态时,目标网关(即媒体网关104)可以对发往第一终端的媒体流进行去噪处理,以去除该媒体流中第二终端被呼叫的噪音。第二终端结束其他电话业务之后,第二终端与第三终端之间的通话连接2断开或呼叫保持,第二终端取消被呼叫状态,可以重新激活第二终端与第一终端之间的通话连接1,使得第二终端与第一终端可以通过通话连接1进行媒体传输。因此,本申请实施例的技术方案涉及第二终端接入多媒体会议系统的阶段、第二终端处于被呼叫状态的阶段(或者说第二终端与第一终端之间的通话连接1处于呼叫保持状态的阶段)和第二终端取消被呼叫状态的阶段(或者说第二终端与第一终端之间的通话连接1处于激活状态的阶段)。下面结合附图,分这三个阶段介绍本申请实施例的技术方案。
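Refer to FIG. 6, which shows a flowchart of the second terminal accessing the multimedia conference system according to an embodiment of this application. As shown in FIG. 6, the process includes the following steps S601 to S614.
S601. The conference management device instructs the target gateway to connect the second terminal to the multimedia conference system.
Optionally, the conference management device sends access indication information to the target gateway to instruct it to connect the second terminal to the multimedia conference system. The access indication information may include an identifier of the second terminal, for example, the telephone number of the second terminal or other identification information indicating the second terminal.
For example, the conference management device sends the access indication information to the target gateway by using SIP signaling or proprietary-protocol signaling.
S602. The target gateway sends call request 1 to the access gateway as instructed by the conference management device.
Based on the instruction from the conference management device, the target gateway determines that the second terminal is to be connected to the multimedia conference system, and therefore sends call request 1 to the access gateway to request the access gateway to call the second terminal. Call request 1 includes the identifier of the second terminal and may be SIP signaling or proprietary-protocol signaling.
S603. The access gateway sends ringing response 1, corresponding to call request 1, to the target gateway.
After receiving call request 1, the access gateway determines from it that the second terminal is to be called, and sends ringing response 1, corresponding to call request 1, to the target gateway to inform it that the access gateway is about to call the second terminal and that the target gateway should wait for a subsequent response. Ringing response 1 may include the identifier of the second terminal, for example, its telephone number. Ringing response 1 may also include ringing signaling content, for example, “180”. Ringing response 1 may be SIP signaling or proprietary-protocol signaling.
S604. The access gateway sends call request 2 to the telephone switching system based on call request 1.
After determining that the second terminal is to be called, the access gateway sends call request 2 to the telephone switching system based on call request 1, to request the telephone switching system to call the second terminal. Call request 2 includes the identifier of the second terminal, for example, its telephone number. Call request 2 may be SIP signaling or proprietary-protocol signaling.
S605. The telephone switching system sends ringing response 2, corresponding to call request 2, to the access gateway.
After receiving call request 2, the telephone switching system determines from it that the second terminal is to be called, and sends ringing response 2, corresponding to call request 2, to the access gateway to inform it that the telephone switching system is about to call the second terminal and that the access gateway should wait for a subsequent response. Ringing response 2 may include the identifier of the second terminal. Ringing response 2 may also include ringing signaling content, for example, “180”. Ringing response 2 may be SIP signaling or proprietary-protocol signaling.
S606. The telephone switching system sends call request 3 to the second terminal based on call request 2.
After determining that the second terminal is to be called, the telephone switching system sends call request 3 to the second terminal based on call request 2, to call the second terminal. Call request 3 includes the identifier of the second terminal, for example, its telephone number. Call request 3 may be SIP signaling or proprietary-protocol signaling.
S607. The second terminal sends ringing response 3, corresponding to call request 3, to the telephone switching system.
After receiving call request 3, the second terminal sends ringing response 3, corresponding to call request 3, to the telephone switching system to tell it to wait for a subsequent response. The second terminal may also ring based on call request 3 to prompt its user to answer the call. Ringing response 3 includes the identifier of the second terminal, for example, its telephone number. Ringing response 3 may also include ringing signaling content, for example, “180”. Ringing response 3 may be SIP signaling or proprietary-protocol signaling.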
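S608. The second terminal sends call connected response 3, corresponding to call request 3, to the telephone switching system.
After determining that the user has answered the call, the second terminal may send call connected response 3, corresponding to call request 3, to the telephone switching system to inform it that the second terminal has answered the call from the telephone switching system. Call connected response 3 may include the identifier of the second terminal and may also include call connected signaling content, for example, “100”. Call connected response 3 may be SIP signaling or proprietary-protocol signaling.
S609. The telephone switching system sends connection acknowledgement 3, corresponding to call connected response 3, to the second terminal.
After receiving call connected response 3, the telephone switching system sends connection acknowledgement 3, corresponding to call connected response 3, to the second terminal to inform it that the telephone switching system has received call connected response 3. Connection acknowledgement 3 may be SIP signaling or proprietary-protocol signaling. After the telephone switching system sends connection acknowledgement 3 to the second terminal, bidirectional call connection 11 is successfully established between the telephone switching system and the second terminal.
S610. The telephone switching system sends call connected response 2, corresponding to call request 2, to the access gateway.
After receiving call connected response 3, the telephone switching system sends, based on it, call connected response 2, corresponding to call request 2, to the access gateway to inform it that the telephone switching system has answered the call from the access gateway. Call connected response 2 may be SIP signaling or proprietary-protocol signaling.
S611. The access gateway sends connection acknowledgement 2, corresponding to call connected response 2, to the telephone switching system.
After receiving call connected response 2, the access gateway sends connection acknowledgement 2, corresponding to call connected response 2, to the telephone switching system to inform it that the access gateway has received call connected response 2. Connection acknowledgement 2 may be SIP signaling or proprietary-protocol signaling. After the access gateway sends connection acknowledgement 2 to the telephone switching system, bidirectional call connection 12 is successfully established between the access gateway and the telephone switching system.
S612. The access gateway sends call connected response 1, corresponding to call request 1, to the target gateway.
After receiving call connected response 2, the access gateway sends, based on it, call connected response 1, corresponding to call request 1, to the target gateway to inform it that the access gateway has answered the call from the target gateway. Call connected response 1 may be SIP signaling or proprietary-protocol signaling.
S613. The target gateway sends connection acknowledgement 1, corresponding to call connected response 1, to the access gateway.
After receiving call connected response 1, the target gateway sends connection acknowledgement 1, corresponding to call connected response 1, to the access gateway to inform it that the target gateway has received call connected response 1. Connection acknowledgement 1 may be SIP signaling or proprietary-protocol signaling. After the target gateway sends connection acknowledgement 1 to the access gateway, bidirectional call connection 13 is successfully established between the target gateway and the access gateway.
Through steps S602 to S613, call connection 1 is successfully established between the second terminal and the first terminal, and the second terminal successfully accesses the multimedia conference system. Call connection 1 consists of call connection 11 between the telephone switching system and the second terminal, call connection 12 between the access gateway and the telephone switching system, call connection 13 between the target gateway and the access gateway, and call connection 10 between the first terminal and the target gateway. Call connection 10 was established between the first terminal and the target gateway when the first terminal accessed the multimedia conference system.
S614. The target gateway informs the conference management device that the second terminal has successfully accessed the multimedia conference system.
Optionally, the target gateway sends the access result of the second terminal to the conference management device to inform it that the second terminal has successfully accessed the multimedia conference system. The access result of the second terminal may be “access succeeded”.
After the second terminal accesses the multimedia conference system, the second terminal and the first terminal in the multimedia conference system can transmit audio streams over call connection 1. That is, the second terminal can transmit audio streams to the first terminal over call connection 1, and the first terminal can transmit audio streams to the second terminal over call connection 1.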
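The hop-by-hop pattern of steps S602 to S613 is regular enough to be generated programmatically, which can help when checking the order of messages in FIG. 6. The sketch below reproduces that order for the chain target gateway, access gateway, telephone switching system, second terminal: requests and ringing responses travel outward hop by hop, then connected responses and acknowledgements travel back. The function name and message labels are illustrative only and do not correspond to any real signaling library.

```python
from typing import List, Tuple

Message = Tuple[str, str, str]  # (sender, receiver, message name)


def call_setup_sequence(hops: List[str]) -> List[Message]:
    """List the call setup messages along a chain of hops, in order."""
    steps: List[Message] = []
    hop_count = len(hops) - 1
    # Outward leg: each hop forwards the call request and reports ringing.
    for i in range(hop_count):
        steps.append((hops[i], hops[i + 1], f"call request {i + 1}"))
        steps.append((hops[i + 1], hops[i], f"ringing response {i + 1}"))
    # Return leg: connected responses and acknowledgements, last hop first.
    for i in reversed(range(hop_count)):
        steps.append((hops[i + 1], hops[i], f"call connected response {i + 1}"))
        steps.append((hops[i], hops[i + 1], f"connection acknowledgement {i + 1}"))
    return steps


FIG6_HOPS = ["target gateway", "access gateway",
             "telephone switching system", "second terminal"]
fig6_steps = call_setup_sequence(FIG6_HOPS)
```

Iterating over `fig6_steps` yields twelve messages, one for each of steps S602 to S613.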
请参考图7,其示出了本申请实施例提供的一种第二终端处于被呼叫状态的流程图。图7主要介绍第二终端进入被呼叫状态的流程以及第二终端进入被呼叫状态之后第二终端相关的音频流的处理流程,且图7以目标网关根据信令消息确定第二终端处于被呼叫状态为例说明。如图7所示,第二终端处于被呼叫状态的流程包括如下步骤S701至S715。
S701.第二终端向电话交换系统发送重协商请求1。
第二终端接入多媒体会议系统之后,第二终端与该多媒体会议系统中的第一终端之间建立通话连接1,在通话连接1处于激活状态时,若第二终端因开展其他电话业务而与其他终端建立通话连接2,例如第二终端接听第三终端的电话呼叫,第二终端处于被呼叫状 态,第二终端可以对通话连接1进行呼叫保持。可选的,第二终端向电话交换系统发送重协商请求1,以与电话交换系统协商对通话连接1进行呼叫保持。其中,重协商请求1可以是SIP信令或私有协议的信令,重协商请求1可以包括第二终端的标识,还可以包括指定信令信息,例如该指定信令信息为“a=sendonly”,该指定信令信息用于指示对第二终端与第一终端之间的通话连接1进行呼叫保持,从而指示第二终端处于被呼叫状态。
S702.电话交换系统根据重协商请求1向接入网关发送重协商请求2。
电话交换系统接收到重协商请求1之后,电话交换系统根据重协商请求1确定对第二终端与第一终端之间的通话连接1进行呼叫保持,电话交换系统根据重协商请求1向接入网关发送重协商请求2,以与接入网关协商对通话连接1进行呼叫保持。其中,重协商请求2包括第二终端的标识,还可以包括指定信令信息,例如该指定信令信息为“a=sendonly”。示例的,重协商请求2是SIP信令或私有协议的信令,重协商请求2与重协商请求1是同一条信令。
S703.接入网关根据重协商请求2向目标网关发送重协商请求3。
接入网关接收到重协商请求2之后,接入网关根据重协商请求2确定对第二终端与第一终端之间的通话连接1进行呼叫保持,接入网关根据重协商请求2向目标网关发送重协商请求3,以与目标网关协商对通话连接1进行呼叫保持。其中,重协商请求3包括第二终端的标识,还可以包括指定信令信息,例如该指定信令信息为“a=sendonly”。示例的,重协商请求3是SIP信令或私有协议的信令,重协商请求3与重协商请求2是同一条信令。
S704.目标网关向接入网关发送对应于重协商请求3的重协商响应3。
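After receiving renegotiation request 3, the target gateway determines from it that call connection 1 between the second terminal and the first terminal is to be placed on hold, and sends renegotiation response 3, corresponding to renegotiation request 3, to the access gateway. Renegotiation response 3 includes the identifier of the second terminal and may also include call hold signaling content, for example, “a=recvonly”, which indicates that while call connection 1 between the second terminal and the first terminal is in the call hold state (that is, while the second terminal is in the called state), the first terminal only receives media streams and does not send them. Renegotiation response 3 may be SIP signaling or proprietary-protocol signaling.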
S705.接入网关向电话交换系统发送对应于重协商请求2的重协商响应2。
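After receiving renegotiation response 3, the access gateway sends, based on it, renegotiation response 2, corresponding to renegotiation request 2, to the telephone switching system. Renegotiation response 2 includes the identifier of the second terminal and may also include call hold signaling content, for example, “a=recvonly”. Renegotiation response 2 is SIP signaling or proprietary-protocol signaling, and renegotiation response 2 and renegotiation response 3 may be the same signaling message.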
S706.电话交换系统向第二终端发送对应于重协商请求1的重协商响应1。
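After receiving renegotiation response 2, the telephone switching system sends, based on it, renegotiation response 1, corresponding to renegotiation request 1, to the second terminal. Renegotiation response 1 includes the identifier of the second terminal and may also include call hold signaling content, for example, “a=recvonly”. Renegotiation response 1 is SIP signaling or proprietary-protocol signaling, and renegotiation response 1 and renegotiation response 2 may be the same signaling message.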
S707.第二终端向电话交换系统发送对应于重协商响应1的重协商确认1。
第二终端接收到重协商响应1之后,第二终端可以向电话交换系统发送对应于重协商响应1的重协商确认1,以向电话交换系统告知第二终端接收到了重协商响应1。第二终端 向电话交换系统发送重协商确认1之后,第二终端与电话交换系统之间的双向的通话连接11调整为从第二终端到电话交换系统的单向的通话连接11,第二终端可以通过该单向的通话连接11向电话交换系统发送音频流,但是电话交换系统不通过该单向的通话连接11向第二终端发送音频流。示例的,重协商确认1可以是SIP信令或私有协议的信令。
S708.电话交换系统向接入网关发送对应于重协商响应2的重协商确认2。
电话交换系统接收到重协商确认1之后,电话交换系统可以根据重协商确认1向接入网关发送对应于重协商响应2的重协商确认2,以向接入网关告知电话交换系统接收到了重协商响应2。电话交换系统向接入网关发送重协商确认2之后,电话交换系统与接入网关之间的双向的通话连接12调整为从电话交换系统到接入网关的单向的通话连接12,电话交换系统可以通过该单向的通话连接12向接入网关发送音频流,但是接入网关不通过该单向的通话连接12向电话交换系统发送音频流。示例的,重协商确认2是SIP信令或私有协议的信令,重协商确认2与重协商确认1可以是同一条信令。
S709.接入网关向目标网关发送对应于重协商响应3的重协商确认3。
接入网关接收到重协商确认2之后,接入网关可以根据重协商确认2向目标网关发送对应于重协商响应3的重协商确认3,以向目标网关告知接入网关接收到了重协商响应3。接入网关向目标网关发送重协商确认3之后,接入网关与目标网关之间的双向的通话连接13调整为从接入网关到目标网关的单向的通话连接13,接入网关可以通过该单向的通话连接13向目标网关发送音频流,但是目标网关不通过该单向的通话连接13向接入网关发送音频流。示例的,重协商确认3是SIP信令或私有协议的信令,重协商确认3与重协商确认2可以是同一条信令。
S710.目标网关根据重协商请求3确定第二终端处于被呼叫状态。
目标网关可以根据重协商请求3携带的指定信令信息确定第二终端处于被呼叫状态。例如,重协商请求3携带的指定信令信息为“a=sendonly”,该指定信令信息表示在第二终端与第一终端之间的通话连接处于呼叫保持状态时通过语音提示第一终端,该指定信令信息用于指示第二终端与第一终端之间的通话连接处于呼叫保持状态,从而指示第二终端处于被呼叫状态,目标网关根据该指定信令信息确定第二终端处于被呼叫状态。
S711.目标网关向会议管理设备发送状态信息1,状态信息1指示第二终端处于被呼叫状态。
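The target gateway may send state information 1 to the conference management device by using SIP signaling or proprietary-protocol signaling. State information 1 may include the identifier of the second terminal and a called identifier, to indicate that the second terminal is in the called state.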
S712.会议管理设备控制第一终端显示第二终端处于被呼叫状态。
会议管理设备可以根据状态信息1确定第二终端处于被呼叫状态,之后会议管理设备可以向第一终端发送控制指示信息以指示第二终端处于被呼叫状态。第一终端可以根据该控制指示信息在第一终端的会议界面中显示第二终端处于被呼叫状态的信息或标识。
S713.电话交换系统向接入网关发送音频流1。
第二终端处于被呼叫状态时,电话交换系统生成被呼叫提示音,并根据该被呼叫提示音向接入网关发送音频流1。其中,被呼叫提示音用于提示第二终端处于被呼叫状态。
S714.接入网关向目标网关转发音频流1。
S715.目标网关对音频流1进行去噪处理,以去除音频流1中第二终端被呼叫的噪音。
示例的,目标网关拦截音频流1,或者,目标网关将音频流1的数据包替换为静音包,并向第一终端发送静音包,或者,目标网关在音频流1的数据包中添加第二标识后向第一终端发送音频流1的数据包,其中,第二标识用于指示第一终端不播放音频流1。目标网关通过这些手段,可以避免音频流1到达第一终端,或者即使音频流1到达第一终端,可以避免第一终端播放音频流1,因此可以避免音频流1对第一终端产生干扰。
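Steps S701 to S709 above describe how the second terminal enters the called state, and steps S713 to S715 describe how audio streams related to the second terminal are processed after the second terminal enters the called state.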
在目前呼叫保持流程中,用于电话终端(例如第二终端)呼叫保持协商的重协商信令在接入网关终结,也即,接入网关接收到用于电话终端呼叫保持协商的重协商信令之后,接入网关不向媒体网关(例如目标网关)发送重协商信令,因此,电话终端的被呼叫状态不会传递到媒体网关,导致媒体网关无法感知电话终端的被呼叫状态,从而在电话终端处于被呼叫状态时,媒体网关仍然转发该电话终端的被呼叫提示音,导致对多媒体会议系统中的其他终端产生干扰。本申请实施例中,接入网关接收到用于电话终端呼叫保持协商的重协商信令之后,接入网关向媒体网关发送重协商信令以与媒体网关进行协商,使得媒体网关能够感知电话终端的被呼叫状态,从而在电话终端处于被呼叫状态时,媒体网关对发往其他终端的与该电话终端相关的音频流进行去噪处理,以去除该音频流中该电话终端的被呼叫提示音,避免该电话终端的被呼叫提示音对多媒体会议系统中的其他终端产生干扰,达到精准抑制不必要的干扰音频的效果。
请参考图8,其示出了本申请实施例提供的另一种第二终端处于被呼叫状态的流程图。图8主要介绍第二终端进入被呼叫状态的流程以及第二终端进入被呼叫状态之后第二终端相关的音频流的处理流程,且图8以目标网关根据音频流的特征参数确定第二终端处于被呼叫状态为例说明。如图8所示,第二终端处于被呼叫状态的流程包括如下步骤S801至S817。
S801.第二终端向电话交换系统发送重协商请求1。
S802.电话交换系统根据重协商请求1向接入网关发送重协商请求2。
S801至S802的实现过程可以参考S701至702的实现过程,这里不再赘述。
S803.接入网关向电话交换系统发送对应于重协商请求2的重协商响应2。
S804.电话交换系统向第二终端发送对应于重协商请求1的重协商响应1。
S805.第二终端向电话交换系统发送对应于重协商响应1的重协商确认1。
第二终端向电话交换系统发送重协商确认1之后,第二终端与电话交换系统之间的双向的通话连接11调整为从第二终端到电话交换系统的单向的通话连接11。
S806.电话交换系统向接入网关发送对应于重协商响应2的重协商确认2。
电话交换系统向接入网关发送重协商确认2之后,电话交换系统与接入网关之间的双向的通话连接12调整为从电话交换系统到接入网关的单向的通话连接12。
S803至S806的实现过程可以参考S705至708的实现过程,这里不再赘述。
S807.会议管理设备向目标网关发送检测指示信息,该检测指示信息用于指示对第二终端相关的音频流进行检测。
可选的,该检测指示信息包括第二终端的标识和检测标识,以指示目标网关对第二终 端相关的音频流进行检测。其中,第二终端的标识可以是第二终端的电话号码或者用于指示第二终端的其他标识信息,本申请实施例对此不做限定。
示例的,会议管理设备通过SIP信令或私有协议的信令向目标网关发送检测指示信息。
S808.电话交换系统向接入网关发送音频流1。
第二终端处于被呼叫状态时,电话交换系统生成被呼叫提示音,并根据该被呼叫提示音向接入网关发送音频流1。其中,被呼叫提示音用于提示第二终端处于被呼叫状态。
S809.接入网关向目标网关转发音频流1。
S810.目标网关将音频流1解码成PCM文件。
目标网关接收到音频流1之后,目标网关确定音频流1与第二终端相关。由于在S807中,会议管理设备指示目标网关对第二终端相关的音频流进行检测,因此,目标网关确定需要对第二终端相关的音频流1进行检测,目标网关将音频流1解码成PCM文件。
S811.目标网关向音频识别设备发送音频流1的PCM文件。
S812.音频识别设备根据音频流1的PCM文件进行音频识别,得到音频流1的音频特征。
S813.音频识别设备向目标网关发送音频流1的音频特征。
S814.目标网关根据音频流1的音频特征,确定第二终端处于被呼叫状态。
示例的,目标网关比较音频流1包含的语音片段与指定语音片段,根据比较结果确定第二终端处于被呼叫状态。例如,目标网关比较音频流1包含的语音片段与指定语音片段,以判断音频流1包含的语音片段是否包括指定语音片段,如果音频流1包含的语音片段包括指定语音片段,目标网关确定第二终端处于被呼叫状态,如果音频流1包含的语音片段不包括指定语音片段,目标网关确定第二终端未处于被呼叫状态。或者,目标网关比较音频流1包含的语音片段与指定语音片段,以判断音频流1包含的语音片段与指定语音片段的相似度是否大于相似度阈值,如果音频流1包含的语音片段与指定语音片段的相似度大于相似度阈值,目标网关确定第二终端处于被呼叫状态,如果音频流1包含的语音片段与指定语音片段的相似度不大于相似度阈值,目标网关确定第二终端未处于被呼叫状态。
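S815. The target gateway sends status information 1 to the conference management device. Status information 1 indicates that the second terminal is in the call hold state.
S816. The conference management device controls the first terminal to display that the second terminal is in the call hold state.
For the implementation of S815 and S816, refer to S711 and S712; details are not repeated here.
S817. The target gateway denoises audio stream 1 to remove the noise of the second terminal being called.
For the implementation of S817, refer to S715; details are not repeated here.
Steps S801 to S806 above describe how the second terminal enters the called state, and steps S808 to S817 describe how audio streams related to the second terminal are processed after the second terminal enters the called state.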
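The decoding in S810 can be sketched as follows. The description does not name the codec of audio stream 1, so this sketch assumes G.711 μ-law (PCMU) purely for illustration; the function names are hypothetical, and a real gateway would select the decoder according to the negotiated payload type.

```python
import struct


def ulaw_to_pcm16(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear PCM sample."""
    u = ~byte & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    # 0x84 (132) is the bias added during mu-law compression.
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def decode_ulaw(payload: bytes) -> bytes:
    """Decode a mu-law payload into little-endian 16-bit PCM bytes."""
    samples = [ulaw_to_pcm16(b) for b in payload]
    return struct.pack(f"<{len(samples)}h", *samples)
```

The resulting bytes could be written to a raw PCM file and handed to the audio recognition device as in S811.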
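In the existing call hold procedure, the media gateway (for example, the target gateway) cannot perceive the called state of a telephone terminal, so it keeps forwarding the terminal's called prompt tone while the terminal is in the called state, which interferes with the other terminals in the multimedia conference system. In the embodiments of this application, the media gateway can inspect the audio streams related to the telephone terminal to determine that the terminal is in the called state. While the terminal is in the called state, the media gateway denoises the audio streams related to that terminal that are destined for other terminals, removing the called prompt tone from them. This prevents the prompt tone from interfering with the other terminals in the multimedia conference system and suppresses unwanted interfering audio precisely.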
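The similarity-threshold variant of this detection (S814) can be sketched as follows. The description does not say how similarity is computed, so cosine similarity over raw PCM samples, the sliding-window search, and the default threshold of 0.9 are all assumptions made for illustration; all function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length sample windows (0.0 if either is silent)."""
    if len(a) != len(b):
        raise ValueError("windows must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_similarity(stream: Sequence[float], reference: Sequence[float]) -> float:
    """Slide the designated segment over the stream and return the best match."""
    n = len(reference)
    if n == 0 or len(stream) < n:
        return 0.0
    return max(
        cosine_similarity(stream[i:i + n], reference)
        for i in range(len(stream) - n + 1)
    )


def is_called_state(stream: Sequence[float], reference: Sequence[float],
                    threshold: float = 0.9) -> bool:
    """Called state is assumed when the similarity is greater than the threshold."""
    return best_similarity(stream, reference) > threshold
```

A production system would more likely compare spectral features than raw samples; the sketch only shows the threshold decision itself.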
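Refer to FIG. 9, which shows a flowchart of a procedure in which the second terminal cancels the called state according to an embodiment of this application. FIG. 9 mainly describes how the second terminal cancels the called state and how audio streams related to the second terminal are processed afterwards. As shown in FIG. 9, the procedure includes the following steps S901 to S917.
S901. The second terminal sends renegotiation request 4 to the telephone switching system.
After the second terminal joins the multimedia conference system, call connection 1 is established between the second terminal and the first terminal in the multimedia conference system. While call connection 1 is active, if the second terminal establishes call connection 2 with another terminal in order to carry out another telephone service, the second terminal is in the called state and call connection 1 between the second terminal and the first terminal is placed on hold. If the second terminal ends the other telephone service, call connection 2 between the second terminal and the third terminal is disconnected or placed on hold, and the second terminal can cancel the called state. Call connection 1 between the second terminal and the first terminal can then be reactivated (in other words, the call hold state of call connection 1 is cancelled). For example, when cancelling the called state, the second terminal sends renegotiation request 4 to the telephone switching system to negotiate cancelling the call hold state of call connection 1 between the second terminal and the first terminal. Renegotiation request 4 may include the identifier of the second terminal and does not include the designated signaling information; for example, the designated signaling information is "a=sendonly".
S902. The telephone switching system sends renegotiation request 5 to the access gateway based on renegotiation request 4.
After receiving renegotiation request 4, the telephone switching system determines from it that the call hold state of call connection 1 between the second terminal and the first terminal is to be cancelled, and sends renegotiation request 5 to the access gateway to negotiate the cancellation with the access gateway. Renegotiation request 5 may include the identifier of the second terminal and does not include the designated signaling information; for example, the designated signaling information is "a=sendonly". For example, renegotiation request 5 is SIP signaling or signaling of a proprietary protocol, and renegotiation request 5 and renegotiation request 4 may be the same signaling message.
S903. The access gateway sends renegotiation request 6 to the target gateway based on renegotiation request 5.
After receiving renegotiation request 5, the access gateway determines from it that the call hold state of call connection 1 between the second terminal and the first terminal is to be cancelled, and sends renegotiation request 6 to the target gateway to negotiate the cancellation with the target gateway. Renegotiation request 6 may include the identifier of the second terminal and does not include the designated signaling information; for example, the designated signaling information is "a=sendonly". For example, renegotiation request 6 is SIP signaling or signaling of a proprietary protocol, and renegotiation request 6 and renegotiation request 5 may be the same signaling message.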
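S904. The target gateway sends renegotiation response 6, corresponding to renegotiation request 6, to the access gateway.
After receiving renegotiation request 6, the target gateway determines from it that the call hold state of call connection 1 between the second terminal and the first terminal is to be cancelled, and therefore that the second terminal is cancelling the called state. The target gateway then sends renegotiation response 6, corresponding to renegotiation request 6, to the access gateway. Renegotiation response 6 may include the identifier of the second terminal and does not include "a=recvonly".
S905. The access gateway sends renegotiation response 5, corresponding to renegotiation request 5, to the telephone switching system.
After receiving renegotiation response 6, the access gateway sends renegotiation response 5, corresponding to renegotiation request 5, to the telephone switching system based on renegotiation response 6. Renegotiation response 5 may include the identifier of the second terminal and does not include "a=recvonly". For example, renegotiation response 5 is SIP signaling or signaling of a proprietary protocol, and renegotiation response 6 and renegotiation response 5 may be the same signaling message.
S906. The telephone switching system sends renegotiation response 4, corresponding to renegotiation request 4, to the second terminal.
After receiving renegotiation response 5, the telephone switching system sends renegotiation response 4, corresponding to renegotiation request 4, to the second terminal based on renegotiation response 5. Renegotiation response 4 may include the identifier of the second terminal and does not include "a=recvonly". For example, renegotiation response 4 is SIP signaling or signaling of a proprietary protocol, and renegotiation response 5 and renegotiation response 4 may be the same signaling message.
S907. The second terminal sends renegotiation acknowledgment 4, corresponding to renegotiation response 4, to the telephone switching system.
After receiving renegotiation response 4, the second terminal may send renegotiation acknowledgment 4, corresponding to renegotiation response 4, to the telephone switching system to inform it that the second terminal has received renegotiation response 4. Renegotiation acknowledgment 4 may be SIP signaling or signaling of a proprietary protocol. After the second terminal sends renegotiation acknowledgment 4 to the telephone switching system, the unidirectional call connection 11 between the second terminal and the telephone switching system becomes a bidirectional call connection 11.
S908. The telephone switching system sends renegotiation acknowledgment 5, corresponding to renegotiation response 5, to the access gateway.
After receiving renegotiation acknowledgment 4, the telephone switching system may send renegotiation acknowledgment 5, corresponding to renegotiation response 5, to the access gateway based on renegotiation acknowledgment 4, to inform the access gateway that the telephone switching system has received renegotiation response 5. After the telephone switching system sends renegotiation acknowledgment 5 to the access gateway, the unidirectional call connection 12 between the telephone switching system and the access gateway becomes a bidirectional call connection 12. For example, renegotiation acknowledgment 5 is SIP signaling or signaling of a proprietary protocol, and renegotiation acknowledgment 5 and renegotiation acknowledgment 4 may be the same signaling message.
S909. The access gateway sends renegotiation acknowledgment 6, corresponding to renegotiation response 6, to the target gateway.
After receiving renegotiation acknowledgment 5, the access gateway may send renegotiation acknowledgment 6, corresponding to renegotiation response 6, to the target gateway based on renegotiation acknowledgment 5, to inform the target gateway that the access gateway has received renegotiation response 6. After the access gateway sends renegotiation acknowledgment 6 to the target gateway, the unidirectional call connection 13 between the access gateway and the target gateway becomes a bidirectional call connection 13. For example, renegotiation acknowledgment 6 is SIP signaling or signaling of a proprietary protocol, and renegotiation acknowledgment 6 and renegotiation acknowledgment 5 may be the same signaling message.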
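S910. The target gateway determines, based on renegotiation request 6, that the second terminal has cancelled the called state.
The target gateway may determine that the second terminal has cancelled the called state because renegotiation request 6 does not carry the designated signaling information, for example because it does not carry "a=sendonly".
S911. The target gateway sends status information 2 to the conference management device. Status information 2 indicates that the second terminal has cancelled the called state.
The target gateway may send status information 2 to the conference management device through SIP signaling or signaling of a proprietary protocol. Status information 2 includes the identifier of the second terminal and does not include a called flag, thereby indicating that the second terminal has cancelled the called state.
S912. The conference management device controls the first terminal to display that the second terminal is not in the called state.
The conference management device may determine from status information 2 that the second terminal has cancelled the called state, and then send control indication information to the first terminal to indicate that the second terminal is not in the called state. Based on the control indication information, the first terminal may display, in its conference interface, an identifier or information showing that the second terminal is not in the called state.
S913. The second terminal sends audio stream 2 to the telephone switching system.
After cancelling the called state, the second terminal may capture an audio stream (referred to here as audio stream 2) and send audio stream 2 to the telephone switching system.
S914. The telephone switching system sends audio stream 2 to the access gateway.
S915. The access gateway forwards audio stream 2 to the target gateway.
S916. The target gateway forwards audio stream 2 to the first terminal.
Because the second terminal has cancelled the called state, audio stream 2 does not contain the noise of the second terminal being in the called state and does not interfere with the first terminal, so the target gateway forwards audio stream 2 to the first terminal.
S917. The first terminal plays audio stream 2.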
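The determination in S910 can be sketched as a check on the attribute lines of the session description carried by the renegotiation request. This is only an illustration under the assumption that the request carries an SDP body and that "a=sendonly" is the designated signaling information; the function names and the sample SDP bodies are hypothetical.

```python
def sdp_attributes(sdp: str) -> set[str]:
    """Return all attribute lines (a=...) found in an SDP body."""
    return {
        line.strip()
        for line in sdp.splitlines()
        if line.strip().startswith("a=")
    }


def indicates_called_state(sdp: str, marker: str = "a=sendonly") -> bool:
    """True if the designated signaling information is present in the SDP body."""
    return marker in sdp_attributes(sdp)


# Hypothetical SDP bodies for a hold request and a hold-cancel request.
HOLD_SDP = "v=0\r\nm=audio 49170 RTP/AVP 0\r\na=rtpmap:0 PCMU/8000\r\na=sendonly\r\n"
RESUME_SDP = "v=0\r\nm=audio 49170 RTP/AVP 0\r\na=rtpmap:0 PCMU/8000\r\na=sendrecv\r\n"
```

Under these assumptions, `indicates_called_state(RESUME_SDP)` is false, which corresponds to the target gateway concluding that the called state has been cancelled and resuming normal forwarding as in S916.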
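In the embodiments of this application, audio stream 1 and the aforementioned audio stream A may be the same audio stream, and audio stream 2 and the aforementioned audio stream B may be the same audio stream. Optionally, audio stream 1 and audio stream A may also be different audio streams, and audio stream 2 and audio stream B may also be different audio streams. This is not limited in the embodiments of this application.
The foregoing describes the method embodiments of this application. The following describes the apparatus embodiments. The apparatus of this application can be used to perform the method of this application. For details not disclosed in the apparatus embodiments, refer to the method embodiments.
Refer to FIG. 10, which shows a schematic structural diagram of a control apparatus 1000 for a multimedia conference according to an embodiment of this application. The control apparatus 1000 may be the target gateway or a functional component in the target gateway, and the target gateway may be a media gateway. As shown in FIG. 10, the control apparatus 1000 includes a receiving module 1010 and a processing module 1020.
The receiving module 1010 is configured to receive, through the telephone switching system, an audio stream destined for the first terminal. The processing module 1020 is configured to denoise the audio stream, where the denoising removes the noise of the second terminal being called from the audio stream. For the functions of the receiving module 1010, refer to the implementation of S401 above; for the functions of the processing module 1020, refer to the implementation of S402 above.
Optionally, the processing module 1020 is further configured to determine, based on a characteristic parameter of the audio stream, that the second terminal is in the called state. For this function of the processing module 1020, refer also to the implementation of S403a above.
Optionally, the characteristic parameter includes at least one of the following: an audio feature of the audio stream, and a data feature of the data packets of the audio stream.
Optionally, the characteristic parameter includes the audio feature of the audio stream, and the audio feature includes the speech segment contained in the audio stream. The processing module 1020 is configured to compare the speech segment contained in the audio stream with a designated speech segment to determine that the second terminal is in the called state, where the designated speech segment describes the second terminal being in the called state.
Optionally, still referring to FIG. 10, the control apparatus 1000 further includes a sending module 1030, configured to send the audio stream to the audio recognition device, where the audio recognition device performs audio recognition on the audio stream to obtain its audio feature. Correspondingly, the receiving module 1010 is further configured to receive the audio feature of the audio stream sent by the audio recognition device. For the functions of the sending module 1030 and the receiving module 1010, refer to the related description in S403a above.
Optionally, the characteristic parameter includes the data feature of the data packets of the audio stream. The processing module 1020 is configured to determine that the second terminal is in the called state based on the data packets of the audio stream containing a first identifier, where the first identifier indicates that the second terminal is in the called state.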
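As a rough illustration of the packet-based check, the sketch below treats the first identifier as a single flag bit in the first byte of a packet. The description does not specify where or how the first identifier is carried, so the flag position, the constant, and the function name are purely hypothetical.

```python
CALLED_FLAG = 0x01  # hypothetical bit standing in for the "first identifier"


def packet_indicates_called(packet: bytes) -> bool:
    """Return True if the first byte of the packet has the hypothetical flag set."""
    return len(packet) > 0 and bool(packet[0] & CALLED_FLAG)
```

A real implementation would read the identifier from whatever field the telephone switching system or access gateway actually uses to mark the packets.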
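Optionally, the receiving module 1010 is further configured to receive a target signaling message, and the processing module 1020 is further configured to determine that the second terminal is in the called state based on the target signaling message containing designated signaling information, where the designated signaling information indicates that the second terminal is in the called state. For this function of the receiving module 1010, refer also to the related description in S403b above; for this function of the processing module 1020, refer also to the related description in S404b above.
Optionally, the denoising includes intercepting the audio stream. That is, the target gateway does not forward the audio stream. By intercepting the audio stream, the target gateway denoises it: the audio stream never reaches the first terminal, so the first terminal does not play it, and the noise of the second terminal being called does not affect the first terminal.
Optionally, the denoising includes replacing the data packets of the audio stream with silence packets and sending the silence packets to the first terminal. By replacing the data packets with silence packets and sending those to the first terminal, the target gateway denoises the audio stream: the original audio stream never reaches the first terminal, so the first terminal does not play it, and the noise of the second terminal being called does not affect the first terminal.
Optionally, the denoising includes adding a second identifier to the data packets of the audio stream and sending the data packets to the first terminal, where the second identifier instructs the first terminal not to play the audio stream. The target gateway adds the second identifier to the data packets and sends the marked packets to the first terminal. After receiving the audio stream, the first terminal does not play it, so the noise of the second terminal being called does not affect the first terminal, and the audio stream is thereby denoised.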
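The three denoising options above can be sketched side by side. This is a minimal illustration, not the claimed implementation: `AudioPacket`, `DenoiseMode`, the `do_not_play` field standing in for the second identifier, and the use of the G.711 μ-law silence byte 0xFF are all assumptions.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Optional


class DenoiseMode(Enum):
    INTERCEPT = "intercept"  # do not forward the packet at all
    SILENCE = "silence"      # forward a silence packet instead
    MARK = "mark"            # forward the packet with a "do not play" mark


@dataclass(frozen=True)
class AudioPacket:
    payload: bytes
    do_not_play: bool = False  # hypothetical stand-in for the second identifier


ULAW_SILENCE = 0xFF  # assumption: G.711 mu-law payload, where 0xFF decodes to 0


def denoise(packet: AudioPacket, mode: DenoiseMode) -> Optional[AudioPacket]:
    """Return the packet to send to the first terminal, or None to send nothing."""
    if mode is DenoiseMode.INTERCEPT:
        return None
    if mode is DenoiseMode.SILENCE:
        silent = bytes([ULAW_SILENCE]) * len(packet.payload)
        return replace(packet, payload=silent)
    return replace(packet, do_not_play=True)
```

Interception saves bandwidth, silence packets keep the media path alive with an unchanged packet cadence, and marking leaves the decision to the first terminal; which trade-off applies in a given deployment is not stated in the description.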
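In summary, in the control apparatus for a multimedia conference provided in the embodiments of this application, after the receiving module receives, through the telephone switching system, an audio stream destined for the first terminal, the processing module denoises the audio stream to remove the noise of the second terminal being called. This prevents that noise from interfering with the first terminal and therefore from disrupting the multimedia conference.
The control apparatus for a multimedia conference provided in the embodiments of this application may also be implemented with an application-specific integrated circuit (ASIC) or a programmable logic device (PLD). The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The method for controlling a multimedia conference provided in the foregoing method embodiments may also be implemented in software; in that case, the modules in the control apparatus may also be software modules.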
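Refer to FIG. 11, which shows a schematic structural diagram of another control apparatus 1100 for a multimedia conference according to an embodiment of this application. The control apparatus 1100 may be the target gateway or a functional component in the target gateway, and the target gateway may be a media gateway. As shown in FIG. 11, the control apparatus 1100 includes a processor 1102, a memory 1104, a communication interface 1106, and a bus 1108, and the processor 1102, the memory 1104, and the communication interface 1106 are communicatively connected to one another through the bus 1108. The connection shown in FIG. 11 is merely an example; the processor 1102, the memory 1104, and the communication interface 1106 may also be communicatively connected in ways other than the bus 1108.
The memory 1104 may be used to store a computer program 11042, which may include instructions and data. In the embodiments of this application, the memory 1104 may be any of various types of storage media, such as random access memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), flash memory, optical storage, and registers. The memory 1104 may include a hard disk and/or main memory.
The processor 1102 may be a general-purpose processor, that is, a processor that performs specific steps and/or operations by reading and executing a computer program (for example, the computer program 11042) stored in a memory (for example, the memory 1104). While performing those steps and/or operations, the general-purpose processor may use data stored in the memory (for example, the memory 1104). The stored computer program may, for example, be executed to implement the functions of the processing module 1020 described above. The general-purpose processor may be, for example but not limited to, a central processing unit (CPU). The processor 1102 may also be a special-purpose processor, that is, a processor specifically designed to perform specific steps and/or operations, such as, but not limited to, a digital signal processor (DSP), an ASIC, or an FPGA. The processor 1102 may also be a combination of multiple processors, such as a multi-core processor. The processor 1102 may include at least one circuit to perform all or some of the steps of the method for controlling a multimedia conference provided in the foregoing embodiments.
The communication interface 1106 may include interfaces for interconnecting components inside the control apparatus 1100, such as input/output (I/O) interfaces, physical interfaces, and logical interfaces, as well as interfaces for interconnecting the control apparatus 1100 with other devices (for example, terminal devices, servers, and gateways). A physical interface may be a gigabit Ethernet (GE) interface, which can be used to interconnect the control apparatus 1100 with other devices. A logical interface is an interface inside the control apparatus 1100, which can be used to interconnect components inside the control apparatus 1100. The communication interface 1106 can be used by the control apparatus 1100 to communicate with other devices, for example to send and receive signaling and audio streams between the control apparatus 1100 and other devices. The communication interface 1106 can implement the functions of the receiving module 1010 and the sending module 1030 described above.
The bus 1108 may be any type of communication bus used to interconnect the processor 1102, the memory 1104, and the communication interface 1106, such as a system bus.
The foregoing components may be placed on separate chips, or at least some or all of them may be placed on the same chip. Whether the components are placed on different chips or integrated on one or more chips usually depends on product design requirements. The embodiments of this application do not limit the specific implementation form of these components.
The control apparatus 1100 shown in FIG. 11 is merely an example. In an actual implementation, the control apparatus 1100 may include other components, which are not listed here. The control apparatus 1100 shown in FIG. 11 can control a multimedia conference by performing all or some of the steps of the method for controlling a multimedia conference provided in the foregoing embodiments.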
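An embodiment of this application provides a communication system that includes a target gateway, a first terminal, and a second terminal. The first terminal is communicatively connected to the target gateway, and the second terminal is communicatively connected to the target gateway through a telephone switching system. The target gateway may include the control apparatus for a multimedia conference shown in FIG. 10 or FIG. 11. Optionally, the communication system is as shown in any one of FIG. 1 to FIG. 3; for the communication system shown in FIG. 2 or FIG. 3, the target gateway may be a media gateway.
An embodiment of this application provides a computer-readable storage medium that stores a computer program. When the computer program is executed (for example, by the target gateway or by one or more processors), all or some of the steps of the methods provided in the foregoing method embodiments are implemented.
An embodiment of this application provides a computer program product that includes a program or code. When the program or code is executed (for example, by the target gateway or by one or more processors), all or some of the steps of the methods provided in the foregoing method embodiments are implemented.
An embodiment of this application provides a chip that includes programmable logic circuits and/or program instructions. When running, the chip implements all or some of the steps of the methods provided in the foregoing method embodiments.
The foregoing embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product that includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (for example, coaxial cable, optical fiber, or digital subscriber line) or a wireless manner (for example, infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage apparatus such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium, or a semiconductor medium (for example, a solid-state drive).
It should be understood that in this application the term "at least one" means one or more, the term "a plurality of" means two or more, and the term "at least two" means two or more. Unless otherwise specified, the symbol "/" means "or"; for example, A/B means A or B. The term "and/or" merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean A alone, both A and B, or B alone. In addition, for clarity of description, words such as "first", "second", and "third" are used in this application to distinguish identical or similar items whose functions and effects are essentially the same. A person skilled in the art will understand that these words do not limit quantity or order of execution.
The different types of embodiments provided in this application, such as the method embodiments and the apparatus embodiments, may be cross-referenced. The order of the operations in the method embodiments can be adjusted appropriately, and operations can be added or removed as the situation requires. Any variation that a person skilled in the art could readily conceive within the technical scope disclosed in this application falls within the protection scope of this application and is therefore not described further.
In the corresponding embodiments provided in this application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules is only a division by logical function, and other divisions are possible in an actual implementation: multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed.
Modules described as separate components may or may not be physically separate, and components described as modules may or may not be physical modules. They may be located in one place or distributed across multiple devices (for example, terminal devices and gateways). Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of an embodiment.
The foregoing are merely exemplary implementations of this application, but the protection scope of this application is not limited to them. Any equivalent modification or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in this application falls within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.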

Claims (24)

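  1. A method for controlling a multimedia conference, characterized in that the method comprises:
    receiving, by a target gateway through a telephone switching system, an audio stream destined for a first terminal; and
    denoising, by the target gateway, the audio stream, wherein the denoising removes noise of a second terminal being called from the audio stream.
  2. The method according to claim 1, characterized in that the method further comprises:
    determining, by the target gateway based on a characteristic parameter of the audio stream, that the second terminal is in a called state.
  3. The method according to claim 2, characterized in that the characteristic parameter comprises at least one of the following:
    an audio feature of the audio stream, and a data feature of data packets of the audio stream.
  4. The method according to claim 3, characterized in that the characteristic parameter comprises the audio feature of the audio stream, and the audio feature comprises a speech segment contained in the audio stream,
    wherein the determining, by the target gateway based on the characteristic parameter of the audio stream, that the second terminal is in the called state comprises:
    comparing, by the target gateway, the speech segment contained in the audio stream with a designated speech segment to determine that the second terminal is in the called state, wherein the designated speech segment describes the second terminal being in the called state.
  5. The method according to claim 3 or 4, characterized in that the method further comprises:
    sending, by the target gateway, the audio stream to an audio recognition device, wherein the audio recognition device is configured to perform audio recognition on the audio stream to obtain the audio feature of the audio stream; and
    receiving, by the target gateway, the audio feature of the audio stream sent by the audio recognition device.
  6. The method according to claim 3, characterized in that the characteristic parameter comprises the data feature of the data packets of the audio stream,
    wherein the determining, by the target gateway based on the characteristic parameter of the audio stream, that the second terminal is in the called state comprises:
    determining, by the target gateway based on the data packets of the audio stream containing a first identifier, that the second terminal is in the called state, wherein the first identifier indicates that the second terminal is in the called state.
  7. The method according to claim 1, characterized in that the method further comprises:
    receiving, by the target gateway, a target signaling message; and
    determining, by the target gateway based on the target signaling message containing designated signaling information, that the second terminal is in a called state, wherein the designated signaling information indicates that the second terminal is in the called state.
  8. The method according to any one of claims 1 to 7, characterized in that the denoising comprises:
    intercepting the audio stream.
  9. The method according to any one of claims 1 to 7, characterized in that the denoising comprises:
    replacing data packets of the audio stream with silence packets, and sending the silence packets to the first terminal.
  10. The method according to any one of claims 1 to 7, characterized in that the denoising comprises:
    adding a second identifier to data packets of the audio stream, and sending the data packets of the audio stream to the first terminal, wherein the second identifier instructs the first terminal not to play the audio stream.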
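  11. A control apparatus for a multimedia conference, characterized in that the apparatus is applied to a target gateway and comprises:
    a receiving module, configured to receive, through a telephone switching system, an audio stream destined for a first terminal; and
    a processing module, configured to denoise the audio stream, wherein the denoising removes noise of a second terminal being called from the audio stream.
  12. The apparatus according to claim 11, characterized in that
    the processing module is further configured to determine, based on a characteristic parameter of the audio stream, that the second terminal is in a called state.
  13. The apparatus according to claim 12, characterized in that the characteristic parameter comprises at least one of the following:
    an audio feature of the audio stream, and a data feature of data packets of the audio stream.
  14. The apparatus according to claim 13, characterized in that the characteristic parameter comprises the audio feature of the audio stream, and the audio feature comprises a speech segment contained in the audio stream,
    wherein the processing module is configured to compare the speech segment contained in the audio stream with a designated speech segment to determine that the second terminal is in the called state, and the designated speech segment describes the second terminal being in the called state.
  15. The apparatus according to claim 13 or 14, characterized in that
    the apparatus further comprises a sending module, configured to send the audio stream to an audio recognition device, wherein the audio recognition device is configured to perform audio recognition on the audio stream to obtain the audio feature of the audio stream; and
    the receiving module is further configured to receive the audio feature of the audio stream sent by the audio recognition device.
  16. The apparatus according to claim 13, characterized in that the characteristic parameter comprises the data feature of the data packets of the audio stream,
    wherein the processing module is configured to determine, based on the data packets of the audio stream containing a first identifier, that the second terminal is in the called state, and the first identifier indicates that the second terminal is in the called state.
  17. The apparatus according to claim 11, characterized in that
    the receiving module is further configured to receive a target signaling message; and
    the processing module is further configured to determine, based on the target signaling message containing designated signaling information, that the second terminal is in a called state, wherein the designated signaling information indicates that the second terminal is in the called state.
  18. The apparatus according to any one of claims 11 to 17, characterized in that the denoising comprises:
    intercepting the audio stream.
  19. The apparatus according to any one of claims 11 to 17, characterized in that the denoising comprises:
    replacing data packets of the audio stream with silence packets, and sending the silence packets to the first terminal.
  20. The apparatus according to any one of claims 11 to 17, characterized in that the denoising comprises:
    adding a second identifier to data packets of the audio stream, and sending the data packets of the audio stream to the first terminal, wherein the second identifier instructs the first terminal not to play the audio stream.
  21. A control apparatus for a multimedia conference, characterized by comprising a memory and a processor, wherein
    the memory is configured to store a computer program; and
    the processor is configured to execute the computer program stored in the memory so that the control apparatus performs the method according to any one of claims 1 to 10.
  22. A communication system, characterized in that the communication system comprises a target gateway, a first terminal, and a second terminal, wherein the first terminal is communicatively connected to the target gateway, the second terminal is communicatively connected to the target gateway through a telephone switching system, and the target gateway comprises the control apparatus for a multimedia conference according to any one of claims 11 to 21.
  23. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, and when the computer program is executed, the method according to any one of claims 1 to 10 is implemented.
  24. A computer program product, characterized in that the computer program product comprises a program or code, and when the program or code is executed, the method according to any one of claims 1 to 10 is implemented.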
PCT/CN2022/131215 2022-04-14 2022-11-10 Method and apparatus for controlling a multimedia conference, and communication system WO2023197593A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210395237.9 2022-04-14
CN202210395237 2022-04-14
CN202210666906.1 2022-06-13
CN202210666906.1A CN116962364A (zh) 2022-04-14 2022-06-13 Method and apparatus for controlling a multimedia conference, and communication system

Publications (1)

Publication Number Publication Date
WO2023197593A1 true WO2023197593A1 (zh) 2023-10-19

Family

ID=88328858

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/131215 WO2023197593A1 (zh) 2022-04-14 2022-11-10 Method and apparatus for controlling a multimedia conference, and communication system

Country Status (1)

Country Link
WO (1) WO2023197593A1 (zh)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22937225

Country of ref document: EP

Kind code of ref document: A1