CN108632476B

CN108632476B - PSTN-fused mobile internet voice platform system and communication method thereof

Info

Publication number: CN108632476B
Application number: CN201810383023.3A
Authority: CN
Inventors: 周平; 胡海; 田维忠; 向泽清; 蔡君
Original assignee: Guiyang Longmaster Information and Technology Co ltd
Current assignee: Guiyang Longmaster Information and Technology Co ltd
Priority date: 2018-04-26
Filing date: 2018-04-26
Publication date: 2021-10-15
Anticipated expiration: 2038-04-26
Also published as: CN108632476A

Abstract

The invention discloses a mobile Internet voice platform system fused with the PSTN. The system comprises a mobile Internet voice platform and at least one PSTN access end system, each connected to the mobile Internet voice platform. Each PSTN access end system comprises: a channel access server (CES), responsible for upper-level audio processing of the PSTN access end system and connected to both a channel server (CS) and a channel management server (CMS) of the mobile Internet voice platform; and a computer telecommunication integration server (CTIS), responsible for lower-level audio processing of the PSTN access end system and connected to the CES. With this technical scheme, real-time full-duplex voice communication is achieved when the Internet is unavailable and the user strongly needs a mobile Internet real-time voice communication product, and a channel is provided for the PSTN to access the mobile Internet voice platform.

Description

PSTN-fused mobile internet voice platform system and communication method thereof

Technical Field

The invention relates to the field of communication, in particular to a mobile internet voice platform system fusing a Public Switched Telephone Network (PSTN) and a communication method thereof.

Background

With the development of information technology and the arrival of the mobile internet era, mobile social products have steadily risen. Products common on the market today, such as microblogs, QQ, WeChat, Momo and YY, provide services for business scenarios such as media information publishing, instant messaging, making friends with strangers, and live audio and video broadcasting, while integrating social elements in diverse ways, so that mobile internet APPs offer rich product functions and a good user experience.

On social platform products, person-to-person communication is a basic element. Common communication functions include image-text instant messages, asynchronous voice messages, and real-time voice calls (full-duplex Voice over Internet Protocol, VOIP for short, calls). The image-text instant message is the most convenient and most common communication tool: text or multimedia data is transmitted over WiFi/GSM/CDMA and other networks through client application software and a server to realize image-text instant communication. The "asynchronous voice" technology is a network-based "push-to-talk" service: when a user presses the recording button, a recording module starts, and the speech audio signal is collected, encoded and compressed, and then transmitted through a network server to the other party's mobile phone; after the audio data is received, the recipient taps a play button and listens through the loudspeaker. This scheme carries out social communication through recorded short voice messages, frees both hands from typing, lowers the threshold of use, and improves convenience to a certain extent. Real-time voice communication is a rapidly developing VOIP technology. With the spread of broadband and 4G mobile networks, real-time voice applications have been widely developed and deployed on the mobile internet; continuous voice communication strengthens the sense of reality in a virtual social environment and is favored by users. Internet social products such as QQ, WeChat and YY offer one-to-one and many-to-many real-time calls among friends. These VOIP voice products, which come close to traditional telephone communication, give people a more comfortable communication experience, but also place higher requirements on the data transmission quality of the communication network.

For reasons of construction cost, convenience of use and other factors, mobile Internet real-time voice communication products on the market basically use Internet broadband WiFi and mobile 3G/4G networks as the bearer network, and use a server-side soft-switching system to perform jitter buffering, codec compression, audio mixing, and distribution of audio data, thereby implementing the transmission and switching subsystem for multi-party real-time voice communication. Broadband and 4G networks are now widespread. Although a 4G network can theoretically reach a bandwidth of 100 Mbps (about 12.5 MB per second), IP packets on the Internet pass through multiple routing nodes, and wireless signal factors such as interference and shielding at the terminal cause network problems. As a result, the network access quality of some mobile devices is still poor, and there are scenarios, such as Internet interruption or large transmission delay and jitter, in which real-time VOIP voice is not usable.

Therefore, the problem urgently to be solved is what platform architecture can meet the demand for real-time full-duplex voice communication when the Internet is unavailable and the user strongly needs a mobile Internet real-time voice communication product.

Disclosure of Invention

The main purpose of the invention is to disclose a mobile Internet voice platform system fusing the Public Switched Telephone Network (PSTN) and a communication method thereof, so as to at least solve the problem in the related art that, when the Internet is unavailable and a user strongly needs a mobile Internet real-time voice communication product, there is no platform architecture capable of meeting the demand for real-time full-duplex voice communication.

According to one aspect of the invention, a mobile internet voice platform system fusing Public Switched Telephone Network (PSTN) is provided.

The mobile Internet voice platform system fusing the PSTN comprises: a mobile Internet voice platform and at least one PSTN access end system, each connected to the mobile Internet voice platform. Each PSTN access end system includes: a channel access server (CES), responsible for upper-level audio processing of the PSTN access end system and connected to both a channel server (CS) and a channel management server (CMS) of the mobile Internet voice platform; and a computer telecommunication integration server (CTIS), responsible for lower-level audio processing of the PSTN access end system and connected to the CES.

According to another aspect of the invention, a communication method of the PSTN-converged mobile Internet voice platform system is provided.

The communication method of the mobile Internet voice platform system fusing the PSTN comprises the following steps: after a public switched telephone network (PSTN) user accesses a computer telecommunication integration server (CTIS), a request message is sent to a channel management server (CMS) through a channel access server (CES); the CMS creates a virtual user and requests to join the channel server (CS); the CMS acquires address port information from the CS and replies with the address port information to the CES; and the CES transmits the address port information to the CTIS.

Compared with the prior art, the embodiments of the invention have at least the following advantages: the public switched telephone network (PSTN) is connected to the mobile Internet voice platform to form a mobile Internet voice platform system fused with the PSTN. With this system, real-time full-duplex voice communication is achieved when the Internet is unavailable and users strongly need a mobile Internet real-time voice communication product, and a channel is provided for the PSTN to access the mobile Internet voice platform.

Drawings

Fig. 1 is a block diagram of a mobile internet voice platform system fusing a public switched telephone network PSTN according to an embodiment of the present invention;

fig. 2 is a block diagram of the structure of a PSTN access end system in accordance with a preferred embodiment of the present invention;

fig. 3 is a schematic diagram of the structure of CES and CTIS according to a preferred embodiment of the present invention;

fig. 4 is a block diagram of the architecture of a mobile internet voice platform according to a preferred embodiment of the present invention;

fig. 5 is a block diagram of a PSTN converged mobile internet voice platform system according to a preferred embodiment of the present invention;

fig. 6 is a flowchart of a communication method of a mobile internet voice platform system fusing a PSTN according to an embodiment of the present invention;

fig. 7 is a timing diagram of a CES node registration in accordance with a preferred embodiment of the present invention;

fig. 8 is a flowchart of room music URI change and audio mixing according to a preferred embodiment of the present invention.

Detailed Description

The following detailed description of specific embodiments of the present invention is provided in conjunction with the accompanying drawings.

Fig. 1 is a block diagram of a mobile internet voice platform system fusing a public switched telephone network PSTN according to an embodiment of the present invention. As shown in fig. 1, the mobile internet voice platform system includes: a mobile internet voice platform 10, and at least one PSTN access end system 20, each connected to the mobile internet voice platform 10. Each PSTN access end system 20 includes: a channel access server (CES) 200, responsible for upper-level audio processing of the PSTN access end system 20 and connected to both the channel server (CS) 100 and the channel management server (CMS) 102 of the mobile internet voice platform 10; and a computer telecommunication integration server (CTIS) 202, responsible for lower-level audio processing of the PSTN access end system 20 and connected to the CES 200.

In the mobile Internet voice platform system shown in fig. 1, the Public Switched Telephone Network (PSTN) is connected to the mobile Internet voice platform to form a mobile Internet voice platform system fused with the PSTN. With this system, real-time full-duplex voice communication is achieved when the Internet is unavailable and a user strongly needs a mobile Internet real-time voice communication product, and a channel is provided for the PSTN to access the mobile Internet voice platform.

Preferably, as shown in fig. 2, the PSTN access end system 20 may further include: a WEB server 204; and an Interactive Voice Response System (IVRS) 206, connected to both the WEB server and the CTIS, which calls the interface of the WEB server to obtain room channel information and announces the room channel information when the user reaches the corresponding flow.

The IVRS 206 encapsulates an HTTP access interface, calls the WEB server 204 interface to acquire room channel information, announces the room information (type, number of people, and the like) when a user reaches the corresponding flow, and lets the telephone user enter the corresponding flow through the IVR. When the telephone user chooses to enter a channel, the IVRS passes the room ID to the CTIS through its interface with the CTIS. Finally, the CTIS calls the corresponding libCES interface to enter the channel.

To support the IVRS function of listening to the room sound of the voice social platform, the voice social platform (CMS) implements a virtual "transparent" user that delivers the corresponding room audio to the CTIS gateway. A channel user manager needs to be designed in the CTIS; it records the number of telephone participants on the current gateway node and also implements the audio stream subscription function.

The CTIS is the main component on the PSTN side that handles audio data docking with the voice social platform channels. It is responsible for calling the interface, provided by the libCES shared library module embedded in the CTIS, that requests entry to a voice social platform channel and adds a user to the voice channel. The CTIS implements a channel and channel member manager to manage the corresponding in-memory data, and updates the data in the manager whenever it calls the embedded libCES shared library module to enter (or leave) a room channel. To save traffic bandwidth in the CTIS machine room, the downlink voice of a room channel is sent to the CTIS only through a virtual user channel (the corresponding virtual user management mechanism is implemented cooperatively by the CMS, CES and libCES). The CTIS calls the corresponding libCES interface to access a room channel of the voice social platform, and once the room has been joined successfully, libCES outputs the audio data through a callback. The CTIS then delivers the continuously received voice packet data (G.711 A-law) of a given room, according to the room channel and member information it manages, to the corresponding underlying hardware channel of the board card through the driver API, so that the audio is played on the telephone line.

To realize real-time chat between telephone users and voice social platform users, libCES creates a full-duplex voice channel for each speaking user, over which that user's uplink and downlink audio data is carried.

Preferably, the CTIS includes: a downstream receiving port, an upstream sending port, a computer telephony integration (CTI) controller connected to both the downstream receiving port and the upstream sending port, and CTI relay switching equipment connected to the CTI controller.

The CTIS is generally deployed on an industrial computer (IPC). Common digital trunk switching equipment (such as the Sanhui SHD-120D-CT digital trunk voice card or the Donjin Keygoe 3003 multimedia switch) can be inserted into a PCI slot of the industrial computer or connected to it, and the digital trunk coaxial cable is connected to the back panel of the digital trunk switching equipment. The hardware driver of the voice card is installed on the IPC; once started, the CTIS software operates the voice card driver to control access to the trunk line (basic operations such as answering calls, recording audio, and playing audio).

Because the CTIS has to work with board card hardware, and hardware driver versions are constrained by each hardware manufacturer, operating system support is often uneven. The CTIS may therefore exist in several system versions, the common ones being for the Windows and Linux platforms. Accordingly, libCES must be provided for multiple operating systems: libCES.dll for Windows and libCES.so for Linux.

At run time, the operating system loads libCES into the memory address space allocated to the CTIS process. libCES is the application-layer integration SDK of the CES service; it exposes its services through an API and is designed around function-level interface calls, which makes integration by the CTIS and other access parties easier.

libCES communicates with the CES: signaling is transmitted over a TCP channel and voice over a UDP channel. libCES encapsulates the various instruction protocols and the audio stream data, and provides an API-level SDK for integration by the application layer.

libCES implements its callback notification interface with C function pointers. The CTIS application-layer program passes the addresses of the functions that receive instructions and audio data into the libCES module; after unpacking and converting the different kinds of data, libCES passes the data to the CTIS application layer through these callback function pointers.
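The callback registration pattern described above can be sketched as follows. This is a minimal illustration in Python, where callables stand in for the C function pointers; the class name, method names and packet kinds are hypothetical and are not taken from the actual libCES API.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical packet kinds; the patent does not specify a wire format.
KIND_COMMAND = 0
KIND_AUDIO = 1


class LibCesSketch:
    """Illustrative stand-in for the libCES callback notification interface."""

    def __init__(self) -> None:
        self._on_command: Optional[Callable[[str], None]] = None
        self._on_audio: Optional[Callable[[int, bytes], None]] = None

    def register_callbacks(
        self,
        on_command: Callable[[str], None],
        on_audio: Callable[[int, bytes], None],
    ) -> None:
        # The application layer (the CTIS) hands over the functions that
        # will receive instructions and audio data.
        self._on_command = on_command
        self._on_audio = on_audio

    def handle_packet(self, kind: int, room_id: int, payload: bytes) -> None:
        # Convert the payload according to its kind, then pass it up
        # to the application layer through the registered callback.
        if kind == KIND_COMMAND and self._on_command is not None:
            self._on_command(payload.decode("utf-8"))
        elif kind == KIND_AUDIO and self._on_audio is not None:
            self._on_audio(room_id, payload)


received_commands: List[str] = []
received_audio: List[Tuple[int, bytes]] = []

lib = LibCesSketch()
lib.register_callbacks(
    received_commands.append,
    lambda room_id, frame: received_audio.append((room_id, frame)),
)
lib.handle_packet(KIND_COMMAND, 0, b"joined")
lib.handle_packet(KIND_AUDIO, 42, b"\xd5" * 160)
```

In a C implementation the two callables would be function pointers stored by the library, but the control flow (register once, dispatch on each received packet) would be the same.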

The IVRS obtains the list information of the chat room of the voice social platform, provides a room ID when communicating with the CTIS, and calls a corresponding interface of libCES by the CTIS to complete the function of accessing to the chat room channel of the mobile internet voice social platform. Meanwhile, the libCES callback outputs room audio data.

Preferably, the CES includes: an uplink receiving port, an encoder connected with the uplink receiving port, and an uplink transmitting port connected with the encoder; a downlink receiving port, a jitter buffer controller connected with the downlink receiving port, an audio decoder connected with the jitter buffer controller, an audio mixer connected with the audio decoder, and a downlink transmitting port connected with the audio mixer.

Preferably, a Transmission Control Protocol (TCP) channel for transmitting signaling is established between the CES and the CMS, and a User Datagram Protocol (UDP) channel for transmitting voice is established between the CES and the CS.

Preferably, a TCP channel for transmitting signaling and a UDP channel for transmitting voice are established between the CES and the CTIS.

Fig. 3 is a schematic diagram of the structure of CES and CTIS according to the preferred embodiment of the present invention.

As shown in fig. 3, the CES is a core server that acts as a virtual mobile internet voice social APP terminal, and is the upper-level audio processing unit (analogous to the upper computer in an automation control system) of each PSTN access point.

libCES serves as the application integration SDK of the CES and forms the communication bridge between the CES and the CTIS and other servers. Fig. 3 treats libCES as transparent and omits the module, so as to show the organizational relationship between the servers.

After the "join channel" signaling between the CES and the CMS succeeds, the CES establishes a UDP communication channel with the CS (communicating over the RTP protocol) and sends and receives data in the designated channel. The UDP channel is full duplex and carries both downlink and uplink audio packets. The collection, encoding, compression, transmission and similar processing of VOIP audio packets are not the focus of the present invention and are not described in detail.

When receiving downlink voice, the jitter buffer controller of the CES handles transmission jitter buffering of the audio data and the audio decoder decodes the audio. The audio mixer then mixes the decoded audio with the received room background music media stream (the music media stream receiving module may perform steps such as decoding and resampling). After conversion of the sample width, sampling frequency and coding format, the audio data is sent to the CTIS through the downlink sending port, and the CTIS drives the board card to output the audio data to the PSTN line.
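Two stages of this downlink path, jitter buffering and mixing, can be sketched as below. This is a simplified illustration only: the class and function names are hypothetical, the buffer depth is an assumed parameter, sequence number wraparound is ignored, and decoded audio is represented as lists of 16-bit PCM sample values.

```python
import heapq
from typing import List, Optional, Tuple


class JitterBuffer:
    """Reorders packets by sequence number and releases them after a short delay."""

    def __init__(self, depth: int = 3) -> None:
        self._depth = depth  # packets held back before playout starts (assumed)
        self._heap: List[Tuple[int, List[int]]] = []

    def push(self, seq: int, samples: List[int]) -> None:
        heapq.heappush(self._heap, (seq, samples))

    def pop(self) -> Optional[Tuple[int, List[int]]]:
        # Release the oldest packet only once enough packets are buffered.
        if len(self._heap) < self._depth:
            return None
        return heapq.heappop(self._heap)


def mix_pcm16(voice: List[int], music: List[int]) -> List[int]:
    """Sum two equal-rate 16-bit PCM frames, clamping to the int16 range."""
    length = max(len(voice), len(music))
    mixed = []
    for i in range(length):
        v = voice[i] if i < len(voice) else 0
        m = music[i] if i < len(music) else 0
        mixed.append(max(-32768, min(32767, v + m)))
    return mixed


jb = JitterBuffer(depth=3)
for seq in (2, 1, 3):  # packets arrive out of order
    jb.push(seq, [seq * 1000] * 4)
first = jb.pop()  # the packet with the lowest sequence number comes out first

mixed = mix_pcm16([30000, -30000, 100], [5000, -5000, 50])
```

Clamping is the simplest way to avoid integer overflow when summing streams; a production mixer would more likely apply gain control before the sum.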

When receiving the uplink voice, the CES encoder is responsible for encoding and compressing the audio data, the uplink transmitting port is responsible for packet transmission, and finally the audio data is sent to the CS for processing.

Audio between the CES and the CS can be coded with the SILK algorithm (bit rate 30 kbps), and audio between the CES and the CTIS can be coded with the G.711 A-law algorithm (bit rate 64 kbps). All audio encoding and decoding is performed in the CES.
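The two bit rates translate into per-packet payload sizes as in the short calculation below. The 20 ms frame length is an assumption made for illustration; the text does not state a packetization interval.

```python
def bytes_per_frame(bitrate_bps: int, frame_ms: int) -> float:
    """Payload bytes carried by one frame at the given bit rate."""
    return bitrate_bps * frame_ms / 1000 / 8


G711_BITRATE_BPS = 8000 * 8  # 8 kHz sampling x 8 bits per sample = 64000 bps
SILK_BITRATE_BPS = 30_000    # figure quoted in the text
FRAME_MS = 20                # assumption: 20 ms packetization

g711_bytes = bytes_per_frame(G711_BITRATE_BPS, FRAME_MS)
silk_bytes = bytes_per_frame(SILK_BITRATE_BPS, FRAME_MS)
```

Under that assumption a G.711 frame carries 160 bytes and a SILK frame 75 bytes, so the CES to CS leg needs less than half the payload bandwidth of the CES to CTIS leg.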

As shown in fig. 3, the CTIS is the processing server that connects to the PSTN and controls the voice trunk board card to exchange audio data with the board card link, and is the lower-level audio processing unit (analogous to the lower computer in an automation control system) of each PSTN access point.

For example, through the digital trunk switching equipment, the CTIS presents the N (e.g., 32) timeslots of an E1 line as 30 channels (timeslots 0 and 16 are the synchronization and signaling slots), so each E1 line supports 30 concurrent users.
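The E1 capacity figures can be checked with a few lines of arithmetic. The constants follow the standard E1 frame layout (32 timeslots of 64 kbps each); the helper names are illustrative.

```python
E1_TIMESLOTS = 32
RESERVED_TIMESLOTS = (0, 16)  # synchronization and signaling
TIMESLOT_RATE_KBPS = 64


def voice_timeslots() -> list:
    """Timeslots of one E1 line that can carry a voice channel."""
    return [ts for ts in range(E1_TIMESLOTS) if ts not in RESERVED_TIMESLOTS]


def e1_lines_needed(concurrent_users: int) -> int:
    """Number of E1 lines required for a given number of concurrent callers."""
    per_line = len(voice_timeslots())
    return -(-concurrent_users // per_line)  # ceiling division


channels_per_e1 = len(voice_timeslots())
line_rate_kbps = E1_TIMESLOTS * TIMESLOT_RATE_KBPS
lines_for_100_users = e1_lines_needed(100)
```

Each line therefore offers 30 voice channels at a gross rate of 2048 kbps, and 100 concurrent callers would need four E1 lines.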

The CTIS can create a virtual user (created and managed in advance) for each room channel of the voice social platform, and is used for uniformly receiving downlink voice of each channel of the voice social platform, and if a PSTN telephone user only listens to channel content, the CTIS only needs to subscribe the channel audio stream. The CTIS will deliver audio data to each audio subscriber.

When a PSTN user wishes to speak, the CTIS creates, through the CES, a dedicated signaling channel (signaling is carried on a dedicated TCP channel established between the CES and the ES and forwarded to the CMS through the ES) and a dedicated audio stream channel (the audio stream is exchanged with the CS over UDP) for that user, so that full-duplex real-time audio data is sent and received. The CTIS exchanges audio data with the voice board card using the PCMA (G.711 A-law) voice compression standard.

The CTIS performs in-memory playback and in-memory recording by controlling the board card through its driver, and thereby controls real-time audio recording and playback with the VOIP system. For playback, PCMA data in memory is passed to the board card driver; likewise, during recording, the board card calls back into the CTIS program through the driver to deliver the recorded PCMA data.

Preferably, as shown in fig. 4, the CMS 102 is connected to the CS 100 through a full duplex UDP communication channel; the mobile internet voice platform 10 may further include: an access server ES 104 for providing an interface of the room channel information is connected to the CMS 102 and the WEB server 204 of the PSTN access system 20, wherein the ES 104 communicates with the WEB server 204 through the HTTP protocol.

The CMS is the voice chat channel management server of the social platform and one of the core servers of its real-time voice function. In this access scheme it provides the interface logic for PSTN access users to join a chat room: it creates a user and joins the user to the CS, and returns to the client (the CES) the access information that the CS allocates to the user, such as the IP address and port of the RTP service.

For administrative and cost reasons, digital trunk E1 lines leased from, or operated in cooperation with, telecom carriers such as China Telecom, China Unicom and China Mobile can typically only receive calls from, or place calls to, numbers within that carrier's network. The CMS therefore supports access by multiple CES nodes, that is, IVR systems are deployed in the machine rooms of several operators. There are thus multiple CES nodes, and each CES applies to join the same chat room channel as the service requires.

The CES can virtualize a certain number of social platform audio clients; it communicates with the CMS to connect virtual users to the CS and obtains the real-time audio transmission information allocated by the CS, such as the RTP port and SSRC.
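The SSRC mentioned above is a field of the fixed 12-byte RTP header. The sketch below packs and parses that header following the standard RTP layout (version 2, no padding, no extension, no CSRC entries). The payload type 96 and the other field values are hypothetical; the text does not state which values the CS assigns.

```python
import struct
from typing import NamedTuple

# Fixed RTP header: flags byte, marker/payload-type byte,
# 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
RTP_HEADER = struct.Struct("!BBHII")


class RtpHeader(NamedTuple):
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def pack_rtp_header(payload_type: int, sequence: int, timestamp: int, ssrc: int) -> bytes:
    first_byte = 2 << 6  # version 2, padding/extension bits clear, CSRC count 0
    return RTP_HEADER.pack(
        first_byte,
        payload_type & 0x7F,  # marker bit left clear
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def parse_rtp_header(packet: bytes) -> RtpHeader:
    if len(packet) < RTP_HEADER.size:
        raise ValueError("packet shorter than the fixed RTP header")
    first, second, seq, ts, ssrc = RTP_HEADER.unpack_from(packet)
    return RtpHeader(first >> 6, second & 0x7F, seq, ts, ssrc)


# Hypothetical values: dynamic payload type 96 and an SSRC as the CS might assign it.
header_bytes = pack_rtp_header(payload_type=96, sequence=1, timestamp=160, ssrc=0x1234)
parsed = parse_rtp_header(header_bytes + b"payload")
```

A receiver can use the SSRC to tell apart the streams of different virtual users that arrive on the same UDP port.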

The CES receives the chat room channel voice using the RTP protocol and the parameter rules agreed with the CS, performs jitter buffering and audio decoding, and resamples the output audio into G.711 A-law PCM data with the 8 kHz sampling frequency and 8-bit sample width supported by the PSTN line.
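The final conversion step, from linear 16-bit PCM to 8-bit G.711 A-law, can be sketched with the widely used segment-table formulation of the algorithm. This is an illustration of standard A-law companding, not necessarily the routine used in the CES; resampling to 8 kHz is assumed to have happened already, and the function names are illustrative.

```python
from typing import Iterable

# Upper bound of each A-law segment, on the 13-bit magnitude scale.
_SEGMENT_END = (0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF)


def linear_to_alaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as an 8-bit G.711 A-law code."""
    sample = max(-32768, min(32767, sample))
    value = sample >> 3  # reduce to the 13-bit range used by A-law
    if value >= 0:
        mask = 0xD5  # sign bit set, combined with even-bit inversion
    else:
        mask = 0x55
        value = -value - 1
    for segment, end in enumerate(_SEGMENT_END):
        if value <= end:
            break
    else:
        return 0x7F ^ mask  # out of range: largest magnitude
    code = segment << 4
    shift = 1 if segment < 2 else segment
    code |= (value >> shift) & 0x0F
    return code ^ mask


def encode_frame(samples: Iterable[int]) -> bytes:
    """Encode a sequence of PCM samples into A-law bytes, one byte per sample."""
    return bytes(linear_to_alaw(s) for s in samples)


silence = encode_frame([0] * 160)  # 20 ms of silence at 8 kHz
loudest = encode_frame([32767, -32768])
```

One byte per sample at 8000 samples per second gives the 64 kbps rate of a single E1 timeslot.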

The CES communicates with libCES (libCES.dll on Windows or libCES.so on Linux) in the gateway system, which has the IVR function and is connected to the PSTN, and outputs instruction information and audio data to the CTIS. The CTIS in turn controls the CTI relay switching equipment through its API and outputs the media information to the digital trunk link.

Considering factors such as migration and development complexity of the voice system, the CES is an application service running on Linux, while the CTIS depends on the version required by the digital trunk switching equipment and may be a Windows or a Linux server program. libCES is a dynamic runtime library embedded in the CTIS. It is responsible only for serializing, packaging and transparently forwarding simple control instructions and audio data, and it does not need a network jitter control module. The CES and the CTIS therefore need to be deployed on a 1000 Mbps LAN in the same machine room to guarantee a reliable network link between them. When libCES is initialized, it actively connects, as a client, to the listening port bound by the CES and establishes stable TCP/UDP communication channels with it. Control instructions and audio packets are then forwarded over these channels.
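The serialization and packaging role of libCES can be illustrated with a simple length-prefixed frame format. The layout below (1-byte kind, 4-byte room ID, 2-byte payload length) is entirely hypothetical; the text does not disclose the actual wire format used between libCES and the CES.

```python
import struct
from typing import List, Tuple

# Hypothetical frame header: kind (1 byte), room ID (4 bytes), payload length (2 bytes).
FRAME_HEADER = struct.Struct("!BIH")

KIND_COMMAND = 0
KIND_AUDIO = 1


def pack_frame(kind: int, room_id: int, payload: bytes) -> bytes:
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 16-bit length field")
    return FRAME_HEADER.pack(kind, room_id, len(payload)) + payload


def unpack_frames(buffer: bytes) -> Tuple[List[Tuple[int, int, bytes]], bytes]:
    """Extract every complete frame; return the frames and the unconsumed tail."""
    frames = []
    offset = 0
    while len(buffer) - offset >= FRAME_HEADER.size:
        kind, room_id, length = FRAME_HEADER.unpack_from(buffer, offset)
        end = offset + FRAME_HEADER.size + length
        if end > len(buffer):
            break  # frame not fully received yet
        frames.append((kind, room_id, buffer[offset + FRAME_HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


# A TCP read may end in the middle of a frame; the tail is kept for the next read.
stream = pack_frame(KIND_COMMAND, 7, b"join") + pack_frame(KIND_AUDIO, 7, b"\xd5" * 160)[:50]
frames, remainder = unpack_frames(stream)
```

Because the library only frames and forwards data, with no jitter control of its own, the reliability of the link between the CES and the CTIS matters, which is consistent with the same-machine-room deployment described above.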

In the scenario where telephone users and APP users of the voice social platform communicate with each other, the downlink voice of each room channel is sent to the CTIS through the CES over a dedicated virtual user channel, so as to save traffic bandwidth.

Because the CTIS is connected to the digital trunk line through the digital voice board card, the line delay and transmission jitter of audio for telephone users connected over the PSTN are extremely small, and the CTIS is concentrated at one node that connects to the CS through the CES. Users served by the same CTIS gateway who listen to the same chat channel can therefore receive data through one shared downlink channel, with the data subscribed to and distributed in memory. This avoids the bandwidth wasted when several data channels carry the same data. A telephone user who speaks, by contrast, needs a dedicated full-duplex channel: that user's uplink and downlink audio is carried on the dedicated channel, and the virtual user channel data is no longer used for that user.
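The in-memory subscription and distribution of one shared downlink stream can be sketched as a small manager class. The class and method names are hypothetical; the sketch only shows how a single received frame per room is fanned out to every listening telephone line on the gateway.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class ChannelUserManager:
    """Tracks which telephone lines listen to which room on this gateway."""

    def __init__(self) -> None:
        self._listeners: Dict[int, Set[int]] = defaultdict(set)

    def subscribe(self, room_id: int, line_id: int) -> bool:
        """Add a listener; True means the shared downlink must be opened."""
        first = not self._listeners[room_id]
        self._listeners[room_id].add(line_id)
        return first

    def unsubscribe(self, room_id: int, line_id: int) -> bool:
        """Remove a listener; True means the shared downlink can be closed."""
        listeners = self._listeners.get(room_id)
        if not listeners:
            return False
        listeners.discard(line_id)
        if not listeners:
            del self._listeners[room_id]
            return True
        return False

    def listener_count(self, room_id: int) -> int:
        return len(self._listeners.get(room_id, ()))

    def fan_out(self, room_id: int, frame: bytes) -> List[Tuple[int, bytes]]:
        """Copy one downlink frame to every line subscribed to the room."""
        return [(line_id, frame) for line_id in sorted(self._listeners.get(room_id, ()))]


manager = ChannelUserManager()
opened_first = manager.subscribe(room_id=7, line_id=1)
opened_second = manager.subscribe(room_id=7, line_id=2)
deliveries = manager.fan_out(7, b"\xd5" * 160)
closed_after_first = manager.unsubscribe(7, 1)
closed_after_second = manager.unsubscribe(7, 2)
```

Only the first subscriber triggers a join toward the CES, so the upstream bandwidth per room stays constant no matter how many callers are listening.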

The ES can provide an interface to the chat room channel list for obtaining the list of open rooms that may be entered. It can also provide room lists with more personalized features, such as interest matching, crowd matching and friend matching, as well as a room information query service. The interfaces provided by the ES are exposed over the HTTP protocol, so that the HTTP client module integrated in the IVRS can call them and pull data.

The ES is the cloud entry server of the mobile Internet voice platform. It provides information such as the room list to servers such as the IVRS, and also carries the user access service of the mobile Internet voice platform. The ES therefore has a complete set of functions, such as user login and authentication, for implementing service access. As the access server of the mobile internet voice platform, the ES also acts as the network communication proxy between the mobile client application (APP) and the CMS. When the mobile client APP interacts with the CMS (for example to join a channel, leave a channel, obtain the channel member list, listen, or speak), all communication is forwarded by proxy over the long-lived TCP connection established between the APP and the ES, unlike the communication mode described in this document in which the CES connects directly to the CMS.

As a simulator of the mobile internet voice client logic, the CES connects directly to the CMS, rather than connecting to the ES as the APP does and reaching the CMS through the ES proxy. The reasons are as follows:

First, when the CES acts as a client and a PSTN user needs to enter a mobile Internet voice APP room channel, the CES checks whether a communication channel for that room channel has already been initialized and established and is available. If not, a communication channel is established by default, which implements subscription listening for that channel on the CMS/CS. This is the main working mode (the listening mode) in which the PSTN system outputs audio. In this mode a listener only receives the audio output of the room channel; thanks to the special design among the servers, this can be done anonymously, and the business logic is not built around user identity.

Second, when the CES acts as a client and a PSTN user wishes to take part in channel interaction, the system needs to establish a full-duplex voice channel. The PSTN speaking user carries out two-way real-time voice communication with other users of the mobile Internet voice system through a full-duplex voice channel dedicated to that user. A user who owns a dedicated full-duplex channel must have identity and state information displayed in the other users' clients, which an anonymous user cannot provide. The CES therefore needs to communicate with the ES to apply for a user identity token (that is, a user ID, which may be assigned temporarily or permanently), and then enters the room channel with that user ID as its identifier to interact with other users like an ordinary speaking user.

The ES, as the access server of the mobile client APP, provides a long-lived TCP connection service and carries out service interaction through customized signaling protocol packets agreed between the ES and the APP client. The structured data of these packets is serialized into a binary data stream before being transmitted over the network. In this example, because the IVRS is an external service server, customized long-lived TCP communication with the ES would be inconvenient from the standpoint of interface encapsulation. The room channel list information is therefore obtained between the IVRS and the ES over HTTP, a protocol widely used in the industry, which reduces the docking difficulty. The ES can embed HTTP server functionality to receive HTTP requests from clients and return room channel information. Alternatively, the data can be loaded by another system (e.g., a database) and served through a conventional HTTP server (e.g., Nginx, Apache, or Node.js).

Fig. 5 is a block diagram of a mobile internet voice platform system fusing a public switched telephone network PSTN according to a preferred embodiment of the present invention. As shown in fig. 5, the mobile internet voice platform system includes a mobile internet voice platform and at least one PSTN access end system connected to the mobile internet voice platform, each PSTN access end system being deployed in its own machine room.

In fig. 5, in the mobile internet voice platform, the CMS is connected to the ES and the CS, respectively. In the PSTN access end system, the CTIS is respectively connected with the CES and the IVRS, and the IVRS is connected with the WEB server. The WEB server communicates with the ES through an HTTP protocol, a TCP channel for transmitting signaling is established between the CMS and the CES, and a full-duplex UDP channel for transmitting voice is established between the CES and the CS.

In the preferred implementation, a user of the PSTN system gains access under the cooperative control of the IVRS and the CTIS, is authenticated and authorized by the ES through an interface of the WEB server, and then initiates a request to the CMS through the CES. The CMS creates a virtual user, requests the CS to allocate information such as an address and port, and replies to the CES. Thereafter, with the CTIS providing communication access and the IVRS managing the data and service logic, the PSTN telephone user is represented at the CES end as a virtual user and communicates with other users in the chat room through the CS.

Fig. 6 is a flowchart of a communication method of a mobile internet voice platform system fusing PSTN according to an embodiment of the present invention. As shown in fig. 6, the communication method of the PSTN-integrated mobile internet voice platform system includes:

step S601: after a Public Switched Telephone Network (PSTN) user accesses a computer telecommunication integration server CTIS, a request message is sent to a channel management server CMS through a channel access service CES;

step S603: the CMS creates a virtual user and requests to join the channel server CS;

step S605: the CMS acquires address port information from the CS and replies the address port information to the CES;

step S607: and the CES transmits the address port information to the CTIS.

In the method shown in fig. 6, a PSTN user, upon accessing the CTIS, initiates a request to the CMS via the CES. The CMS creates a virtual user, acquires address port information from the CS, and replies the address port information to the CES. With this communication method, real-time full-duplex voice communication is achieved when the Internet is unavailable but the user strongly wishes to use a mobile Internet real-time voice communication product, and a channel is provided through which the PSTN network accesses the mobile Internet voice platform.
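As a non-limiting illustration, steps S601 to S607 can be sketched as follows in Python. The class names, the example address `10.0.0.5`, the starting port `40000`, and the use of direct method calls in place of network signaling are all assumptions made for the example.

```python
from itertools import count

class ChannelServer:
    """CS: allocates a voice address and port for each joining virtual user."""

    def __init__(self, host="10.0.0.5", first_port=40000):  # hypothetical values
        self.host = host
        self._ports = count(first_port)
        self.members = {}

    def join(self, virtual_user_id, room_id):
        port = next(self._ports)
        self.members[virtual_user_id] = (room_id, port)
        return self.host, port

class ChannelManagementServer:
    """CMS: creates virtual users and brokers the join with the CS."""

    def __init__(self, cs):
        self.cs = cs
        self._ids = count(1)

    def handle_request(self, room_id):
        virtual_user_id = next(self._ids)                 # S603: create a virtual user
        address = self.cs.join(virtual_user_id, room_id)  # S603/S605: join the CS, get address and port
        return {"virtual_user_id": virtual_user_id, "address": address}

class ChannelAccessServer:
    """CES: forwards the PSTN user's request and relays the reply."""

    def __init__(self, cms):
        self.cms = cms

    def request_channel(self, room_id):
        reply = self.cms.handle_request(room_id)  # S601: request sent to the CMS
        return reply["address"]                   # S607: address and port passed on to the CTIS

cs = ChannelServer()
ces = ChannelAccessServer(ChannelManagementServer(cs))
address_for_ctis = ces.request_channel(room_id=1001)
```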

Preferably, after the PSTN user accesses the Computer Telecommunication Integration Server (CTIS), before sending the request message to the channel management server CMS through the channel access service (CES), the following process may be further included: sending an authentication request to an access server (ES) through an interface of a WEB server; and after the ES passes the authentication of the PSTN user, the CES initiates the request message.

Preferably, before the request message is sent to the Channel Management Server (CMS) through the channel access service (CES), the method may further include: the CES receives a register node instruction, which is passed by the client application layer by calling an interface of the libCES module embedded in the CTIS; the CES creates a TCP channel connection with the CMS, manages the connection object, and transmits the register node instruction to the CMS; the CMS maintains the TCP channel connection, records the registration node information after receiving the register node instruction, and returns the registration result to the CES; and the CES returns the registration result to the libCES module of the CTIS.

The above preferred embodiment is further described below in conjunction with fig. 7.

Fig. 7 is a timing diagram of a CES node registration in accordance with a preferred embodiment of the present invention. As shown in fig. 7, the CES node registration procedure mainly includes:

Step S701: The client application layer calls the interface of the libCES shared library.

Step S703: an instruction is passed to the CES server to notify the CES to register with the CMS.

Step S705: CES creates a TCP connection with the CMS and saves the connection descriptor.

Step S707: the registration instruction packet is sent over the TCP channel.

Step S709: The CMS listens for connection requests from the network, and accepts and maintains the connection after receiving the TCP connection request. It then receives the registration instruction packet from the CES on that channel, records the registration node information, and returns a registration result packet.

Step S711: After receiving the registration result packet, the CES updates its internal state and forwards the result to libCES.
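As a non-limiting illustration, the registration exchange of steps S705 to S711 can be sketched as follows in Python. The packet layout (a 2-byte command ID followed by a 2-byte payload length), the command values, and the result codes are assumptions made for the example, and the TCP channel is replaced by a direct method call so that the sketch stays self-contained.

```python
import struct

REGISTER_NODE = 0x0001         # hypothetical command IDs
REGISTER_RESULT = 0x0002
HEADER = struct.Struct("!HH")  # command ID, payload length (network byte order)

def pack(command, payload=b""):
    """Serialize one signaling packet into a binary stream."""
    return HEADER.pack(command, len(payload)) + payload

def unpack(data):
    """Parse one signaling packet back into (command, payload)."""
    command, length = HEADER.unpack_from(data)
    return command, data[HEADER.size:HEADER.size + length]

class Cms:
    """CMS side: records registered nodes and answers with a result packet."""

    def __init__(self):
        self.nodes = set()

    def on_packet(self, data):
        command, payload = unpack(data)
        if command != REGISTER_NODE:
            return pack(REGISTER_RESULT, b"\x01")  # 1 = rejected (assumed code)
        self.nodes.add(payload.decode("utf-8"))    # S709: record node information
        return pack(REGISTER_RESULT, b"\x00")      # 0 = success (assumed code)

class Ces:
    """CES side: sends the registration packet and keeps its own state."""

    def __init__(self, cms, node_name):
        self.cms = cms              # stands in for the TCP connection of S705
        self.node_name = node_name
        self.registered = False

    def register(self):
        reply = self.cms.on_packet(pack(REGISTER_NODE, self.node_name.encode("utf-8")))  # S707
        command, payload = unpack(reply)
        self.registered = command == REGISTER_RESULT and payload == b"\x00"  # S711
        return self.registered      # result handed back to libCES
```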

Preferably, the sending, by the CES, the audio information to the CTIS may further include: the CES receiving downlink audio data from the CS; the CES establishes connection with an RTMP server and receives background music data from the RTMP server; the CES determines whether the sampling rate of the background music data is identical to the sampling rate of the downlink audio data, and if not, performs resampling on the background music data; and after the background music data with the consistent sampling rate and the downlink audio data are mixed, the mixed data are directly transmitted to a libCES module embedded in the CTIS and are synchronized to a PSTN trunk line.

Fig. 8 is a flowchart of room music URI change and audio mixing according to a preferred embodiment of the present invention. As shown in fig. 8, the process of room music URI change and audio mixing mainly includes:

Step S801: The CES receives a channel background music state change notification, STRU_CMS_CES_MUSIC_RTMP_CHG_ID.

Step S803: The CES determines whether it is currently connected to an RTMP server. If so, step S805 is performed; otherwise, step S809 is performed.

Step S805: The CES disconnects from the original RTMP server.

Step S807: The CES empties the music buffer.

Step S809: New RTMP channel information is set.

Step S811: The CES connects to the new RTMP server.

Step S813: After connecting to the RTMP server, the CES subscribes to the background music data of the specified channel.

Step S815: CES decodes music data into PCM data streams.

Step S817: the CES receives downstream audio data from the CS.

Step S819: The CES determines whether the sampling rate of the background music data is identical to the sampling rate of the downstream audio data. If not, step S821 is performed; if so, step S823 is performed directly.

Step S821: Resampling is performed on the background music data.

Step S823: The background music data, now at the same sampling rate, is mixed with the downstream audio data.

Step S825: The mixed audio data is transmitted directly to the libCES module embedded in the CTIS and synchronized to the PSTN trunk line.
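As a non-limiting illustration, steps S819 to S823 can be sketched as follows in Python for 16-bit PCM samples held in lists. Linear interpolation and simple additive mixing with clipping are assumptions made for the example; the embodiment does not name a particular resampling or mixing algorithm, and a production system would normally use a band-limited resampler.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample PCM samples by linear interpolation (simple, not band-limited)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    out = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate             # position in the source stream
        left = min(int(pos), len(samples) - 1)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(round(samples[left] * (1 - frac) + samples[right] * frac))
    return out

def mix_pcm16(voice, music):
    """Add two PCM streams sample by sample, clipping to the 16-bit range."""
    mixed = []
    for i in range(max(len(voice), len(music))):
        v = voice[i] if i < len(voice) else 0
        m = music[i] if i < len(music) else 0
        mixed.append(max(-32768, min(32767, v + m)))
    return mixed

def mix_room_audio(voice, voice_rate, music, music_rate):
    """S819: compare rates; S821: resample music if needed; S823: mix."""
    if music_rate != voice_rate:
        music = resample_linear(music, music_rate, voice_rate)
    return mix_pcm16(voice, music)
```

The result of `mix_room_audio` corresponds to the data that step S825 passes to the libCES module.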

Social applications generally provide a music function so that users can create various atmospheres in a chat room, and they generally use the RTMP protocol to publish and subscribe to song channels. An independent music channel is allocated to each room; when a client connects to the RTMP server, it passes the ID of the music channel so that song data can be uploaded to, distributed on, and downloaded from the subscribed music channel.

The mobile internet APP client side receives the audio data of the chat room, simultaneously connects and receives the music data issued by the music server, and determines whether resampling is needed according to the decoded music data format. The resampled music data is mixed with the audio information of the chat room and output to the audio data interface for the user to listen.

The mobile internet voice platform system likewise interfaces with the RTMP module, except that this work is fully encapsulated in the CES. The mixed data is transmitted directly to libCES and synchronized to the E1 line of the PSTN digital trunk, so that the PSTN telephone end receives the mixed audio with background music.

Preferably, the communication method may further include: the interactive voice response system IVRS of the PSTN access end system responds to the operation of the PSTN user, and when the PSTN user selects a room channel entering a mobile Internet voice platform, the room channel information acquired from the ES is transmitted to the CTIS, so that the CTIS adds the PSTN user into the room channel; when the PSTN user selects the room channel to listen to through operation, the CES receives audio data of the room channel through a virtual user simplex channel specially established for the PSTN user and sends the audio data to the CTIS; when the PSTN user selects to speak in the process of listening to the room channel through operation, the CTIS cancels subscription of downstream audio data of a simplex channel of the PSTN user and invokes an interface of the libCES module to create a full-duplex channel, wherein the full-duplex channel is used for receiving the downstream audio data from the room channel and sending the voice data spoken by the PSTN user.

In the preferred implementation, after a PSTN telephone dials in, the Interactive Voice Response System (IVRS) of the PSTN access end system guides the user by IVR voice prompts to press keys and enter the integration flow. When the user presses a key, the IVRS obtains the user's response through its interface with the CTIS and proceeds to the next step of logic processing. Finally, the IVRS directs the CTIS to enter a room channel of a given voice social APP platform. The room selection designed in the IVRS can be further extended within the IVR flow; for example, the room of the user or of a designated user can be looked up and entered by a unique identifier such as a membership number or channel number.

The telephone user dials the dedicated service number of the IVR system to enter the IVR flow. The user then enters a chat room by pressing keys under the guidance of the IVR voice prompts and, by default, listens to the audio content of the chat room as a listener only. For example, in a room in host mode, when the user presses key 1 to apply to speak, the IVRS plays the voice prompt "your application has been sent, please wait for the host to schedule it; press 2 to cancel the application to speak". When the user presses key 2, the system checks whether there is a pending application to speak; if so, it plays the voice prompt "your application to speak has been cancelled", and if not, no prompt is played. When the user presses key 0, the voice prompt informs the user that he or she has "quit the chat room and returned to the upper menu".
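As a non-limiting illustration, the key handling described for a host-mode room can be sketched as follows in Python. The class name, the attribute names, and the exact prompt wording are assumptions made for the example.

```python
class HostModeRoomMenu:
    """Tracks one PSTN caller's keys inside a host-mode chat room."""

    def __init__(self):
        self.speech_requested = False  # is an application to speak pending?
        self.in_room = True

    def on_key(self, key):
        """Return the voice prompt to play, or None when no prompt is needed."""
        if key == "1":
            self.speech_requested = True
            return ("Your application has been sent, please wait for the host; "
                    "press 2 to cancel the application to speak.")
        if key == "2":
            if self.speech_requested:
                self.speech_requested = False
                return "Your application to speak has been cancelled."
            return None                # nothing pending: no prompt
        if key == "0":
            self.in_room = False
            return "You have quit the chat room and returned to the upper menu."
        return None                    # other keys are ignored in this sketch
```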

When the user chooses to listen to a room, the audio data received over the virtual-user simplex channel established for PSTN users between the CES and the CMS/CS is played. Specifically, if a chat room has multiple PSTN users, the CTIS, controlling the CES through libCES, establishes only one signaling connection with the CMS/CS and receives only one copy of the audio data; member management and the subscription and distribution of audio data are then handled within the CTIS.
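As a non-limiting illustration, the handling of several PSTN listeners in one room with a single upstream subscription can be sketched as follows in Python. The `subscribe` and `unsubscribe` callbacks stand in for libCES calls whose real names are not given in the embodiment, and releasing the subscription when the last listener leaves is an assumption made for the example.

```python
from collections import defaultdict

class RoomFanOut:
    """Keeps one upstream subscription per room and copies audio to every line."""

    def __init__(self, subscribe, unsubscribe):
        self._subscribe = subscribe      # hypothetical libCES call
        self._unsubscribe = unsubscribe  # hypothetical libCES call
        self._listeners = defaultdict(set)

    def add_listener(self, room_id, line_id):
        if not self._listeners[room_id]:
            self._subscribe(room_id)     # first listener: one signaling connection
        self._listeners[room_id].add(line_id)

    def remove_listener(self, room_id, line_id):
        listeners = self._listeners.get(room_id)
        if not listeners or line_id not in listeners:
            return
        listeners.discard(line_id)
        if not listeners:                # last listener left (assumed policy)
            del self._listeners[room_id]
            self._unsubscribe(room_id)

    def on_audio(self, room_id, frame):
        """Return (line_id, frame) pairs: one upstream frame for every line."""
        return [(line_id, frame) for line_id in sorted(self._listeners.get(room_id, ()))]
```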

While listening in a chat room, the user can press a key at any time to speak. At that point, the CTIS cancels the user's subscription to the downlink audio data of the virtual user's simplex channel in the room. It then calls the libCES interface to create a full-duplex channel exclusive to that user, through which the audio of the room is received and the user's speech data is sent.

In the speaking state, the user can press a key at any time to stop speaking. For the sake of user experience, the exclusive full-duplex channel is not disconnected immediately; only the uplink voice data switch is turned off. If the user later wants to speak again, the system simply turns the uplink voice data switch back on. Depending on the number of users, however, if many users who applied to speak have since stopped speaking (closed the microphone), the system also needs to decide when to switch such users back to subscribing to the virtual user's downlink simplex channel and close their full-duplex channels, so as to save traffic.
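As a non-limiting illustration, the switching between the shared simplex channel and the exclusive full-duplex channel can be sketched as follows in Python. The idle timeout of 60 seconds used to decide when to fall back to the simplex channel is an assumption made for the example; the embodiment leaves the timing of that switch open.

```python
from enum import Enum

class Mode(Enum):
    SIMPLEX_LISTEN = "simplex"   # shared virtual-user downlink channel
    FULL_DUPLEX = "full_duplex"  # channel exclusive to this user

class PstnUserChannel:
    """Tracks the channel mode and uplink switch of one PSTN user."""

    def __init__(self, idle_limit_s=60.0):   # hypothetical fallback timeout
        self.mode = Mode.SIMPLEX_LISTEN
        self.uplink_open = False
        self.idle_limit_s = idle_limit_s
        self._muted_at = None

    def start_speaking(self, now):
        if self.mode is Mode.SIMPLEX_LISTEN:
            # Cancel the simplex subscription and create the full-duplex channel.
            self.mode = Mode.FULL_DUPLEX
        self.uplink_open = True               # otherwise just reopen the uplink switch
        self._muted_at = None

    def stop_speaking(self, now):
        if self.mode is Mode.FULL_DUPLEX:
            self.uplink_open = False          # keep the channel, close the uplink switch
            self._muted_at = now

    def tick(self, now):
        """Fall back to the simplex channel after a long muted period."""
        if (self.mode is Mode.FULL_DUPLEX and not self.uplink_open
                and self._muted_at is not None
                and now - self._muted_at >= self.idle_limit_s):
            self.mode = Mode.SIMPLEX_LISTEN   # full-duplex channel closed to save traffic
            self._muted_at = None
```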

To sum up, with the above embodiments provided by the present invention, a Public Switched Telephone Network (PSTN) is connected to a mobile Internet voice platform to form a mobile Internet voice platform system fused with the PSTN. With this system, when the Internet is unavailable but the user strongly wishes to use the product, a channel is provided through which the PSTN network accesses the platform, and real-time full-duplex voice communication is achieved. Once implemented, the system effectively provides more access paths for end users of the product. The technology can also be used to connect the voice communities already on the market that are built on a pure PSTN telephone network, enriching their product content, building a closed system platform that connects the traditional PSTN voice communication community with mobile internet users, enlarging the product's user base, and enriching the channels through which people communicate.

The above disclosure is only for a few specific embodiments of the present invention, but the present invention is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present invention.

Claims

1. A mobile internet voice platform system for fusing PSTN, comprising: the system comprises a mobile Internet voice platform and at least one PSTN access end system respectively connected with the mobile Internet voice platform;

wherein each PSTN access-end system comprises:

a channel access server CES responsible for upper audio processing of the PSTN access end system is respectively connected with a channel server CS and a channel management server CMS of the mobile internet voice platform;

the computer telecommunication integration server CTIS is responsible for processing the lower audio frequency of the PSTN access end system and is connected with the CES; the CTIS starts through a software program and operates a voice card drive to realize access control on a relay line, and voice packet data is delivered to a corresponding bottom hardware channel of the voice card through the drive to realize audio playing on a telephone line;

the telephone users of the PSTN system are used as communication access through the CTIS, and the CES can virtualize a certain number of social platform audio clients and is responsible for communicating with the CMS to access the virtual users to the CS.

2. The system of claim 1,

the CES comprises:

an uplink receiving port, an audio encoder connected with the uplink receiving port, and an uplink sending port connected with the audio encoder;

a downlink receiving port, a jitter buffer controller connected with the downlink receiving port, an audio decoder connected with the jitter buffer controller, an audio mixer connected with the audio decoder, and a downlink transmitting port connected with the audio mixer;

the CTIS comprises:

a downlink receiving port, an uplink sending port, a computer telephony integration CTI controller respectively connected with the downlink receiving port and the uplink sending port, and CTI relay switching equipment connected with the CTI controller.

3. The system according to claim 1 or 2,

a Transmission Control Protocol (TCP) channel for transmitting signaling is established between the CES and the CMS, and a User Datagram Protocol (UDP) channel for transmitting voice is established between the CES and the CS;

and a TCP channel for transmitting signaling and a UDP channel for transmitting voice are established between the CES and the CTIS.

4. The system of claim 1, wherein the PSTN access end system further comprises:

a WEB server;

and the interactive voice response system IVRS is responsible for calling an interface of the WEB server to acquire room channel information and broadcasting the room channel information when a user accesses a corresponding flow, and is respectively connected with the WEB server and the CTIS.

5. The system of claim 1,

the CMS is connected with the CS through a full-duplex UDP communication channel;

the mobile internet voice platform further comprises: and an access server (ES) which is responsible for providing an interface of the room channel information and is respectively connected with the CMS and a WEB server of the PSTN access end system, wherein the ES is communicated with the WEB server through a hypertext transfer protocol (HTTP).

6. A communication method of the mobile Internet voice platform system of any one of claims 1 to 5,

after a public switched telephone network PSTN user accesses a computer telecommunication integration server CTIS, a request message is sent to a channel management server CMS through a channel access service CES;

the CMS creates a virtual user and requests to join a channel server CS;

the CMS acquires address port information from the CS and replies the address port information to the CES;

and the CES sends the address port information to the CTIS.

7. The communication method of claim 6, wherein after the PSTN user accesses the computer telecommunication integration server CTIS, before sending the request message to the channel management server CMS through the channel access service CES, further comprising:

sending an authentication request to an access server ES through an interface of a WEB server;

and after the ES passes the authentication of the PSTN user, the CES initiates the request message.

8. The communication method according to claim 6, before sending the request message to the channel management server CMS through the channel access service CES, further comprising:

the CES receives a register node instruction, wherein the register node instruction is transmitted by a client application layer through calling a libCES module interface embedded in the CTIS;

the CES creates a TCP channel connection with the CMS and manages a connection object, and transmits the register node instruction to the CMS;

the CMS maintains the TCP channel connection, records the information of the registration node after receiving the instruction of the registration node, and returns the registration result to the CES;

and the CES returns the registration result to a libCES module of the CTIS.

9. The communication method of claim 6, wherein the CES sending audio information to the CTIS comprises:

the CES receives downstream audio data from the CS;

the CES establishes connection with the RTMP server and receives background music data from the RTMP server;

the CES determines whether the sampling rate of the background music data is consistent with the sampling rate of the downlink audio data, and if not, performs resampling on the background music data;

and mixing the background music data with the consistent sampling rate with the downlink audio data, directly transmitting the mixed data to a libCES module embedded in the CTIS, and synchronizing the mixed data to a PSTN trunk line.

10. The communication method according to any one of claims 6 to 9, further comprising:

an interactive voice response system IVRS of the PSTN access end system responds to the operation of the PSTN user, and when the PSTN user selects a room channel entering a mobile Internet voice platform, room channel information acquired from an ES is transmitted to the CTIS, so that the CTIS adds the PSTN user into the room channel;

when the PSTN user selects the room channel to listen to through operation, the CES receives audio data of the room channel through a virtual user simplex channel specially established for the PSTN user and sends the audio data to the CTIS;

when the PSTN user selects to speak in the process of listening to the room channel through operation, the CTIS cancels subscription of downstream audio data of a simplex channel of the PSTN user and calls an interface of a libCES module to create a full-duplex channel, wherein the full-duplex channel is used for receiving the downstream audio data from the room channel and sending the voice data spoken by the PSTN user.