CN108683820B - Mobile internet voice platform system fusing public switched telephone network PSTN and application method thereof

Info

Publication number: CN108683820B
Application number: CN201810384107.9A
Authority: CN
Inventors: 周平; 胡海; 田维忠; 向泽清; 蔡君
Original assignee: Guiyang Longmaster Information and Technology Co ltd
Current assignee: Guiyang Longmaster Information and Technology Co ltd
Priority date: 2018-04-26
Filing date: 2018-04-26
Publication date: 2021-10-15
Anticipated expiration: 2038-04-26
Also published as: CN108683820A

Abstract

The invention discloses a mobile Internet voice platform system fusing the PSTN and an application method thereof. The method comprises the following steps: the CES receives, from the CTIS, a request packet to listen to a room channel; the CES sends the CMS a request packet in which the PSTN user applies to join the room channel; when the CMS determines that the room channel can be joined, it creates a virtual user and requests to join the CS, and after address port information is obtained from the CS, the CES sends the address port information to the libCES module embedded in the CTIS; and the libCES module and the CES each start a service process so that the CES can send audio information to the CTIS. With the technical scheme provided by the invention, real-time voice communication remains possible when the Internet is unavailable and the user has a strong demand for a mobile Internet real-time voice communication product, and a channel is provided for the PSTN to access the mobile Internet voice platform.

Description

Mobile internet voice platform system fusing public switched telephone network PSTN and application method thereof

Technical Field

The invention relates to the field of communication, in particular to a mobile internet voice platform system fusing a Public Switched Telephone Network (PSTN) and an application method thereof.

Background

With the development of information technology and the arrival of the mobile internet era, mobile social products have gradually risen. Products common on the market today, such as Weibo, QQ, WeChat, Momo and YY, provide services such as media information publishing, instant messaging, making friends with strangers, and live audio and video broadcasting. They also integrate social elements in diverse ways, so that mobile internet APPs offer rich product functions and a good user experience.

On social platform products, person-to-person communication is a basic element. The common communication forms are image-text instant messages, asynchronous voice messages, and real-time voice calls (full-duplex Voice over Internet Protocol calls, VOIP for short). Image-text instant messaging is the most convenient and most common communication tool: text or multimedia data is transmitted over WiFi, GSM, CDMA and other networks through client application software and a server. The "asynchronous voice" technology is a network-based "push-to-talk" service: when a user presses a recording button, a recording module starts, the speech audio signal is collected, encoded and compressed, and the data is transmitted to the other party's mobile phone through a network server; after the audio data is received, the recipient taps a play button and listens through the loudspeaker. This scheme conducts social communication through recorded short voice messages, frees both hands from typing, lowers the threshold of use, and improves convenience to a certain extent. Real-time voice communication is a rapidly developing VOIP technology. With the popularization of broadband networks and 4G mobile networks, real-time voice applications are widely developed and used on the mobile internet; continuous voice communication enhances the sense of reality in a virtual social environment and is favored by users.

QQ, WeChat, YY and other internet social products provide one-to-one and many-to-many real-time calls among friends. Such VOIP voice products, which come close to traditional telephone calls, give people a more comfortable communication experience, but also place higher requirements on the data transmission quality of the communication network.

In view of construction cost, convenience of use and other factors, the mobile Internet real-time voice communication products on the market basically use broadband WiFi and mobile 3G/4G networks as the carrier network. A server-side soft-switching system performs audio jitter buffering, encoding and decoding, audio mixing, and distribution, and thus forms the transmission and switching subsystem for multi-party real-time voice communication. Broadband and 4G networks are now widespread, and a 4G network can theoretically reach a bandwidth of 100 Mbps (about 12.5 MB per second). However, IP data packets cross many routing nodes on the Internet, and wireless signal problems such as interference and shielding at the terminal degrade the connection. The network access quality of some mobile devices is therefore still poor, and in situations such as Internet interruption or large transmission delay and jitter, VOIP real-time voice is not usable.

Therefore, what technical scheme can meet the demand for real-time voice communication when the Internet is unavailable and the user has a strong demand for a mobile Internet real-time voice communication product is a problem that urgently needs to be solved.

Disclosure of Invention

The main purpose of the invention is to disclose a mobile Internet voice platform system fusing the Public Switched Telephone Network (PSTN) and an application method thereof, so as to at least solve the problem in the related art that no technical scheme can meet the demand for real-time voice communication when the Internet is unavailable and the user has a strong demand for a mobile Internet real-time voice communication product.

According to one aspect of the invention, a mobile internet voice platform system fusing Public Switched Telephone Network (PSTN) is provided.

The mobile Internet voice platform system fusing the PSTN comprises the following components: a computer telecommunication integration server CTIS, in which a libCES module is embedded, configured to send a PSTN user's request packet to listen to a room channel to a channel access server CES, to receive address port information, and to start a service process to receive audio information; the CES, configured to receive the request packet to listen to the room channel from the CTIS, to send a channel management server CMS a request packet in which the PSTN user applies to join the room channel, to send the address port information obtained by the CMS to the libCES module embedded in the CTIS, and to start a service process to send audio information to the CTIS; the CMS, configured, when it determines that the room channel can be joined, to create a virtual user, request to join a channel server CS, and acquire the address port information from the CS; and the CS, configured to allocate the address port information and deliver the audio information.

According to another aspect of the invention, an application method of the PSTN-converged mobile Internet voice platform system is provided.

The application method of the mobile Internet voice platform system fusing the PSTN comprises the following steps: a channel access server CES receives, from a computer telecommunication integration server CTIS, a request packet to listen to a room channel; the CES sends a channel management server CMS a request packet in which the PSTN user applies to join the room channel; when the CMS determines that the room channel can be joined, it creates a virtual user and requests to join a channel server CS, and after address port information is obtained from the CS, the CES sends the address port information to the libCES module embedded in the CTIS; and the libCES module and the CES each start a service process so that the CES can send audio information to the CTIS.
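The steps of the method can be traced with a small in-memory model. The following Python sketch is only an illustration under stated assumptions: the class names, the method names, the address 10.0.0.5, the port range and the virtual-user naming scheme are all hypothetical, and ordinary method calls stand in for the request packets and service processes described above.

```python
import itertools


class ChannelServer:
    """CS: allocates address/port information for each joining virtual user."""

    def __init__(self, host="10.0.0.5", first_port=40000):  # hypothetical values
        self.host = host
        self._ports = itertools.count(first_port)

    def join(self, room_id, virtual_user):
        return (self.host, next(self._ports))


class ChannelManagementServer:
    """CMS: decides whether a room can be joined and creates the virtual user."""

    def __init__(self, cs, open_rooms):
        self.cs = cs
        self.open_rooms = set(open_rooms)

    def request_join(self, room_id):
        if room_id not in self.open_rooms:
            return None                       # the room channel cannot be joined
        virtual_user = f"virtual-{room_id}"   # hypothetical naming scheme
        return self.cs.join(room_id, virtual_user)


class ChannelAccessServer:
    """CES: relays the listen request and returns the address to libCES."""

    def __init__(self, cms):
        self.cms = cms
        self.sending = {}                     # room_id -> (host, port)

    def handle_listen_request(self, room_id):
        address = self.cms.request_join(room_id)
        if address is not None:
            # Stands in for starting the service process that sends audio.
            self.sending[room_id] = address
        return address


class LibCES:
    """libCES module embedded in the CTIS."""

    def __init__(self, ces):
        self.ces = ces
        self.receiving = {}                   # room_id -> (host, port)

    def listen(self, room_id):
        address = self.ces.handle_listen_request(room_id)
        if address is None:
            return False
        # Stands in for starting the service process that receives audio.
        self.receiving[room_id] = address
        return True


cs = ChannelServer()
cms = ChannelManagementServer(cs, open_rooms=[1001])
ces = ChannelAccessServer(cms)
lib = LibCES(ces)
```

Calling `lib.listen(1001)` walks the chain CTIS/libCES, CES, CMS, CS and back; a room that the CMS does not accept leaves both `sending` and `receiving` untouched.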

Compared with the prior art, the embodiment of the invention has at least the following advantages: the public switched telephone network PSTN is connected to the mobile Internet voice platform to form a mobile Internet voice platform system fused with the PSTN. With this system, real-time voice communication remains possible when the Internet is unavailable and the user has a strong demand for a mobile Internet real-time voice communication product, and a channel is provided for the PSTN to access the mobile Internet voice platform.

Drawings

Fig. 1 is a block diagram of a mobile internet voice platform system fusing a public switched telephone network PSTN according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of the structure of the CES and the CTIS according to a preferred embodiment of the present invention;

Fig. 3 is a block diagram of a mobile internet voice platform system fusing the PSTN according to a preferred embodiment of the present invention;

Fig. 4 is a flowchart of an application method of the mobile internet voice platform system fusing the PSTN according to an embodiment of the present invention;

Fig. 5 is a sequence diagram of the flow for requesting to listen to a chat room according to a preferred embodiment of the present invention;

Fig. 6 is a sequence diagram of the flow for requesting to stop listening to a chat room according to a preferred embodiment of the present invention;

Fig. 7 is a sequence diagram of the flow in which the CMS kicks out a user according to a preferred embodiment of the present invention;

Fig. 8 is a sequence diagram of the flow for requesting to speak according to a preferred embodiment of the present invention.

Detailed Description

The following detailed description of specific embodiments of the present invention is provided in conjunction with the accompanying drawings.

Fig. 1 is a block diagram of a mobile internet voice platform system fusing a public switched telephone network PSTN according to an embodiment of the present invention. As shown in fig. 1, the mobile internet voice platform system includes: the computer telecommunication integration server CTIS 10, in which the libCES module 100 is embedded, configured to send a PSTN user's request packet to listen to a room channel to the channel access server CES, to receive address port information, and to start a service process to receive audio information; the CES 12, configured to receive the request packet to listen to the room channel from the CTIS, to send the channel management server CMS a request packet in which the PSTN user applies to join the room channel, to send the address port information obtained by the CMS to the libCES module embedded in the CTIS, and to start a service process to send audio information to the CTIS; the CMS 14, configured, when it determines that the room channel can be joined, to create a virtual user, request to join the channel server CS 16, and obtain the address port information from the CS; and the CS 16, configured to allocate the address port information and deliver the audio information.

When the Internet is unavailable and the user has a strong demand for a mobile Internet real-time voice communication product, the mobile Internet voice platform system fusing the PSTN shown in fig. 1 makes real-time voice communication (for example, listening to a room channel) possible and provides a channel for the PSTN to access the mobile Internet voice platform.

Preferably, the mobile internet voice platform system may further include: a WEB server 18; and an Interactive Voice Response System (IVRS)20 for calling the interface of the WEB server to obtain room channel information and broadcasting the room channel information when the user accesses the corresponding flow, wherein the IVRS is respectively connected with the WEB server and the CTIS.

The IVRS 20 encapsulates an HTTP access interface and calls the interface of the Web server 30 to acquire room channel information. When a user reaches the corresponding flow, the IVRS announces the room information (type, number of people and the like), so that the telephone user can enter the corresponding flow through the IVR. When the telephone user chooses to enter a channel, the IVRS passes the room ID to the CTIS through its interface with the CTIS. Finally, the CTIS calls the corresponding libCES interface to enter the channel.

To let the telephone users of the IVRS 20 listen to the sound of a voice social platform room, the voice social platform (CMS) implements a virtual "transparent" user that delivers the audio of the corresponding room to the CTIS gateway. A channel user manager needs to be designed in the CTIS; it records how many telephone users are participating on the current gateway node and also implements the audio stream subscription function.

The CTIS is the main body on the PSTN side that manages the docking of audio data with the voice social platform channels. It is responsible for calling the request interface for entering a voice social platform channel, which is provided by the libCES shared library module embedded in the CTIS, and for adding users to the voice channel. The CTIS implements a channel and channel-member manager that manages the corresponding in-memory data, and it maintains this data whenever the embedded libCES shared library module is called to enter or leave a room channel. To save bandwidth in the CTIS machine room, the downlink voice of a room channel is sent to the CTIS only through a virtual user channel (the virtual user management mechanism is implemented jointly by the CMS, the CES and libCES). The CTIS calls the corresponding libCES interface to access a room channel of the voice social platform, and once the room has been entered successfully, libCES outputs the audio data through callbacks. The CTIS then delivers the continuously received voice packet data (G.711 A-law) of a room, according to the room channel and member information it manages, to the corresponding low-level hardware channel of the board card through the driver API, so that the audio is played on the telephone line.

To enable real-time chat between telephone users and voice social platform users, libCES creates a full-duplex voice channel for each speaking user, over which that user's uplink and downlink audio data is sent.

Preferably, the CTIS may include: the system comprises a downstream receiving port, an upstream sending port, a computer-telephone integrated CTI controller respectively connected with the downstream receiving port and the upstream sending port, and CTI relay switching equipment connected with the CTI controller.

The CTIS is generally deployed on an industrial computer. Common digital trunk switching equipment (such as a Sanhui SHD-120D-CT digital trunk voice card or a Donjin Keygoe3003 multimedia switch) can be inserted into a PCI slot of the industrial computer or connected to it, and a digital trunk coaxial cable is attached to the back panel of the digital trunk switching equipment. The voice card hardware driver is also installed on the industrial computer; the CTIS software starts and operates the voice card driver to control access to the trunk line (basic operations such as connecting calls, recording audio and playing audio).

Because the CTIS must interface with board card hardware, and the hardware driver versions depend on the individual hardware manufacturers, driver support often varies in quality from one operating system to another. There may therefore be several system versions of the CTIS, most commonly for the Windows and Linux platforms. Accordingly, libCES must also be provided for several operating systems, for example as libCES.dll on Windows and libCES.so on Linux.

At run time, libCES is loaded by the operating system into the memory address space allocated to the CTIS process and runs there. It is the application-layer integration SDK of the CES service: it exposes its services through an API and is designed around function-level interface calls, which makes integration easier for the CTIS and other integrating parties.

libCES communicates with the CES: signaling is transmitted over a TCP channel and voice over a UDP channel. It encapsulates the various instruction protocols and the audio stream data and provides an API-level SDK for integration by the application layer.
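Because TCP delivers a byte stream rather than discrete messages, signaling packets sent over the TCP channel need some framing so the receiver can find packet boundaries. The patent does not specify the packet layout; the sketch below assumes a hypothetical header made of a 2-byte command identifier and a 4-byte body length in network byte order, purely to illustrate the idea.

```python
import struct

# Hypothetical header layout (not taken from the patent):
# unsigned 16-bit command id, unsigned 32-bit body length, big-endian.
HEADER = struct.Struct("!HI")


def pack_signal(command, body):
    """Serialize one signaling packet into bytes for the TCP channel."""
    return HEADER.pack(command, len(body)) + body


class SignalDecoder:
    """Reassembles complete packets from arbitrarily split TCP reads."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data):
        """Append received bytes; return every complete (command, body) pair."""
        self._buffer.extend(data)
        packets = []
        while len(self._buffer) >= HEADER.size:
            command, length = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break                          # body not fully received yet
            packets.append((command, bytes(self._buffer[HEADER.size:end])))
            del self._buffer[:end]
        return packets
```

A decoder of this kind returns nothing while a packet is incomplete and returns several packets at once if they arrive in the same read, which is the behavior a long-lived TCP signaling connection needs.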

libCES implements its callback notification interface with C function pointers: the CTIS application-layer program passes the addresses of the functions that receive instructions and audio data into the libCES module, and libCES, after unpacking and converting the different kinds of data, passes the data to the CTIS application layer through these callback function pointers.
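The callback pattern described here can be illustrated as follows. This is a Python analogy, not the libCES API: Python callables stand in for the C function pointers, and the class name, method names and the "command"/"audio" kinds are hypothetical.

```python
class CallbackRegistry:
    """Holds the functions registered by the application layer (the CTIS)."""

    def __init__(self):
        self._on_command = None
        self._on_audio = None

    def register(self, on_command, on_audio):
        """The application hands over its handlers, like passing function addresses."""
        self._on_command = on_command
        self._on_audio = on_audio

    def dispatch(self, kind, room_id, payload):
        """Unpack a received item and invoke the matching registered handler."""
        if kind == "command" and self._on_command is not None:
            self._on_command(room_id, payload.decode("utf-8"))
        elif kind == "audio" and self._on_audio is not None:
            self._on_audio(room_id, payload)


# Example application-side handlers that simply record what they receive.
received = []
registry = CallbackRegistry()
registry.register(
    on_command=lambda room, text: received.append(("command", room, text)),
    on_audio=lambda room, frame: received.append(("audio", room, len(frame))),
)
registry.dispatch("command", 1001, b"joined")
registry.dispatch("audio", 1001, bytes(160))
```

The library never needs to know what the application does with the data; it only calls whatever was registered, which is what lets the same module serve the CTIS and other integrating parties.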

The IVRS obtains the chat room list of the voice social platform and provides a room ID when communicating with the CTIS; the CTIS then calls the corresponding libCES interface to access the chat room channel of the mobile internet voice social platform. Meanwhile, libCES outputs the room audio data through callbacks.

Preferably, the CES includes: an uplink receiving port, an encoder connected with the uplink receiving port, and an uplink transmitting port connected with the encoder; a downlink receiving port, a jitter buffer controller connected with the downlink receiving port, an audio decoder connected with the jitter buffer controller, an audio mixer connected with the audio decoder, and a downlink transmitting port connected with the audio mixer.

Preferably, a Transmission Control Protocol (TCP) channel for transmitting signaling is established between the CES and the CMS, and a User Datagram Protocol (UDP) channel for transmitting voice is established between the CES and the CS.

Preferably, a TCP channel for transmitting signaling and a UDP channel for transmitting voice are established between the CES and the CTIS.

Fig. 2 is a schematic diagram of the structure of CES and CTIS according to the preferred embodiment of the present invention.

As shown in fig. 2, the CES is the core server that virtualizes mobile internet voice social APP terminals, and it is the upper-level audio processing unit (comparable to the upper computer of an automation control system) of each PSTN access point.

libCES, as the application integration SDK of the CES, establishes a communication bridge between the CES and the CTIS or other servers. Fig. 2 treats libCES as transparent and omits the module in order to show the organizational relationship between the servers.

After the CES has successfully completed the "join channel" signaling with the CMS, the CES and the CS establish a UDP communication channel (using the RTP protocol) and exchange data in the designated channel. The UDP channel is full duplex and can carry both downlink and uplink audio packets. The collection, encoding, compression and transmission of VOIP audio packets are not the focus of the present invention and are not described in detail.

When downlink voice is received, the jitter buffer controller of the CES performs transmission jitter buffering of the audio data and the audio decoder decodes it. The audio mixer then mixes the decoded audio with the received room background music media stream (the music media stream receiving module may itself decode and resample the music stream). After conversion of the sample width, sampling frequency and coding format, the audio data is sent to the CTIS through the downlink sending port, and the CTIS drives the board card to output it to the PSTN line.
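The mixing step in this downlink path can be sketched as sample-wise addition with saturation. The patent does not describe the mixer's algorithm, so the following is only one common approach; the function name, the `music_gain` parameter and the use of signed 16-bit linear PCM lists are assumptions made for illustration.

```python
def mix_pcm16(voice, music, music_gain=0.5):
    """Mix decoded voice with background music, both as signed 16-bit samples.

    The shorter input is treated as silence once it runs out, and the sum is
    clamped to the 16-bit range so loud passages saturate instead of wrapping.
    """
    length = max(len(voice), len(music))
    mixed = []
    for i in range(length):
        v = voice[i] if i < len(voice) else 0
        m = music[i] if i < len(music) else 0
        sample = v + int(m * music_gain)
        mixed.append(max(-32768, min(32767, sample)))
    return mixed
```

Clamping matters because a wrapped-around sum would turn a loud passage into a sharp click, whereas saturation only distorts the peak.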

For uplink voice, the CES encoder encodes and compresses the audio data, the uplink sending port packetizes and transmits it, and the audio data is finally sent to the CS for processing.

The audio coding between the CES and the CS can use the SILK coding algorithm (code rate 30 kbps), and the audio coding between the CES and the CTIS can use the G.711 A-law coding algorithm (code rate 64 kbps). All audio encoding and decoding is performed in the CES.
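The two code rates translate directly into payload sizes per packet. The short calculation below assumes 20 ms audio frames, which is a common packetization interval but is not stated in the patent; the function name is likewise hypothetical.

```python
def payload_bytes(rate_bps, frame_ms):
    """Payload size in bytes of one audio frame at a constant code rate."""
    return rate_bps * frame_ms // 8000   # bits/s * ms / (8 bits * 1000 ms/s)


G711_RATE_BPS = 8000 * 8      # 8000 samples per second, 8 bits each = 64 kbps
SILK_RATE_BPS = 30000         # code rate given for the CES-to-CS leg
FRAME_MS = 20                 # assumed packetization interval

g711_frame = payload_bytes(G711_RATE_BPS, FRAME_MS)   # 160 bytes
silk_frame = payload_bytes(SILK_RATE_BPS, FRAME_MS)   # 75 bytes
```

Under this assumption the leg between the CES and the CTIS carries roughly twice the payload of the leg between the CES and the CS, which is consistent with placing the CES and the CTIS on a fast local network.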

As shown in fig. 2, the CTIS is the processing server that is connected to the PSTN and controls the voice trunk board card to exchange audio data with the board card link; it is the lower-level audio processing unit (comparable to the lower computer of an automation control system) of each PSTN access point.

Through the digital trunk switching equipment, the CTIS packages the N (e.g., 32) channel timeslots on one E1 into 30 channels (timeslots 0 and 16 are the synchronization and signaling timeslots), so each E1 supports 30 concurrent users.
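The timeslot arithmetic can be made concrete with a small mapping. In the sketch below, the numbering of logical voice channels from 0 to 29 and the function names are assumptions for illustration; the reserved timeslots follow the text, and 64 kbps per timeslot is the standard E1 figure.

```python
E1_TIMESLOTS = 32
RESERVED = {0: "synchronization", 16: "signaling"}
TIMESLOT_KBPS = 64                      # each E1 timeslot carries 64 kbps


def voice_timeslots():
    """Timeslots of one E1 that are available for voice."""
    return [ts for ts in range(E1_TIMESLOTS) if ts not in RESERVED]


def channel_to_timeslot(channel):
    """Map a logical voice channel (0..29, assumed numbering) to its timeslot."""
    return voice_timeslots()[channel]


E1_LINE_KBPS = E1_TIMESLOTS * TIMESLOT_KBPS   # 2048 kbps line rate
```

Logical channels 0 to 14 land on timeslots 1 to 15 and channels 15 to 29 on timeslots 17 to 31, which is why one E1 serves exactly 30 concurrent callers.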

The CTIS can create a virtual user (created and managed in advance) for each room channel of the voice social platform, and is used for uniformly receiving downlink voice of each channel of the voice social platform, and if a PSTN telephone user only listens to channel content, the CTIS only needs to subscribe the channel audio stream. The CTIS will deliver audio data to each audio subscriber.
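A minimal sketch of this subscription model follows: one downlink per room is shared by every listening telephone line on the gateway. The class and method names are hypothetical, and `play` stands in for the board card driver call that actually outputs audio on a line.

```python
class ChannelUserManager:
    """Tracks which telephone lines listen to which room on this gateway."""

    def __init__(self):
        self._subscribers = {}            # room_id -> set of line ids

    def subscribe(self, room_id, line_id):
        """Add a listener; True means the room's shared downlink must be opened."""
        lines = self._subscribers.setdefault(room_id, set())
        first = not lines
        lines.add(line_id)
        return first

    def unsubscribe(self, room_id, line_id):
        """Remove a listener; True means the shared downlink can be closed."""
        lines = self._subscribers.get(room_id)
        if lines is None:
            return False
        lines.discard(line_id)
        if lines:
            return False
        del self._subscribers[room_id]
        return True

    def listener_count(self, room_id):
        return len(self._subscribers.get(room_id, ()))

    def deliver(self, room_id, frame, play):
        """Fan one received frame out to every subscribed line."""
        for line_id in sorted(self._subscribers.get(room_id, ())):
            play(line_id, frame)
```

With this structure a room's audio crosses the network once per gateway regardless of how many callers listen, and the fan-out happens in memory.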

When a PSTN user wishes to speak, the CTIS creates an exclusive signaling (the signaling is transmitted by an exclusive TCP channel established by a CES and an ES and forwarded to the CMS through the ES) and an audio stream (the audio stream carries out UDP communication with the CS) transmission channel for the user through the CES, so that full-duplex audio real-time data receiving and transmitting are realized. The CTIS interacts with voice board audio data, and adopts PCMA (G.711a-law coding) voice compression standard.

The CTIS performs in-memory playback and in-memory recording by controlling the board card through its driver, and thereby exchanges recorded and played audio with the VOIP system in real time. For playback, PCMA data in memory is passed to the board card driver; similarly, during recording the board card driver calls back into the CTIS program to deliver the recorded PCMA data.

Preferably, the CMS 14 may be connected to the CS 16 through a full duplex UDP communication channel.

Preferably, the mobile internet voice platform system may further include an access server (ES), configured to allocate a temporary user identifier for the PSTN user, to determine the backend server to forward to after receiving and identifying a CES-ES dedicated communication packet, to authenticate the temporary user identifier in the received dedicated communication packet, and, if authentication passes, to send the request packet for joining the room channel to the CMS. The libCES module embedded in the CTIS is further configured, when the PSTN user needs to speak in the room channel, to control the CTIS to cancel the PSTN user's subscription to the downlink audio data of the room channel, to instruct the CES by message to establish a TCP communication channel to the ES, and to apply to the ES for a temporary user identifier that identifies the PSTN user. The CES is further configured to embed the service signaling data, including the temporary user identifier, in a CES-ES dedicated communication packet, to send that packet to the ES, and to send a response message indicating that the user has joined the room channel to the libCES module embedded in the CTIS. The CMS is further configured to create a virtual user, request to join the channel server CS, acquire address port information from the CS, and then embed the join-room-channel response packet containing the address port information in a CES-ES dedicated communication packet, which is forwarded to the CES via the ES.
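The speak-request sequence can be traced with the simplified model below. It is a sketch under stated assumptions only: the dictionary packet format, the `target` field, the identifier range starting at 900001, the address returned by the stand-in CMS function, and all names are hypothetical, and function calls replace the TCP exchanges.

```python
import itertools


def cms_join(room_id, temp_id):
    """Stand-in for the CMS creating a user and obtaining an address from the CS."""
    return ("10.0.0.5", 41000)                 # hypothetical address and port


class AccessServer:
    """ES: issues temporary identifiers and forwards authenticated requests."""

    def __init__(self, join_backend):
        self._ids = itertools.count(900001)    # hypothetical identifier range
        self._issued = set()
        self._join_backend = join_backend

    def allocate_temp_id(self):
        temp_id = next(self._ids)
        self._issued.add(temp_id)
        return temp_id

    def handle_dedicated_packet(self, packet):
        """Identify the backend, authenticate the identifier, then forward."""
        if packet.get("target") != "CMS":
            return {"ok": False, "reason": "unknown backend"}
        if packet.get("temp_id") not in self._issued:
            return {"ok": False, "reason": "authentication failed"}
        address = self._join_backend(packet["room_id"], packet["temp_id"])
        return {"ok": True, "address": address}


def request_speak(es, room_id, listeners, line_id):
    """Steps taken when a listening PSTN user asks to speak in the room."""
    listeners.discard(line_id)                 # cancel the shared downlink subscription
    temp_id = es.allocate_temp_id()            # apply to the ES for an identity
    packet = {"target": "CMS", "room_id": room_id, "temp_id": temp_id}
    return temp_id, es.handle_dedicated_packet(packet)


es = AccessServer(cms_join)
listeners = {3, 7}
temp_id, reply = request_speak(es, 1001, listeners, 7)
```

After the call, line 7 is no longer a shared-downlink listener and holds its own identifier and address, while a packet carrying an identifier the ES never issued is rejected before it reaches the CMS.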

The mobile Internet voice platform system fusing the PSTN thus makes real-time voice communication (for example, applying to speak in a room channel) possible and provides a channel for the PSTN to access the mobile Internet voice platform.

In a preferred embodiment, the access server ES 104, which is responsible for providing an interface for the room channel information, is connected to the CMS 102 and the WEB server 204 of the PSTN access system 20, wherein the ES 104 communicates with the WEB server 204 through the HTTP protocol.

The CMS is the voice chat channel management server of the social platform and one of the core servers of its real-time voice function. In this access scheme it provides the interface logic by which PSTN access users join a chat room: it creates a user, joins the user to the CS, and returns the information allocated by the CS for the user's access, such as the IP address and port of the RTP service, to the client (the CES).

For administrative and cost reasons, digital trunk E1 lines leased from or operated jointly with carriers such as China Telecom, China Unicom and China Mobile can typically only receive calls from, or place calls to, telephone numbers within the carrier's own network. IVR systems are therefore deployed in the machine rooms of several carriers, and the CMS supports access by multiple CES nodes. There are thus multiple CESs, and each CES applies to join the same chat room channel according to service requirements.

The CES can virtualize a certain number of social platform audio clients. It communicates with the CMS to join virtual users to the CS and obtains the real-time audio transmission information allocated by the CS, such as the RTP port and SSRC.

The CES receives the channel voice of the chat room using the RTP protocol and the parameter rules agreed with the CS, performs jitter buffering and audio decoding, and resamples the output audio into G.711 A-law PCM data with the 8 kHz sampling frequency and 8 bits per sample supported by the PSTN line.
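The final companding step can be illustrated with the widely used segment-based formulation of G.711 A-law, which maps each signed 16-bit linear sample to one 8-bit code. The sketch below covers only that step and assumes the audio has already been resampled to 8 kHz; the function names are illustrative, and the patent does not state which implementation the CES uses.

```python
# Upper bound of each of the eight A-law segments, applied after the input
# sample has been reduced from 16 bits to a 13-bit range.
SEGMENT_ENDS = (0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF)


def linear_to_alaw(sample):
    """Compress one signed 16-bit PCM sample into one 8-bit A-law code."""
    sample = max(-32768, min(32767, sample)) >> 3
    if sample >= 0:
        mask = 0xD5                      # sign bit set plus even-bit inversion
    else:
        mask = 0x55                      # even-bit inversion only
        sample = -sample - 1
    segment = next(i for i, end in enumerate(SEGMENT_ENDS) if sample <= end)
    shift = 1 if segment < 2 else segment
    return ((segment << 4) | ((sample >> shift) & 0x0F)) ^ mask


def alaw_to_linear(code):
    """Expand one 8-bit A-law code back to a signed 16-bit PCM sample."""
    code ^= 0x55
    value = (code & 0x0F) << 4
    segment = (code & 0x70) >> 4
    if segment == 0:
        value += 8
    else:
        value = (value + 0x108) << (segment - 1)
    return value if code & 0x80 else -value


def encode_frame(samples):
    """Encode 8 kHz linear samples; 8000 one-byte codes per second is 64 kbps."""
    return bytes(linear_to_alaw(s) for s in samples)
```

Quantization is fine for quiet samples and coarser for loud ones, which is how eight bits per sample can cover the dynamic range of telephone speech.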

The CES communicates with the libCES module (libCES.dll on Windows or libCES.so on Linux) of the IVR-capable gateway system connected to the PSTN and outputs instruction information and audio data to the CTIS; the CTIS server controls the CTI trunk switching equipment through its API and outputs the media information to the digital trunk link.

Considering factors such as migration and the development complexity of the voice system, the CES is an application service program running on Linux, while the CTIS depends on the version required by the digital trunk switching equipment and may be either a Windows or a Linux server program. libCES is a dynamic runtime library embedded in the CTIS. It is responsible only for simple control instructions and for serializing, packaging and transparently forwarding audio data, so it does not need a network jitter control module. The CES and the CTIS therefore need to be deployed on a 1000 Mbps LAN in the same machine room to ensure a reliable network link between them. When libCES is initialized, it actively connects, as a client, to the listening port bound by the CES and establishes a stable TCP/UDP communication channel with it. Control instructions and audio packets are then forwarded over this channel.

In the scenario where telephone users and APP users of the voice social platform communicate with each other, the downlink voice of each room channel is sent to the CTIS through the CES over a dedicated virtual user channel in order to save bandwidth.

Because the CTIS is connected to the digital trunk circuit through the digital voice board card, the line delay and transmission jitter of audio for telephone users on the PSTN are extremely small, and the CTIS is concentrated at one node that is connected to the CS through the CES. Users served by the same CTIS gateway who listen to the same chat channel can therefore share a single downlink channel, whose data is distributed to subscribers in memory. This avoids the bandwidth wasted when several data channels carry the same data. A telephone user who speaks, by contrast, needs an exclusive full-duplex channel; that user's uplink and downlink audio is carried on the dedicated channel, and the virtual user channel data is no longer used for that user.

The ES can provide an interface to the chat room channel list, through which the list of open rooms that may be entered is obtained. It can also provide room lists with more personalized features, such as interest matching, crowd matching and friend matching, as well as a room information query service. The interface provided by the ES is encapsulated in the HTTP protocol and is called by the HTTP client module integrated in the IVRS to pull data.

The ES is the cloud entry server of the mobile Internet voice platform. It provides information such as the room list to servers such as the IVRS and also carries the user access service of the platform, so it includes a complete set of functions such as user login and authentication. As the access server of the platform, the ES also acts as the network communication proxy between the mobile client application (APP) and the CMS: when the APP interacts with the CMS (for example to join a channel, leave a channel, fetch the channel member list, listen or speak), all communication is forwarded by proxy over the long-lived TCP connection established between the APP and the ES. This differs from the mode described in this document in which the CES connects directly to the CMS.

The CES, which simulates the logic of a mobile internet voice client, connects directly to the CMS, whereas the APP connects to the ES and reaches the CMS through the ES proxy. The reasons are as follows:

When the CES acts as a client and a PSTN user needs to enter a mobile Internet voice APP room channel, the CES checks whether a communication channel for that room channel has already been initialized and established. If not, it establishes one by default and thereby subscribes to, and listens to, that channel on the CMS/CS. This is the main working mode (the listening mode) in which the PSTN system outputs audio. In this mode a listener only receives the audio output of the room channel; thanks to the special design among the servers, this can be done anonymously, and the business logic is not built on user identity.

CES acts as a client and PSTN users wish to participate in channel interaction, at which time the system needs to establish a full duplex voice channel. The PSTN speaking user realizes the two-way real-time voice communication with other users of the mobile Internet voice system through the full duplex voice channel which is shared by the PSTN speaking user. The user who owns the exclusive full-duplex channel necessarily needs to display the identity and state information of the user at the product end of other users, and the user who is still anonymous cannot meet the requirement at the moment. Therefore, the CES needs to communicate with the ES to apply for a user identity token (i.e., a user ID, which may be temporarily assigned or permanently assigned), and then enter a room channel to interact with other users like a normal speaking user by using the user ID as an identifier.

The ES, as the access server of the mobile client APP, provides a TCP long-connection service and performs service interaction through customized signaling protocol packets agreed between the ES and the APP client. The structured data of these signaling packets is serialized into a binary data stream before being transmitted over the network. In this example, because the IVRS is an external service server, customized TCP long-connection communication with the ES would be inconvenient from the standpoint of interface encapsulation. Room channel list information is therefore exchanged between the IVRS and the ES over the widely used HTTP protocol, which reduces the difficulty of integration. The ES can accordingly embed an HTTP server function that receives HTTP requests from clients and returns room channel information. Alternatively, the data may be loaded by another system (e.g., a database) and served through a conventional HTTP server (e.g., Nginx/Apache/Nodejs).
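As an illustration only, the HTTP-based room list query described above might look like the following minimal sketch. The `/rooms` path, the JSON layout, and the room fields are assumptions made for the example; the description does not specify them.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory room table; field names are assumed for illustration.
ROOMS = [
    {"room_id": 1001, "name": "Lobby", "listeners": 12},
    {"room_id": 1002, "name": "Music", "listeners": 3},
]


def room_list_body(rooms):
    """Serialize the room channel list as a UTF-8 JSON document."""
    return json.dumps({"rooms": rooms}, sort_keys=True).encode("utf-8")


class RoomListHandler(BaseHTTPRequestHandler):
    """Answers GET /rooms with the room channel list (assumed path)."""

    def do_GET(self):
        if self.path != "/rooms":
            self.send_error(404)
            return
        body = room_list_body(ROOMS)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the embedded HTTP service until interrupted."""
    HTTPServer(("127.0.0.1", port), RoomListHandler).serve_forever()
```

An IVRS-side WEB server could then fetch the list with any ordinary HTTP client, without implementing the customized TCP signaling protocol.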

Fig. 3 is a block diagram of a mobile internet voice platform system fusing a public switched telephone network PSTN according to a preferred embodiment of the present invention. As shown in fig. 3, the mobile internet voice platform system includes a mobile internet voice platform and at least one PSTN access end system connected to the mobile internet voice platform, wherein each PSTN access end system is deployed in a respective machine room.

In fig. 3, in the mobile internet voice platform, the CMS is connected to the ES and the CS, respectively. In the PSTN access end system, the CTIS is respectively connected with the CES and the IVRS, and the IVRS is connected with the WEB server. The WEB server communicates with the ES through an HTTP protocol, a TCP channel for transmitting signaling is established between the CMS and the CES, and a full-duplex UDP channel for transmitting voice is established between the CES and the CS.

In the preferred implementation, a user of the PSTN system accesses the platform under the cooperative control of the IVRS/CTIS, is authenticated and authorized by the ES through an interface of the WEB server, and then initiates a request to the CMS through the CES. The CMS creates a virtual user, requests the CS to allocate information such as an address and port, and replies to the CES. Thereafter, with the CTIS providing communication access and the IVRS managing the data logic service, the PSTN telephone user is simulated at the CES as a virtual user and communicates with other users in the chat room through the CS.

Fig. 4 is a flowchart of an application method of the mobile internet voice platform system fusing the PSTN according to an embodiment of the present invention. As shown in fig. 4, the application method of the PSTN-integrated mobile internet voice platform system includes:

Step S401: the CES receives, from a computer telecommunication integration server CTIS, a request packet for listening in on a room channel;

Step S403: the CES initiates, to a channel management server CMS, a request packet in which the PSTN user applies to join the room channel;

Step S405: when the CMS determines that the room channel can be joined, the CMS creates a virtual user and requests to join a channel server CS; after address port information is obtained from the CS, the CES sends the address port information to a libCES module embedded in the CTIS;

Step S407: the libCES module and the CES each start a service process, so that the CES can send audio information to the CTIS.

In the method shown in fig. 4, a PSTN user, accessing through the CTIS, initiates a request to the CMS via the CES. The CMS creates a virtual user, acquires address port information from the CS, and replies with the address port information to the CES. With this communication method, when the Internet is unavailable but the user has a strong demand for using a mobile Internet real-time voice communication product, real-time voice communication (for example, the user listening in on a room channel) can still be achieved, and a channel is provided for the PSTN network to access the mobile Internet voice platform.
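The request path of steps S401 to S405 can be sketched as a small in-memory simulation. This is not the patented implementation: the class names, the room capacity check, the CS address, and the port numbering are all assumptions chosen to make the flow concrete.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ChannelServer:
    """CS: allocates an address/port pair for each virtual user."""
    host: str = "10.0.0.5"  # hypothetical CS address
    _ports: count = field(default_factory=lambda: count(40000, 2))

    def allocate(self, virtual_id):
        return (self.host, next(self._ports))


@dataclass
class ChannelManager:
    """CMS: decides whether the room can be joined and creates virtual users."""
    cs: ChannelServer
    capacity: int = 2  # assumed admission rule; the source only says "room state"
    members: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def join(self, room_id, pstn_user):
        room = self.members.setdefault(room_id, {})
        if len(room) >= self.capacity:
            return None                      # room cannot be joined
        virtual_id = next(self._ids)         # create the virtual user
        room[pstn_user] = virtual_id
        return self.cs.allocate(virtual_id)  # address port info from the CS


@dataclass
class ChannelAccess:
    """CES: receives the listen request and relays the CMS answer to libCES."""
    cms: ChannelManager

    def on_listen_request(self, room_id, pstn_user):
        addr = self.cms.join(room_id, pstn_user)  # S403 / S405
        if addr is None:
            return {"ok": False}
        return {"ok": True, "rtp_addr": addr}     # forwarded to libCES
```

In this sketch the CES never talks to the CS during signaling; it only relays what the CMS obtained, which mirrors the division of roles described above.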

Preferably, after the CES receives the request packet for listening in on a room channel from the CTIS, the method may further include: the CES parses the request packet and, when it determines that the PSTN user has already joined the room channel, clears the PSTN user's expired data in the room channel.

Preferably, after the CMS acquires address port information from the CS and replies with it to the CES, the method may further include: the CES provides the audio channel port information allocated to the PSTN user and replies with the audio channel port information together with the address port information to the libCES module embedded in the CTIS.

Preferably, the step in which the libCES module and the CES each start a service process so that the CES can send audio information to the CTIS may further include: the libCES module periodically sends network address translation (NAT) traversal packets to the listening port of the CES; the CES periodically sends NAT traversal packets to the RTP port allocated by the CS; and the CES receives downlink audio data from the CS, processes it, converts it into voice information in a preset voice format, and sends it to the CTIS.
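The role of the periodic NAT traversal packets can be illustrated with a short sketch: the receiving side remembers the source address of the most recent traversal packet and uses it as the destination for audio. The packet payload, the staleness timeout, and all names below are assumptions for the example, not values taken from the description.

```python
import socket
import time

HOLE_PACKET = b"HOLE"  # hypothetical traversal payload
STALE_AFTER = 10.0     # hypothetical timeout in seconds


class ReturnPath:
    """Tracks the address learned from the peer's NAT traversal packets."""

    def __init__(self):
        self.addr = None
        self.last_seen = None

    def on_datagram(self, data, addr, now):
        """Record the sender if the datagram is a traversal packet."""
        if data == HOLE_PACKET:
            self.addr, self.last_seen = addr, now
            return True
        return False

    def target(self, now):
        """Return where to send audio, or None if no fresh mapping exists."""
        if self.addr is None or now - self.last_seen > STALE_AFTER:
            return None
        return self.addr


def demo_loopback():
    """One round trip on 127.0.0.1: traversal packet in, audio frame out."""
    ces = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    lib = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        ces.bind(("127.0.0.1", 0))
        ces.settimeout(1.0)
        lib.settimeout(1.0)
        lib.sendto(HOLE_PACKET, ces.getsockname())
        data, addr = ces.recvfrom(2048)
        path = ReturnPath()
        path.on_datagram(data, addr, time.monotonic())
        ces.sendto(b"\xd5" * 160, path.target(time.monotonic()))
        audio, _ = lib.recvfrom(2048)
        return len(audio)
    finally:
        ces.close()
        lib.close()
```

Sending the traversal packet on a timer keeps the NAT mapping alive, which is presumably why the description has both libCES and the CES repeat it periodically.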

Preferably, when background music data exists in the chat room channel, the step in which the libCES module and the CES each start a service process so that the CES can send audio information to the CTIS may further include: the libCES module periodically sends NAT traversal packets to the listening port of the CES; the CES periodically sends NAT traversal packets to the RTP port allocated by the CS; the CES receives downlink audio data from the CS; the CES establishes a connection with an RTMP server and receives background music data from the RTMP server; the CES determines whether the sampling rate of the background music data is the same as that of the downlink audio data and, if not, resamples the background music data; and after the background music data and the downlink audio data, now at the same sampling rate, are mixed, the mixed data is converted into the preset voice format, transmitted directly to the libCES module embedded in the CTIS, and synchronized to the PSTN trunk line.
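The resample, mix, and convert sequence above can be sketched on 16-bit PCM sample lists. The sketch makes several assumptions: linear interpolation stands in for whatever resampler the system actually uses, mixing is a plain sum with clipping, and the preset voice format is taken to be G.711 A-law, as named in step S531. The A-law encoder follows the common segment-table formulation.

```python
from itertools import zip_longest


def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation (a simple stand-in resampler)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate
        idx = int(pos)
        frac = pos - idx
        nxt = samples[min(idx + 1, len(samples) - 1)]
        out.append(round(samples[idx] * (1 - frac) + nxt * frac))
    return out


def mix(a, b):
    """Sum two streams sample by sample and clip to the int16 range."""
    return [max(-32768, min(32767, x + y))
            for x, y in zip_longest(a, b, fillvalue=0)]


# Upper bounds of the eight A-law segments for a 13-bit magnitude.
_SEG_END = (0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF)


def linear_to_alaw(pcm):
    """Encode one 16-bit PCM sample as a G.711 A-law byte."""
    pcm >>= 3                      # A-law works on 13-bit samples
    if pcm >= 0:
        mask = 0xD5
    else:
        mask = 0x55
        pcm = -pcm - 1
    seg = next(i for i, end in enumerate(_SEG_END) if pcm <= end)
    shift = 1 if seg < 2 else seg
    return ((seg << 4) | ((pcm >> shift) & 0x0F)) ^ mask


def downlink_frame(voice, music, voice_rate, music_rate):
    """Resample music if needed, mix with voice, and encode as A-law."""
    if music_rate != voice_rate:
        music = resample_linear(music, music_rate, voice_rate)
    return bytes(linear_to_alaw(s) for s in mix(voice, music))
```

Resampling the music rather than the voice keeps the voice stream at the rate the trunk side expects, so only one stream is ever altered before mixing.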

The preferred implementation described above is further described below in conjunction with fig. 5.

Fig. 5 is a timing diagram of a process for requesting to listen to a chat room in accordance with a preferred embodiment of the present invention. As shown in fig. 5, the process of requesting to listen to a chat room includes:

Step S501: when a PSTN user requests to listen in on a chat room, a request packet STRU_CLIENT_CES_START_AUDIT_ROOM_RQ for listening in on the room channel is sent to the CES through the libCES of the CTIS;

Step S503: the CES parses the request packet and determines whether the PSTN user has already joined the room channel;

Step S505: when it is determined that the PSTN user has already joined the room channel, the CES initiates disaster recovery protection processing to the CMS (for example, kicking the existing listening user out of the channel);

Step S507: the CMS clears the PSTN user's expired data in the room channel;

Step S509: after initiating the disaster recovery protection processing, the CES goes on to send the CMS a data packet requesting to join the room, namely STRU_CES_CMS_JOIN_RQ;

Step S511: after parsing the data packet, the CMS determines from indicators such as the room state whether the next operation can proceed; if so, the room channel can be joined and a virtual user is created;

Step S513: the CMS initiates a PCS request to the CS;

Step S515: after receiving the join request, the CS allocates an RTP port;

Step S517: the CS returns the RTP port to the CMS;

Step S519: the CMS returns a data packet STRU_CES_CMS_JOIN_RS to the CES;

Step S521: the CES provides a clientUDP port assigned to the PSTN user;

Step S523: the RTP port and clientUDP port data are replied to the CTIS together;

Step S525: the libCES periodically sends hole-punching packets (i.e., NAT traversal packets) to the listening port of the CES and receives audio data packets on the socket used for hole punching; the CES periodically sends hole-punching packets to the RTP port allocated by the CS, sends heartbeat packets to the CS, and receives voice packets from the CS;

Step S527: the CES retrieves the background sound of the room, acquires the information of the specified background sound, starts the RTMP manager, and subscribes to the music data stream;

Step S529: the RTMP manager is started: it connects to the RTMP server, subscribes to the real-time streaming media data according to the channel information, decodes it, and mixes it with the voice data decoded from the CS voice packets;

Step S531: the mixed audio data is converted into G.711 A-law format and sent to the libCES at the address learned from the libCES hole-punching packets.

Preferably, after the libCES module and the CES each start a service process so that the CES can send audio information to the CTIS, the method may further include: the CES receives an end-listening request from the PSTN user through the libCES module embedded in the CTIS; the CES determines whether the PSTN user has joined the room channel; if the PSTN user has joined the room channel, the CES verifies the PSTN user according to the user identifier of the PSTN user; and if the verification passes, the CES sends an exit notification to the CMS, so that the CMS notifies the CS to kick out the PSTN user.

The above preferred embodiment is further described below in conjunction with fig. 6.

Fig. 6 is a timing diagram of a process for requesting to end listening in on a chat room in accordance with a preferred embodiment of the present invention. As shown in fig. 6, the process of requesting to end listening in on the chat room includes:

Step S601: a PSTN user sends a request to end listening to the CES through the libCES of the CTIS;

Step S603: the CES determines whether the PSTN user has joined the room channel; if so, step S605 is performed.

Step S605: the CES verifies the PSTN user according to the user identifier of the PSTN user; if the verification passes, step S607 is performed.

Step S607: all expired data resources of the PSTN user in the room channel are cleared.

Step S609: the CES sends an exit notification STRU_CES_CMS_EXIT_ID to the CMS.

Step S611: after the CMS analysis is completed, the number of the channels is cleared, namely the channel is kicked out of the bystander users, and the traffic information CS is kicked out of the users.

Preferably, after the CMS notifies the CS to kick out the PSTN user, the method may further include: the CES receives a kick-out user notification from the CMS; the CES again determines whether the PSTN user has joined the room channel; if the PSTN user has joined the room channel, the CES verifies the PSTN user again according to the user identifier of the PSTN user; and the CES clears all data of the PSTN user in the room channel and sends a user kicked-out notification to the libCES module embedded in the CTIS.

The above preferred embodiment is further described below in conjunction with fig. 7.

Fig. 7 is a sequence diagram of a process of a CMS kicking out a user according to a preferred embodiment of the present invention. As shown in fig. 7, the CMS kicking user flow includes:

Step S701: after completing the kick-out operation for the user, the CMS sends a notification STRU_CES_CMS_KICKOUT_ID to the CES.

Step S703: the CES again determines whether the PSTN user has joined the room channel; if so, step S705 is performed.

Step S705: the CES verifies the PSTN user according to the user identifier of the PSTN user; if the verification passes, step S707 is performed.

Step S707: all expired data resources of the PSTN user in the room channel are cleared.

Step S709: the CES replies STRU_CES_CLIENT_BE_STOP_AUDIT_ROOM_ID to the libCES, at which point the exit operation requested by the user is complete.

Preferably, the communication method may further include: when the PSTN user needs to speak in the room channel, the libCES module controls the CTIS to cancel the PSTN user's subscription to the downlink audio data of the room channel; the libCES module, through a message, controls the CES to establish a TCP communication channel to the ES and to apply to the ES for a temporary user identifier identifying the PSTN user; the CES embeds service signaling data containing the temporary user identifier into a dedicated CES-ES communication packet and sends it to the ES, so that the ES, after recognizing the dedicated communication packet, determines the back-end server to which it is to be forwarded; the ES authenticates the temporary user identifier in the received dedicated communication packet and, if the authentication passes, sends a request packet for joining the room channel to the CMS; the CMS creates a virtual user and requests to join the channel server CS and, after acquiring address port information from the CS, embeds a join-room-channel response packet containing the address port information into the dedicated CES-ES communication packet and forwards it to the CES via the ES; and the CES sends a response message indicating that the user has joined the room channel to the libCES module embedded in the CTIS.

Preferably, after the CES sends the response message indicating that the user has joined the room channel to the libCES module embedded in the CTIS, the method may further include: the CES periodically sends NAT traversal packets to the RTP port allocated by the CS; and the CES receives the uplink audio data that is recorded by the relay switching equipment operated by the CTIS and sent by the libCES module, processes the uplink audio data, and sends the processed uplink audio data to the CS.

Preferably, after the CES sends the response message indicating that the user has joined the room channel to the libCES module embedded in the CTIS, the method may further include: the CES periodically sends NAT traversal packets to the RTP port allocated by the CS; the CES receives the uplink audio data that is recorded by the relay switching equipment operated by the CTIS and sent by the libCES module; the CES establishes a connection with an RTMP server and receives background music data from the RTMP server; the CES determines whether the sampling rate of the background music data is the same as that of the uplink audio data and, if not, resamples the background music data; and the background music data and the uplink audio data, now at the same sampling rate, are mixed and then sent to the CS.

In a preferred implementation, when the PSTN user wants to speak in the room channel, the CTIS immediately establishes an exclusive full-duplex communication channel for the user via the libCES. The libCES first controls the CTIS to cancel the subscription of the user's CTI channel to the output audio source of the original room. The libCES then, through a message, controls the CES to establish a TCP communication channel to the ES and to apply for a temporary userID identifying the PSTN user. From then on, all signaling messages that the CES handles for this user are exchanged through the ES proxy and no longer depend on the direct TCP channel between the CES and the CMS.

Communication between the CES and the CMS is defined in terms of service signaling packets (such as join and leave). The serialized memory block of a signaling packet is embedded, as a small inner packet, into a special large packet used for CES-ES communication. This large packet is a dedicated container packet with a specified packet type number; once the ES recognizes it, the ES can immediately determine, according to the agreed specification, which back-end server the carried data should be forwarded to. The ES then forwards the data to the CMS, which implements the service logic. Conversely, when the CMS needs to send a packet back to the CES, it sends a container packet carrying the real payload to the ES, and the ES routes it to the CES. This proxy mechanism follows the communication model in which the mobile client APP interacts with the ES server (all signaling of the mobile client APP passes through the TCP long connection established between the APP and the ES, and only real-time voice data is exchanged with the CS over UDP). In full-duplex mode, the CES is simulated as a real APP client user and completes the various service interactions on behalf of the PSTN user accessing through the CTIS.

By means of the ES proxy, the CES completes with the CMS the signaling interaction for joining the channel to speak, and then, by interacting with the CS, receives the downlink voice and transcodes and sends the uplink voice, thereby achieving real-time two-way communication over the exclusive full-duplex channel.
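The container-packet idea can be sketched with a fixed binary header. The header layout, the packet type number, and the backend identifier below are assumptions made for the example; the description only states that the container has a specified packet type number from which the ES can decide where to forward the payload.

```python
import struct

CONTAINER_TYPE = 0x7001  # hypothetical packet type number for container packets
BACKEND_CMS = 1          # hypothetical backend identifier for the CMS

# Assumed header: packet type (2 bytes), backend id (2 bytes), payload length (4 bytes).
HEADER = struct.Struct("!HHI")


def wrap(backend, inner):
    """CES side: embed a serialized signaling packet in a container packet."""
    return HEADER.pack(CONTAINER_TYPE, backend, len(inner)) + inner


def es_route(packet, backends):
    """ES side: recognize the container and forward its payload unchanged."""
    ptype, backend, length = HEADER.unpack_from(packet)
    if ptype != CONTAINER_TYPE:
        raise ValueError("not a container packet")
    payload = packet[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    backends[backend](payload)  # the ES never parses the inner packet
```

Because the ES only reads the outer header, new CES-CMS signaling packets could be added without changing the ES, which appears to be the point of the proxy design.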

The above preferred embodiment is further described below in conjunction with fig. 8.

Fig. 8 is a timing diagram of the flow for requesting to participate in speaking in accordance with a preferred embodiment of the present invention. As shown in fig. 8, the flow for requesting to participate in speaking includes:

Step S801: the libCES controls the CTIS to cancel the subscription of the user's CTI channel to the output audio source of the original room.

Step S803: the libCES module sends a request STRU_CLIENT_CES_USER_JOIN_ROOM_RQ to the CES.

Step S805: the CES sends a request STRU_CES_ES_USER_ID_FETCH_RQ to the ES.

Step S807: the ES assigns a temporary membership ID and creates a temporary session sessionID.

Step S809: the ES returns a response STRU_CES_ES_USER_ID_FETCH_RS to the CES.

Step S811: the CES maintains the userID/sessionID.

Step S813: the CES embeds the service signaling data containing the temporary user identifier into the dedicated CES-ES communication packet and sends it to the ES.

Step S815: the ES authenticates the temporary user identifier in the received dedicated communication packet.

Step S817: if the authentication passes, the ES sends a request packet for joining the room channel to the CMS.

Step S819: the CMS creates a virtual user and requests the CS to allocate port resources.

Step S821: the CMS embeds the join-room-channel response packet containing the address port information into the dedicated CES-ES communication packet and sends it to the ES.

Step S823: the ES looks up the downlink TCP channel to the CES.

Step S825: the ES forwards the dedicated communication packet from the CMS to the CES.

Step S827: the CES sends a response message indicating that the user has joined the room channel to the libCES module embedded in the CTIS.

Step S829: the CES starts sending hole-punching packets (i.e., NAT traversal packets) to the user audio data receiving port allocated by the CS, sends heartbeat packets to the CS, and starts receiving voice packets from the CS. The CES also receives the user's uplink voice packets, which are recorded by the relay switching equipment operated by the CTIS and sent through the libCES, transcodes them, and sends them to the CS.

Step S831: if a background sound is set in the room, the CES acquires the live music RTMP_uri, starts the RTMP manager, and subscribes to the music data stream.

Step S833: the RTMP manager is started: 1. it connects to the RTMP server and completes the handshake; 2. it subscribes to the real-time music streaming media data according to the channel information; 3. it decodes the music streaming media data and mixes it with the voice data.

Step S835: the mixed data is converted into G.711 A-law format and sent to the libCES at the address learned from the libCES hole-punching packets.
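The temporary identity handling in steps S805 to S815 can be sketched as follows. The ID range, the random session token, and the method names are assumptions for illustration; the description only says that the ES assigns a temporary ID, creates a session, and later authenticates the identifier carried in the dedicated communication packet.

```python
import secrets
from itertools import count


class EntryServer:
    """ES-side sketch: issue temporary user IDs and check them on later packets."""

    def __init__(self):
        self._next_id = count(900000)  # hypothetical range for temporary IDs
        self._sessions = {}            # user_id -> session_id

    def fetch_user_id(self):
        """S807: assign a temporary user ID and create a session for it."""
        user_id = next(self._next_id)
        session_id = secrets.token_hex(8)
        self._sessions[user_id] = session_id
        return user_id, session_id

    def authenticate(self, user_id, session_id):
        """S815: accept a packet only if its ID and session match a live session."""
        expected = self._sessions.get(user_id)
        return expected is not None and secrets.compare_digest(expected, session_id)

    def release(self, user_id):
        """Drop the temporary identity, e.g. when the PSTN user leaves the room."""
        self._sessions.pop(user_id, None)
```

In use, the CES would keep the returned pair (step S811) and attach it to every container packet it sends through the ES for that user.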

To sum up, in the above embodiments of the present invention, a Public Switched Telephone Network (PSTN) is connected to a mobile Internet voice platform to form a mobile Internet voice platform system fused with the PSTN. With this system, when the Internet is unavailable but the user has a strong demand for using the product, a channel is provided for accessing the platform through the PSTN network, so that real-time voice communication can be achieved (for example, a PSTN user listens in on a room channel of the mobile Internet voice platform, or a PSTN user speaks in a room channel of the mobile Internet voice platform). Once implemented, the system effectively offers end users more ways to access the product. The technology can also be used to connect with voice communities on the market that are built purely on the PSTN telephone network, enriching the content of those communities, building a closed system platform that connects traditional PSTN voice communities with mobile Internet users, enlarging the product's user base, and enriching the channels through which people communicate.

The above disclosure is only for a few specific embodiments of the present invention, but the present invention is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present invention.

Claims

1. A mobile internet voice platform system for fusing PSTN, comprising:

the computer telecommunication integration server CTIS is embedded with a libCES module and is used for sending a request packet of a PSTN user for listening in on a room channel to a channel access server CES, receiving address port information and starting a service process to receive audio information;

the CES is used for receiving a request packet of the room channel audited from the CTIS, initiating the request packet of the PSTN user applying for joining the room channel to a channel management server CMS, sending address port information obtained by the CMS to a libCES module embedded in the CTIS, starting a service process and sending audio information to the CTIS, wherein telephone users of the PSTN access through the CTIS, and the CES can virtualize a certain number of social platform audio clients and is responsible for communicating with the CMS to enable virtual users to access a channel server CS;

the CMS is used for creating a virtual user and requesting to join the CS when determining that the channel can be joined into the room, acquiring address port information from the CS and sending the address port information to the libCES module;

and the CS is used for distributing the address port information and issuing the audio information.

2. The mobile Internet voice platform system of claim 1,

the system further comprises: an access server ES, configured to allocate a temporary user identifier of the PSTN user, determine a backend server to be forwarded after receiving and identifying the CES and the ES dedicated communication packet, authenticate the temporary user identifier in the received dedicated communication packet, and send a request packet for adding a room channel to the CMS when the authentication is passed;

the libCES module embedded in the CTIS is also used for controlling the CTIS to cancel subscription of the PSTN user to downlink audio data of the room channel when the PSTN user needs to participate in speaking in the room channel, controlling the CES to establish a TCP communication channel to the ES through a message, and applying for a temporary user identifier for identifying the PSTN user to the ES;

the CES is also used for embedding the service signaling data containing the temporary user identification into the CES and ES special communication packet and sending the embedded service signaling data to the ES, and sending a response message of a user added to a room channel to a libCES module embedded in the CTIS;

the CMS is further configured to create a virtual user and request to join a channel server CS, and after acquiring address port information from the CS, embed a response packet including the address port information for joining a room channel in the CES and the ES dedicated communication packet and forward the response packet to the CES via the ES.

3. An application method of a mobile internet voice platform system fused with a public switched telephone network PSTN is characterized by comprising the following steps:

a channel access server CES receives, from a computer telecommunication integration server CTIS, a request packet for listening in on a room channel;

initiating a request packet of a PSTN user for applying to join a room channel to a channel management server CMS;

when the CMS determines that the channel can be added to the room, a virtual user is created and requests to be added to a Channel Server (CS), and after address port information is obtained from the CS, the CES sends the address port information to a libCES module embedded in the CTIS;

the libCES module and the CES respectively start a service process, so that the CES sends audio information to the CTIS;

the telephone users of the PSTN are accessed through the CTIS, and the CES can virtualize a certain number of social platform audio clients and is responsible for communicating with the CMS to access the virtual users to the channel server CS.

4. The method of claim 3, wherein after receiving the request packet for the room channel by-listening from the CTIS, the CES further comprises:

and the CES analyzes the request packet of the room channel, and when the PSTN user is determined to be added into the room channel, the CMS clears the data of the PSTN user which is expired in the room channel.

5. The method as claimed in claim 3, wherein after the CMS obtaining address port information from the CS and replying the address port information to the CES, the method further comprises:

and the CES provides audio channel port information distributed to the PSTN users, and the audio channel port information and the address port information are replied to a libCES module embedded in the CTIS together.

6. The application method of claim 3, wherein the libCES module and the CES respectively start a service process, and wherein the enabling of the CES to send audio information to the CTIS comprises:

the libCES module periodically and circularly sends a network address translation NAT traversal packet to a monitoring port of the CES;

CES sends NAT traversing packet to RTP port distributed by CS in timing cycle;

and the CES receives downlink audio data from the CS, processes the downlink audio data, converts the downlink audio data into voice information in a preset voice format and sends the voice information to the CTIS.

7. The application method of claim 3, wherein the libCES module and the CES respectively start a service process, and wherein the enabling of the CES to send audio information to the CTIS comprises:

the libCES module sends NAT traversing packets to a monitoring port of the CES in a timing cycle mode;

CES sends NAT traversing packet to RTP port distributed by CS in timing cycle;

the CES receives downstream audio data from the CS;

the CES establishes connection with the RTMP server and receives background music data from the RTMP server;

the CES determines whether the sampling rate of the background music data is consistent with the sampling rate of the downlink audio data, and if not, performs resampling on the background music data;

and after the background music data with the consistent sampling rate and the downlink audio data are mixed, converting the mixed data into a preset voice format, directly transmitting the preset voice format to a libCES module embedded in the CTIS, and synchronizing the preset voice format to a PSTN trunk line.

8. The application method of claim 3, after the libCES module and the CES respectively start a service process for enabling the CES to send audio information to the CTIS, further comprising:

the CES receives an end overhearing request from the PSTN user through a libCES module embedded in the CTIS;

the CES judges whether the PSTN user already joins the room channel;

under the condition that the PSTN user joins the room channel, verifying the PSTN user according to the user identification of the PSTN user;

in case the verification passes, the CES sends an exit notification to the CMS, so that the CMS notifies the CS to kick out the PSTN user.

9. The method of applying as claimed in claim 8, wherein after the CMS notifying the CS of kicking out the PSTN user, further comprising:

the CES receiving a kick-out user notification from the CMS;

the CES judges whether the PSTN user joins the room channel again;

under the condition that the PSTN user joins the room channel, verifying the PSTN user again according to the user identification of the PSTN user;

and the CES clears all data of the PSTN user in the room channel and sends a user kicked-out notice to a libCES module embedded in the CTIS.

10. The application method according to any one of claims 3 to 9, further comprising:

when the PSTN user needs to participate in speaking in the room channel, the libCES module controls the CTIS to cancel subscription of the PSTN user to downlink audio data of the room channel;

the libCES module controls the CES to establish a TCP communication channel to the ES through a message, and applies for a temporary user identifier for identifying the PSTN user to the ES;

the CES embeds service signaling data containing the temporary user identifier into the CES and the ES special communication packet and sends the CES and the ES special communication packet to the ES, so that the ES identifies the special communication packet and then determines a back-end server to be forwarded;

the ES authenticates the temporary user identifier in the received special communication packet, and sends a request packet for adding a room channel to the CMS under the condition that the authentication is passed;

after the CMS creates a virtual user and requests to join a Channel Server (CS), and address port information is acquired from the CS, embedding a response packet of joining a room channel containing the address port information into the CES and the ES special communication packet and forwarding the response packet to the CES through the ES;

and the CES sends a response message of adding a user into a room channel to a libCES module embedded in the CTIS.

11. The application method of claim 10, after the CES sends a response message for a user joining a room channel to a libCES module embedded in the CTIS, the method further comprising:

CES sends NAT traversing packet to RTP port distributed by CS in timing cycle;

and the CES receives the uplink audio data which is recorded by the CTIS operating relay switching equipment and sent by the libCES module, processes the uplink audio data, and sends the processed uplink audio data to the CS.

12. The application method of claim 10, after the CES sends a response message for a user joining a room channel to a libCES module embedded in the CTIS, the method further comprising:

CES sends NAT traversing packet to RTP port distributed by CS in timing cycle;

the CES receives uplink audio data which are recorded by the CTIS operation relay exchange equipment and sent by the libCES module;

the CES determines whether the sampling rate of the background music data is consistent with the sampling rate of the uplink audio data, and if not, performs resampling on the background music data;

and after the background music data with the consistent sampling rate and the uplink audio data are mixed, sending the mixed data to the CS.