CN117201468A

CN117201468A - VOIP self-adaptive voice coding system

Info

Publication number: CN117201468A
Application number: CN202311329523.6A
Authority: CN
Inventors: 周铁华; 邱金峰
Original assignee: Hangzhou Qianwan Technology Co ltd
Current assignee: Hangzhou Qianwan Technology Co ltd
Priority date: 2023-10-13
Filing date: 2023-10-13
Publication date: 2023-12-08

Abstract

The application relates to the technical field of speech coding and discloses a VOIP self-adaptive voice coding system comprising a SIP server side, a client, and an opposite-end client. The SIP server side receives a registration request from the client, detects the available network bandwidth of the client, orders the voice coding sequence according to the detected bandwidth, and sends the voice coding sequence to the opposite-end client. The client selects a coding format that it supports from the received voice coding sequence and uses it to encode the voice carried in the Real-time Transport Protocol (RTP) stream. The SIP server side compares the voice coding formats of the two parties to the call, the client and the opposite-end client: if the formats are the same, it controls the media relay server to pass the RTP stream through transparently; if they differ, the SIP server side transcodes the RTP stream into the coding format of the opposite-end client and then forwards it to the opposite end.

Description

VOIP self-adaptive voice coding system

Technical Field

The application relates to the technical field of speech coding, and in particular to a VOIP self-adaptive voice coding system.

Background

VoIP is a technology for voice communication over an IP network: voice signals are digitally encoded, compressed, and framed, then converted into IP packets for transmission over the IP network. The greatest advantage of VoIP is that it can make wide use of the Internet and the global IP interconnection environment to provide voice, fax, video, and data services at very low cost.

Session Initiation Protocol (SIP) is an application-layer control protocol for multimedia communications over an IP network. SIP is used to set up, change and terminate calls between users of IP-based networks, and in order to provide telephony services, SIP also requires the incorporation of different standards and protocols, particularly the real-time transport protocol (RTP).

Existing VOIP phone systems cannot adjust dynamically to the different bandwidth requirements of different encoding formats; that is, they cannot select an appropriate encoding format according to the condition of the SIP client, and therefore cannot provide a high-quality voice call service to the user.

Disclosure of Invention

The present application is directed to a VOIP adaptive speech coding system that provides high quality voice call services.

To achieve the above object, the basic scheme of the present application is as follows:

a VOIP adaptive speech coding system, comprising: the SIP server side, the client side and the opposite-end client side;

the SIP server side is used for receiving a registration request from the client, detecting the available network bandwidth of the client, ordering the voice coding sequence according to the detected available network bandwidth of the client, and sending the voice coding sequence to the opposite-end client;
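As a minimal sketch of the ordering step, the Python below ranks a small codec table against a measured bandwidth. The codec table, the name `order_codecs`, and the ranking rule (formats whose nominal bitrate fits within the bandwidth come first, highest bitrate first; the remaining formats follow, lowest bitrate first) are illustrative assumptions, since the application does not state the rule. Packet header overhead is ignored.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Codec:
    name: str
    bitrate_kbps: float  # nominal payload bitrate of the coding format


# Hypothetical codec table for illustration only.
CODECS = (
    Codec("PCMU", 64.0),  # G.711 mu-law
    Codec("iLBC", 15.2),  # 20 ms frame mode
    Codec("G729", 8.0),
)


def order_codecs(available_kbps: float, codecs: Sequence[Codec] = CODECS) -> List[str]:
    """Return codec names ordered by preference for the given bandwidth."""
    fits = [c for c in codecs if c.bitrate_kbps <= available_kbps]
    too_big = [c for c in codecs if c.bitrate_kbps > available_kbps]
    # Assumption: among formats that fit, a higher bitrate is preferred.
    fits.sort(key=lambda c: c.bitrate_kbps, reverse=True)
    # Assumption: formats that do not fit are listed last, cheapest first.
    too_big.sort(key=lambda c: c.bitrate_kbps)
    return [c.name for c in fits + too_big]
```

With this rule, a client measured at 20 kbit/s would be offered iLBC before G.729, with G.711 listed last.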

the client is used for selecting, from the received voice coding sequence, a coding format that it supports, and for voice coding the Real-time Transport Protocol (RTP) stream in that format;
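The client-side selection can be sketched as taking the first entry of the received sequence that the client itself supports. The function name `select_codec` and the case-insensitive name comparison are assumptions made for this illustration.

```python
from typing import Iterable, Optional, Sequence


def select_codec(server_order: Sequence[str], supported: Iterable[str]) -> Optional[str]:
    """Pick the first format in the server's order that the client supports."""
    supported_set = {name.upper() for name in supported}
    for name in server_order:
        if name.upper() in supported_set:
            return name
    return None  # no common format; the source does not say what happens then
```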

the SIP server side compares the voice coding formats of the two parties to the call, namely the client and the opposite-end client; if the coding formats of the two parties are the same, the media relay server is controlled to pass the RTP stream through transparently; if the coding formats differ, the SIP server side transcodes the RTP stream into the coding format of the opposite-end client and then transmits it to the opposite end.
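The comparison step reduces to a two-way decision, sketched below. `RelayMode`, `RelayPlan`, and `plan_relay` are hypothetical names; the sketch only decides what to do with the RTP stream and does not perform any transcoding.

```python
from dataclasses import dataclass
from enum import Enum


class RelayMode(Enum):
    PASSTHROUGH = "passthrough"  # media relay forwards RTP unchanged
    TRANSCODE = "transcode"      # server converts between the two formats


@dataclass(frozen=True)
class RelayPlan:
    mode: RelayMode
    to_peer_codec: str    # format delivered to the opposite-end client
    to_client_codec: str  # format delivered to the client


def plan_relay(client_codec: str, peer_codec: str) -> RelayPlan:
    """Decide between transparent transmission and transcoding."""
    if client_codec.upper() == peer_codec.upper():
        mode = RelayMode.PASSTHROUGH
    else:
        mode = RelayMode.TRANSCODE
    # Each direction is delivered in the receiving party's own format.
    return RelayPlan(mode, to_peer_codec=peer_codec, to_client_codec=client_codec)
```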

Further, the SIP server side is configured with a session information interaction module, a bandwidth monitoring module and a data packet control module;

during a call, the bandwidth monitoring module monitors network fluctuation and sends the bandwidth condition to the SIP server side at regular intervals; the SIP server side reorders the voice coding sequence according to the bandwidth condition of the client and sends the modified data packet to the opposite-end client, whose coding format is readjusted according to the new coding sequence.
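One way to picture the mid-call adjustment is a small tracker that recomputes the order on every bandwidth report and signals an update only when the order actually changes. The class `CodecOrderTracker`, the callback shape, and the 64 kbit/s threshold in `demo_order` are assumptions for illustration; the application does not describe how the reports or updates are carried.

```python
from typing import Callable, List, Optional


class CodecOrderTracker:
    """Tracks the current coding order and reports changes."""

    def __init__(self, order_fn: Callable[[float], List[str]]) -> None:
        self._order_fn = order_fn
        self._current: Optional[List[str]] = None

    def on_bandwidth_report(self, kbps: float) -> Optional[List[str]]:
        """Return the new order if it changed, otherwise None."""
        new_order = self._order_fn(kbps)
        if new_order == self._current:
            return None  # nothing to send to the opposite-end client
        self._current = new_order
        return new_order


def demo_order(kbps: float) -> List[str]:
    # Hypothetical two-codec rule: prefer G.711 only when 64 kbit/s is available.
    return ["PCMU", "G729"] if kbps >= 64 else ["G729", "PCMU"]
```

Suppressing unchanged orders keeps small bandwidth fluctuations from triggering a renegotiation on every periodic report.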

Further, the SIP server side is configured with a processing module; when the voice coding formats of the two parties to the call, the client and the opposite-end client, are different, the processing module screens for abnormal features, and when the abnormal feature is an abnormal bandwidth condition of the client and/or of the opposite-end client, the processing module assigns priorities to the available bandwidths of the client and the opposite-end client according to a pre-configured priority policy.

Further, the SIP server side is configured with a first server and a second server; the first server transmits data unidirectionally to the second server, the first server is connected to the client, and the second server is connected to the opposite-end client;

the first server is configured with an encryption unit, the encryption unit is configured with an encryption strategy, and the encryption unit encrypts the voice code according to the preset encryption strategy and forms a plurality of encrypted voice data packets to be transmitted to the second server;

the second server is configured with a decryption unit, the decryption unit is configured with a decryption strategy, and the decryption unit decrypts the encrypted data packet according to the preset decryption strategy to form a decrypted voice data packet and transmits the decrypted voice data packet to the opposite-end client.

Further, the encryption unit is configured with an encryption policy and an encryption algorithm; the encryption algorithm is configured with a plurality of dynamic encryption factors, which comprise the available network bandwidth of the client, the voice coding sequence, the voice coding format, and a device characteristic factor of the client; the device characteristic factor of the client is the serial number assigned to the client at the factory.
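The application does not name the encryption algorithm or say how the dynamic factors enter it. One possible reading, sketched here purely as an assumption, is that the factors parameterize a per-packet key derived from a shared secret. The function `derive_key`, the `master_secret` parameter, and the use of HMAC-SHA256 are all choices made for this illustration and are not taken from the source.

```python
import hashlib
import hmac
import json
from typing import Sequence


def derive_key(
    master_secret: bytes,
    bandwidth_kbps: int,
    codec_order: Sequence[str],
    codec: str,
    device_serial: str,
) -> bytes:
    """Derive a 32-byte key from a secret and the four dynamic factors."""
    # Canonical JSON gives an unambiguous byte encoding of the factors.
    factors = json.dumps(
        {
            "bw": bandwidth_kbps,
            "order": list(codec_order),
            "codec": codec,
            "serial": device_serial,
        },
        sort_keys=True,
        separators=(",", ":"),
    ).encode("utf-8")
    return hmac.new(master_secret, factors, hashlib.sha256).digest()
```

In this sketch the confidentiality rests on `master_secret`, because the four factors themselves are observable values; the decryption side would need the same secret and the same factors to reproduce the key.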

Further, the encrypted voice data packet comprises a plurality of encrypted voice data fragments, the encrypted voice data fragments comprise a voice data fragment set, and the voice data fragment set is associated with available bandwidth of a client network;

each encrypted voice data segment is configured with feature index information including the available bandwidth of the client network and the device feature factor of the client, the feature index information being interrelated with the set of voice data segments.

Further, the decryption strategy is configured with a decryption algorithm, and the decryption algorithm decrypts the encrypted voice data segment to form a decrypted voice data segment;

the decryption unit is configured with a voice coding format conversion strategy; according to the bandwidth condition of the client corresponding to the decrypted voice data segment, the strategy reorders the voice coding sequence and sends the modified data packet to the opposite-end client, whose coding format is readjusted accordingly.

Compared with the prior art, the beneficial effect of this scheme is:

1. The voice coding system of this scheme uses the SIP server side to compare the voice coding formats of the two parties to the call, the client and the opposite-end client; if the coding formats of the two parties are the same, the media relay server is controlled to pass the RTP stream through transparently; if the coding formats differ, the SIP server side transcodes the RTP stream into the coding format of the opposite-end client and transmits it to the opposite end, thereby providing a high-quality voice call service to users;

2. During a call, the bandwidth monitoring module monitors network fluctuation and sends the bandwidth condition to the SIP server side at regular intervals; the SIP server side reorders the voice coding sequence according to the bandwidth condition of the client and sends the modified data packet to the opposite-end client, whose coding format is readjusted according to the new coding sequence.

3. The SIP server side is configured with a first server and a second server, and the first server and the second server are respectively configured with an encryption unit and a decryption unit, so that high call security in the call process of the client side and the opposite-end client side is realized.

Drawings

FIG. 1 is a schematic diagram of an embodiment of the present application.

Detailed Description

The application is described in further detail below with reference to the drawings in the specification and by way of specific embodiments:

Embodiment:

a VOIP adaptive speech coding system, as shown in fig. 1, comprising: the SIP server side, the client side and the opposite-end client side;

The SIP server side is configured with a session information interaction module, a bandwidth monitoring module and a data packet control module:

Session information interaction module: processes interaction information from the client.

Bandwidth monitoring module: sends a negotiation request to the client when registration of the client is detected; receives a network available bandwidth detection request from the client, detects the current available bandwidth of the client network using the detection algorithm carried in the negotiation request, and stores the resulting available network bandwidth of the client.

During a call, network monitoring reports the bandwidth condition to the SIP server side at regular intervals as the network fluctuates; the SIP server side reorders the voice coding sequence according to the bandwidth condition of the client and sends the modified data packet to the opposite-end client, whose coding format is readjusted according to the new coding sequence.

Data packet control module: when the client is detected to initiate an invitation, orders the voice coding sequence according to the detected available network bandwidth of the opposite end and sends the voice coding sequence to the opposite-end client through the session information interaction module.

The SIP server side is configured with a first server and a second server, the first server transmits unidirectional data to the second server, the first server is connected with the client side, and the second server is connected with the opposite-end client side;

the encryption unit is configured with an encryption strategy and an encryption algorithm; the encryption algorithm is configured with a plurality of dynamic encryption factors, which comprise the available network bandwidth of the client, the voice coding sequence, the voice coding format, and a device characteristic factor of the client; the device characteristic factor of the client is the serial number assigned to the client at the factory.

The encrypted voice data packet comprises a plurality of encrypted voice data fragments, the encrypted voice data fragments comprise a voice data fragment set, and the voice data fragment set is associated with available bandwidth of a client network;

The second server is configured with a decryption unit, the decryption unit is configured with a decryption strategy, and the decryption unit decrypts the encrypted data packet according to the preset decryption strategy to form a decrypted voice data packet and transmits the decrypted voice data packet to the opposite-end client;

the second server is configured with a voice data calling unit, a voice data packaging unit and a voice data verification unit; the voice data calling unit is configured with a voice data calling strategy, which takes the available network bandwidth of the client as an index template and calls, as the voice data fragment to be decrypted, the encrypted voice data fragment whose characteristic index information matches the index template; the decryption unit decrypts the voice data fragment to be decrypted according to the preset decryption strategy to form a decrypted voice data fragment;
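The calling strategy can be read as a lookup: the client's available bandwidth acts as the key, and fragments whose feature index carries the same value are handed to the decryption unit. The sketch below assumes exact matching on an integer bandwidth value; `FeatureIndex`, `EncryptedFragment`, and `call_fragments` are hypothetical names, and the source does not say whether matching is exact or by range.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FeatureIndex:
    bandwidth_kbps: int  # available network bandwidth of the client
    device_serial: str   # device characteristic factor of the client


@dataclass(frozen=True)
class EncryptedFragment:
    index: FeatureIndex
    ciphertext: bytes


def call_fragments(
    fragments: Iterable[EncryptedFragment], template_bandwidth_kbps: int
) -> List[EncryptedFragment]:
    """Return the fragments whose feature index matches the index template."""
    return [f for f in fragments if f.index.bandwidth_kbps == template_bandwidth_kbps]
```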

the voice data packaging unit is configured with an import module and a packaging module, the import module is configured with a plurality of data templates, each data template corresponds to the network available bandwidth of the client, the import module imports the corresponding voice data template according to the network available bandwidth of the client corresponding to the decrypted data fragment, and the packaging module packages the voice data template imported with the decrypted voice data fragment into a comprehensive data packet;

the voice data verification unit is configured with a voice data verification strategy, and the voice data verification strategy is used for verifying the consistency of the network available bandwidth of the client corresponding to the decrypted voice data segment in the comprehensive data packet and the characteristic index information configured on the encrypted data segment;

when the voice data verification strategy verifies that the feature index information corresponding to the decrypted voice data segment in the comprehensive data packet is consistent with the feature index information configured on the encrypted voice data segment, the corresponding encrypted voice data packet in the first server is automatically deleted.
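The verification and clean-up step might look like the following sketch. The first server's storage is modelled as a plain dictionary, and the names `DecryptedFragment` and `verify_and_purge` are assumptions; how the deletion is actually signalled to the first server is not described in the source.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class FeatureIndex:
    bandwidth_kbps: int
    device_serial: str


@dataclass(frozen=True)
class DecryptedFragment:
    index: FeatureIndex  # index carried into the comprehensive data packet
    payload: bytes


def verify_and_purge(
    packet_id: str,
    decrypted: Sequence[DecryptedFragment],
    expected_indexes: Sequence[FeatureIndex],
    first_server_store: Dict[str, bytes],
) -> bool:
    """Check index consistency; on success delete the stored encrypted packet."""
    consistent = len(decrypted) == len(expected_indexes) and all(
        frag.index == expected for frag, expected in zip(decrypted, expected_indexes)
    )
    if consistent:
        # Hypothetical stand-in for deleting the packet held by the first server.
        first_server_store.pop(packet_id, None)
    return consistent
```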

The decryption strategy is configured with a decryption algorithm, and the decryption algorithm decrypts the encrypted voice data segment to form a decrypted voice data segment;

The SIP server side is configured with a processing module; when the voice coding formats of the two parties to the call, the client and the opposite-end client, are different, the processing module screens for abnormal features, and when the abnormal feature is an abnormal bandwidth condition of the client and/or of the opposite-end client, the processing module assigns priorities to the available bandwidths of the client and the opposite-end client according to a pre-configured priority strategy; when the priority of the client is higher than that of the opposite-end client, the voice coding format of the client is used; when the priority of the opposite-end client is higher than that of the client, the voice coding format of the opposite-end client is used.
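The priority decision can be sketched as follows. The application does not define what counts as an abnormal bandwidth condition or what the priority strategy is, so the threshold `min_kbps`, the example policy `constrained_first` (the party with less bandwidth gets the higher priority), and the tie rule favouring the client are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Party:
    name: str
    codec: str
    bandwidth_kbps: float


def constrained_first(party: Party) -> float:
    # Hypothetical policy: lower available bandwidth means higher priority.
    return -party.bandwidth_kbps


def resolve_codec(
    client: Party,
    peer: Party,
    priority: Callable[[Party], float] = constrained_first,
    min_kbps: float = 32.0,
) -> Optional[str]:
    """Return the format to use, or None when no bandwidth anomaly is found."""
    if client.codec == peer.codec:
        return client.codec
    abnormal = client.bandwidth_kbps < min_kbps or peer.bandwidth_kbps < min_kbps
    if not abnormal:
        return None  # formats differ but no anomaly; priorities are not applied
    # Assumption: on equal priority the client's format is kept.
    return client.codec if priority(client) >= priority(peer) else peer.codec
```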

When a user registers with the server through a client, the SIP server side receives the registration request from the client, detects the available network bandwidth of the client, orders the voice coding sequence according to the detected bandwidth, and sends the voice coding sequence to the client;

the SIP server side compares the voice coding formats of the two parties to the call, namely the client and the opposite-end client; if the coding formats of the two parties are the same, the media relay server is controlled to pass the RTP stream through transparently; if the coding formats differ, the SIP server side transcodes the RTP stream into the coding format of the opposite-end client and then transmits it to the opposite end;

The voice coding system of the application keeps the client and the opposite-end client on a voice coding format that matches the available network bandwidth at all times, thereby better ensuring a high-quality voice call service for users.

The foregoing is merely exemplary embodiments of the present application, and specific structures and features that are well known in the art are not described in detail herein. It should be noted that modifications and improvements can be made by those skilled in the art without departing from the structure of the present application, and these should also be considered as the scope of the present application, which does not affect the effect of the implementation of the present application and the utility of the patent. The protection scope of the present application is subject to the content of the claims, and the description of the specific embodiments and the like in the specification can be used for explaining the content of the claims.

Claims

1. A VOIP adaptive speech coding system, characterized by comprising: a SIP server side, a client, and an opposite-end client;

2. The VOIP adaptive speech coding system according to claim 1, wherein: the SIP server is configured with a session information interaction module, a bandwidth monitoring module and a data packet control module,

3. The VOIP adaptive speech coding system according to claim 1, wherein: the SIP server is configured with a processing module, when the voice coding formats of the two parties of the conversation of the client and the opposite terminal client are different, the processing module screens abnormal characteristics, and when the abnormal characteristics are abnormal in the bandwidth condition of the client and/or abnormal in the bandwidth condition of the opposite terminal client, the processing module carries out priority configuration on the available bandwidths of the client and the opposite terminal client according to a pre-configured priority strategy.

4. The VOIP adaptive speech coding system according to claim 1, wherein: the SIP server side is configured with a first server and a second server, the first server transmits unidirectional data to the second server, the first server is connected with the client side, and the second server is connected with the opposite-end client side;

5. The VOIP adaptive speech coding system according to claim 4, wherein: the encryption unit is configured with an encryption strategy and an encryption algorithm; the encryption algorithm is configured with a plurality of dynamic encryption factors, which comprise the available network bandwidth of the client, the voice coding sequence, the voice coding format, and a device characteristic factor of the client; the device characteristic factor of the client is the serial number assigned to the client at the factory.

6. The VOIP adaptive speech coding system according to claim 5, wherein: the encrypted voice data packet comprises a plurality of encrypted voice data fragments, the encrypted voice data fragments comprise a voice data fragment set, and the voice data fragment set is associated with available bandwidth of a client network;

7. The VOIP adaptive speech coding system according to claim 4, wherein: the decryption strategy is configured with a decryption algorithm, and the decryption algorithm decrypts the encrypted voice data segment to form a decrypted voice data segment;