CN107979482B

CN107979482B - Information processing method, device, sending end, jitter removal end and receiving end

Info

Publication number: CN107979482B
Application number: CN201610940605.8A
Authority: CN
Inventors: 王凤玲
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2016-10-25
Filing date: 2016-10-25
Publication date: 2021-06-11
Anticipated expiration: 2036-10-25
Also published as: CN107979482A

Abstract

The invention discloses an information processing method, an information processing apparatus, a sending end, a de-jitter end, and a receiving end. The method comprises: collecting offline network data and extracting from the offline network data at least one network parameter for representing network characteristics; constructing a network model according to the at least one network parameter, and determining a first de-jitter strategy according to the network model; correcting the first de-jitter strategy according to characteristic parameters for evaluating Voice over Internet Protocol (VoIP) call quality to obtain a second de-jitter strategy; and obtaining a de-jitter parameter according to the current real-time network condition and the second de-jitter strategy, and setting the size of the buffer used for transmitting VoIP call data according to the de-jitter parameter, so that the delay of the VoIP call meets expectations.

Description

Information processing method, device, sending end, jitter removal end and receiving end

Technical Field

The present invention relates to processing technologies in the internet, and in particular, to an information processing method, an information processing apparatus, a sending end, a jitter removal end, and a receiving end.

Background

With the development of Internet technology, end-to-end communication based on Voice over Internet Protocol (VoIP) is becoming increasingly popular. VoIP transmits call data in packets over an IP network. Owing to the inherent characteristics of IP networks, the time each packet takes to cross the network is uncertain; this variation in transmission time is called jitter. Jitter, together with factors such as network performance parameters, produces end-to-end call delay and thereby degrades network call quality. To address this problem, the prior art adopts a de-jitter algorithm built on a single parameter, which does not adequately account for the various complex conditions in a network call environment. The buffer size is then set from the de-jitter parameter produced by that algorithm. Because neither the algorithm nor the parameter is accurate enough, the resulting buffer size is unreasonable and the improvement in network call quality is limited. However, the related art offers no effective solution to this problem.
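As an illustration of how jitter can be quantified, the following minimal sketch computes a running interarrival jitter estimate in the style of the RTP specification (RFC 3550). The patent text does not state which jitter measure is used, so the choice of this estimator, the function name, and the millisecond units are assumptions made for illustration only.

```python
def interarrival_jitter(send_times_ms, recv_times_ms):
    """Running jitter estimate in the style of RFC 3550 (illustrative only)."""
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        # Change in transit time between two consecutive packets.
        d = (recv_times_ms[i] - recv_times_ms[i - 1]) - (
            send_times_ms[i] - send_times_ms[i - 1]
        )
        # Exponential smoothing with gain 1/16, as in the RTP estimator.
        jitter += (abs(d) - jitter) / 16.0
    return jitter
```

When every packet has the same transit time the estimate stays at zero; a single packet arriving 16 ms later than its spacing would suggest raises the estimate by 1 ms.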

Disclosure of Invention

In view of this, embodiments of the present invention provide an information processing method, an information processing apparatus, a sending end, a de-jitter end, and a receiving end, which at least solve the problems in the prior art.

The technical scheme of the embodiment of the invention is realized as follows:

an information processing method according to an embodiment of the present invention includes:

acquiring offline network data, and extracting at least one network parameter for representing network characteristics from the offline network data;

constructing a network model according to the at least one network parameter, and determining a first debouncing strategy according to the network model;

correcting the first de-jitter strategy according to characteristic parameters for evaluating Voice over Internet Protocol (VoIP) call quality to obtain a second de-jitter strategy;

and obtaining a debounce parameter according to the current real-time network condition and the second debounce strategy, and setting the size of a buffer area for transmitting Voip call data according to the debounce parameter so as to enable the time delay of the Voip call to be in accordance with expectation.

In the foregoing solution, the modifying the first debouncing policy according to the characteristic parameter for evaluating the Voip call quality includes:

acquiring historical data of the call;

and correcting the first debouncing strategy according to the historical data of the call.

acquiring the signal content of the call;

and correcting the first debouncing strategy according to the signal content of the call.

acquiring a perceptual auditory result of the call;

and modifying the first de-jitter strategy according to the perception auditory result.

In the above scheme, the method further comprises:

acquiring different processing capacities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium when Voip call data of the call is acquired;

and correcting the first jitter removal strategy according to different processing capacities of the terminal equipment and/or scheduling characteristics of the application serving as the Voip call medium.

In the above scheme, the method further comprises:

when Voip call data of the call is played, acquiring different processing capacities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium;

An information processing apparatus according to an embodiment of the present invention includes:

a collecting unit, configured to collect offline network data and extract from the offline network data at least one network parameter for representing network characteristics;

the strategy determining unit is used for constructing a network model according to the at least one network parameter and determining a first debouncing strategy according to the network model;

the strategy correction unit is used for correcting the first de-jitter strategy according to characteristic parameters for evaluating Voice over Internet Protocol (VoIP) call quality to obtain a second de-jitter strategy;

and the buffer area adjusting unit is used for obtaining a debounce parameter according to the current real-time network condition and the second debounce strategy, and setting the size of a buffer area for transmitting Voip call data according to the debounce parameter so as to enable the time delay of the Voip call to be in accordance with expectation.

In the foregoing solution, the policy modification unit is further configured to:

acquiring historical data of the call;

acquiring the signal content of the call;

acquiring a perceptual auditory result of the call;

In the above scheme, the apparatus further comprises:

the call acquisition unit is used for acquiring Voip call data of the call;

the policy modification unit is further configured to:

when the collection of Voip call data of the call is triggered, acquiring different processing capacities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium;

In the above scheme, the apparatus further comprises:

the call playing unit is used for playing Voip call data of the call;

the policy modification unit is further configured to:

when the Voip call data of the call is triggered to be played, acquiring different processing capabilities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium;

acquiring offline network data, extracting at least one network parameter for representing network characteristics from the offline network data, and using the at least one network parameter to construct a network model, wherein the network model is used for determining a first jitter removal strategy in voice over internet protocol (Voip) call data;

The sending end of information processing in the embodiment of the invention comprises:

a collecting unit, configured to collect offline network data, extract from the offline network data at least one network parameter for representing network characteristics, and use the at least one network parameter to construct a network model, wherein the network model is used for determining a first de-jitter strategy for VoIP call data transmission;

the call acquisition unit is used for acquiring different processing capacities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium when Voip call data of the call are acquired;

and the first strategy correcting unit is used for correcting the first de-jitter strategy according to different processing capacities of the terminal equipment and/or scheduling characteristics of the application serving as the Voip call medium.

constructing a network model according to at least one network parameter, and determining a first debouncing strategy according to the network model, wherein the at least one network parameter is derived from parameters which are extracted from acquired offline network data and used for representing network characteristics;

acquiring historical data of the call;

acquiring the signal content of the call;

acquiring a perceptual auditory result of the call;

An information processing jitter removal end according to an embodiment of the present invention includes:

the strategy determination unit is used for constructing a network model according to at least one network parameter, and determining a first debouncing strategy according to the network model, wherein the at least one network parameter is derived from a parameter which is extracted from acquired offline network data and used for representing network characteristics;

the second strategy correction unit is used for correcting the first de-jitter strategy according to characteristic parameters for evaluating Voice over Internet Protocol (VoIP) call quality to obtain a second de-jitter strategy;

In the foregoing solution, the second policy modification unit is further configured to:

acquiring historical data of the call;

acquiring the signal content of the call;

acquiring a perceptual auditory result of the call;

acquiring a first jitter removal strategy determined in voice over internet protocol (Voip) call data of a transmission network, wherein the first jitter removal strategy is obtained according to a network model constructed by at least one network parameter, and the at least one network parameter is derived from a parameter extracted from acquired offline network data and used for representing network characteristics;

The receiving end of information processing of the embodiment of the invention comprises:

an acquisition unit, configured to acquire a first de-jitter strategy determined for transmitting Voice over Internet Protocol (VoIP) call data, wherein the first de-jitter strategy is obtained according to a network model constructed from at least one network parameter, and the at least one network parameter is derived from parameters that are extracted from collected offline network data and used for representing network characteristics;

the call playing unit is used for acquiring different processing capacities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium when Voip call data of the call is played;

and the third strategy correcting unit is used for correcting the first jitter removing strategy according to different processing capacities of the terminal equipment and/or the scheduling characteristics of the application serving as the Voip call medium.

The information processing method of the embodiment of the invention comprises the following steps: acquiring offline network data, and extracting at least one network parameter for representing network characteristics from the offline network data; constructing a network model according to the at least one network parameter, and determining a first debouncing strategy according to the network model; correcting the first jitter removal strategy according to the characteristic parameters for evaluating the Voip call quality to obtain a second jitter removal strategy; and obtaining a debounce parameter according to the current real-time network condition and the second debounce strategy, and setting the size of a buffer area for transmitting Voip call data according to the debounce parameter so as to enable the time delay of the Voip call to be in accordance with expectation.

With the embodiments of the present invention, offline network data is collected and at least one network parameter for representing network characteristics is extracted from it; a network model is constructed according to the at least one network parameter, and a first de-jitter strategy is determined according to the network model. Because the de-jitter algorithm is built from multiple parameters, the various complex conditions in a network call environment are more fully accounted for, so the first de-jitter strategy (also called the initial de-jitter strategy) tends to be accurate, and the related parameters derived from it, such as the de-jitter parameter, tend to be accurate as well. To further improve accuracy, the first de-jitter strategy is corrected according to characteristic parameters for evaluating VoIP call quality to obtain a second de-jitter strategy. A de-jitter parameter is then obtained according to the current real-time network condition and the second de-jitter strategy, and the size of the buffer used for transmitting VoIP call data is set according to the de-jitter parameter, so that the delay of the VoIP call meets expectations. Optimizing the de-jitter strategy through this series of steps makes the resulting buffer size reasonable, which in turn improves network call quality.

Drawings

FIG. 1 is a diagram of hardware entities performing information interaction in an embodiment of the present invention;

FIG. 2 is a schematic diagram of a process flow of an embodiment of the present invention;

FIG. 3 is a schematic diagram of another method implementation flow according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a process flow for implementing another method according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a process flow for implementing another method according to an embodiment of the present invention;

FIG. 6 is a block diagram of a system architecture according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of an end-to-end module of a Voip call in the prior art;

FIGS. 8-9 are schematic diagrams of the implementation of prior art schemes one and two;

FIG. 10 is a diagram illustrating a scenario in which an embodiment of the present invention is applied;

FIGS. 11-12 are graphs comparing the results of de-jitter processing after applying embodiments of the present invention.

Detailed Description

The following describes the embodiments in further detail with reference to the accompanying drawings.

A mobile terminal implementing various embodiments of the present invention will now be described with reference to the accompanying drawings. In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only for facilitating the description of the embodiments of the present invention, and have no specific meaning in themselves. Thus, "module" and "component" may be used in a mixture.

In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components, circuits, and networks have not been described in detail so as not to unnecessarily obscure aspects of the embodiments.

In addition, although the terms "first", "second", etc. are used herein several times to describe various elements (or various thresholds or various applications or various instructions or various operations), these elements (or thresholds or applications or instructions or operations) should not be limited by these terms. These terms are only used to distinguish one element (or threshold or application or instruction or operation) from another. For example, a first operation may be referred to as a second operation, and a second operation may be referred to as a first operation, without departing from the scope of the invention; the first operation and the second operation are both operations, but they are not the same operation.

The steps in the embodiments of the present invention need not be performed in the order described; they may be reordered, deleted, or supplemented as required.

The term "and/or" in embodiments of the present invention refers to any and all possible combinations including one or more of the associated listed items. It is also to be noted that: when used in this specification, the term "comprises/comprising" specifies the presence of stated features, integers, steps, operations, elements and/or components but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements and/or components and/or groups thereof.

The intelligent terminal (e.g., mobile terminal) of the embodiments of the present invention may be implemented in various forms. For example, the mobile terminal described in the embodiments of the present invention may include a mobile terminal such as a mobile phone, a smart phone, a notebook computer, a Digital broadcast receiver, a Personal Digital Assistant (PDA), a tablet computer (PAD), a Portable Multimedia Player (PMP), a navigation device, and the like, and a fixed terminal such as a Digital TV, a desktop computer, and the like. In the following, it is assumed that the terminal is a mobile terminal. However, it will be understood by those skilled in the art that the configuration according to the embodiment of the present invention can be applied to a fixed type terminal in addition to elements particularly used for moving purposes.

Fig. 1 is a schematic diagram of the hardware entities performing information interaction in an embodiment of the present invention. Fig. 1 includes terminal device 1, server 2, and terminal device 3. Terminal device 1, referred to as the sending terminal device, consists of terminal devices 11-14; terminal device 3, referred to as the receiving terminal device, consists of terminal devices 31-35; server 2 executes the de-jitter processing logic. The terminal devices exchange information with the server over a wired or wireless network. The terminal devices include mobile phones, desktop computers, PCs, all-in-one machines, and the like. With the embodiment of the present invention, terminal device 1 exchanges information with terminal device 3 through server 2. Specifically, taking a VoIP network call as an example, terminal devices 11-14 send the network data of the VoIP network call, and after server 2 applies de-jitter processing the data is played by terminal devices 31-35, completing the VoIP network call. In the embodiment of the present invention, offline network data of the existing network is used: at least one network parameter for representing network characteristics is extracted from the offline network data, and a network model is constructed according to the at least one network parameter, so that the first de-jitter strategy (also called the initial de-jitter strategy) determined according to the network model tends to be accurate.
Specifically, the processing logic 10 in server 2 that performs the de-jitter processing includes: S1, collecting offline network data and extracting from the offline network data at least one network parameter for representing network characteristics; S2, constructing a network model according to the at least one network parameter, and determining a first de-jitter strategy according to the network model; S3, correcting the first de-jitter strategy according to characteristic parameters for evaluating VoIP call quality to obtain a second de-jitter strategy; and S4, obtaining a de-jitter parameter according to the current real-time network condition and the second de-jitter strategy, and setting the size of the buffer used for transmitting VoIP call data according to the de-jitter parameter, so that the delay of the VoIP call meets expectations.
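The flow S1 to S4 can be sketched as four small functions. This is only a schematic under stated assumptions: the function names, the choice of mean delay and delay spread as network parameters, the safety factor `k`, and the quality score in the range 0 to 1 are all hypothetical and do not appear in the patent text.

```python
from statistics import mean, pstdev


def extract_network_parameters(offline_delays_ms):
    # S1: summarize offline delay samples (hypothetical parameter set).
    return {
        "mean_delay_ms": mean(offline_delays_ms),
        "jitter_ms": pstdev(offline_delays_ms),
    }


def first_strategy(params):
    # S2: a trivial stand-in for the network model: the target delay is
    # a safety factor k times the jitter observed offline.
    return {"k": 2.0, "base_jitter_ms": params["jitter_ms"]}


def second_strategy(strategy, quality_score):
    # S3: correct the strategy with a call-quality feature in [0, 1];
    # lower quality gives a larger safety factor (assumed rule).
    corrected = dict(strategy)
    corrected["k"] = strategy["k"] * (1.5 - 0.5 * quality_score)
    return corrected


def dejitter_parameter(strategy, realtime_jitter_ms):
    # S4: combine the corrected strategy with the live jitter estimate
    # to obtain a target buffering delay in milliseconds.
    return strategy["k"] * max(realtime_jitter_ms, strategy["base_jitter_ms"])
```

Keeping the strategy as a plain dictionary makes each correction step a pure function, so additional corrections can be chained without changing S4.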

The above example of fig. 1 is only an example of a system architecture for implementing the embodiment of the present invention, and the embodiment of the present invention is not limited to the system architecture described in the above fig. 1, and various embodiments of the method of the present invention are proposed based on the system architecture described in the above fig. 1.

As shown in fig. 2, an information processing method according to an embodiment of the present invention includes the following steps. Offline network data is collected, at least one network parameter for representing network characteristics is extracted from the offline network data, a network model is constructed according to the at least one network parameter, VoIP call quality is measured or simulated according to the network model, and a first de-jitter strategy is determined according to the network model (101). In practical application, a large amount of network data from the existing network is collected across different network types, and the network model is constructed through offline training; the network model determines the initial de-jitter strategy. The first de-jitter strategy is corrected according to characteristic parameters for evaluating VoIP call quality (such as historical data of the call, signal content of the call, and a perceptual auditory result of the call) to obtain a second de-jitter strategy (102). The historical data of the call reflects the network characteristics of the call. The signal content of the call determines whether the current frame is an important frame: voice content is important and needs particular attention, whereas silence does not, and de-jitter processing differs for different content. Different perceptual auditory results call for different ways and magnitudes of de-jitter adjustment. A de-jitter parameter is obtained according to the current real-time network condition and the second de-jitter strategy, and the size of the buffer used for transmitting VoIP call data is set according to the de-jitter parameter, so that the delay of the VoIP call meets expectations and tends to be reasonable (103).
In practical application, the size of the de-jitter buffer is determined from the de-jitter parameter obtained with the second de-jitter strategy, and the data in the buffer is then adjusted based on that size.
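The last step, turning a de-jitter parameter into a buffer size and then adjusting the buffered data, might look like the sketch below. It assumes the de-jitter parameter is a target delay in milliseconds and that audio is carried in fixed 20 ms frames; the names `buffer_size_frames` and `trim_to_size` and the frame limits are hypothetical.

```python
import math
from collections import deque


def buffer_size_frames(target_delay_ms, frame_ms=20, min_frames=1, max_frames=50):
    # Round the target delay up to whole frames, then clamp to sane limits.
    frames = math.ceil(target_delay_ms / frame_ms)
    return max(min_frames, min(max_frames, frames))


def trim_to_size(buffer, size_frames):
    # Adjust the buffered data: discard the oldest frames until the
    # buffer holds at most size_frames frames. Returns the drop count.
    dropped = 0
    while len(buffer) > size_frames:
        buffer.popleft()
        dropped += 1
    return dropped
```

For example, a 45 ms target with 20 ms frames yields a three-frame buffer, and a buffer currently holding five frames would shed its two oldest frames.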

It should be noted that, the logic of acquisition, policy determination, policy modification, and the like in the processing logic of the method is not limited to be located in the sending end, the receiving end, or the server, and part or all of the logic may be located in the sending end, the receiving end, or the server.

As shown in fig. 3, an information processing method according to an embodiment of the present invention includes: the method comprises the steps of collecting offline network data, extracting at least one network parameter used for representing network characteristics from the offline network data, constructing a network model according to the at least one network parameter, measuring or simulating Voip call quality according to the network model, and determining a first jitter removal strategy (201) according to the network model. Specifically, in practical application, a large amount of existing network related network data are collected through different network types, and the network model is constructed through offline training, and the network model can determine the initial debouncing strategy. And acquiring historical data of the call, taking the historical data of the call as a characteristic parameter for evaluating Voip call quality, and correcting the first jitter removal strategy according to the historical data of the call to obtain a second jitter removal strategy (202). In terms of the historical data of the call, it may reflect the network characteristics of the call, and in a single call, the network parameter settings in the first debounce policy, such as debounce parameters and delay processing parameters, may be adjusted according to the historical data of the call. And obtaining a debounce parameter according to the current real-time network condition and the second debounce strategy, and setting the size of a buffer area for transmitting Voip call data according to the debounce parameter, so that the time delay of the Voip call is in accordance with expectation and tends to be reasonable (203). In practical application, the size of the de-jitter buffer is determined according to the de-jitter parameters obtained by the second de-jitter strategy, and finally, the data in the buffer is adjusted based on the size of the de-jitter buffer.
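One possible way to correct the initial strategy with the historical data of the current call is to keep a sliding window of recent jitter observations and blend a high percentile of that window with the initial target delay. The patent does not specify the correction rule; the class `CallHistory`, the percentile choice, and the blending weight below are assumptions for illustration.

```python
from collections import deque


class CallHistory:
    """Sliding window of jitter samples (ms) observed during this call."""

    def __init__(self, window=50):
        self.samples = deque(maxlen=window)

    def add(self, jitter_ms):
        self.samples.append(jitter_ms)

    def percentile(self, q_percent):
        # Nearest-rank style percentile using integer arithmetic.
        ordered = sorted(self.samples)
        idx = min(len(ordered) - 1, len(ordered) * q_percent // 100)
        return ordered[idx]


def correct_with_history(initial_target_ms, history, q_percent=95, weight=0.5):
    # With no history yet, fall back to the offline-derived target.
    if not history.samples:
        return initial_target_ms
    observed = history.percentile(q_percent)
    # Blend the offline target with what this call has actually seen.
    return (1.0 - weight) * initial_target_ms + weight * observed
```

A bounded window keeps the correction responsive to the current call rather than to conditions from many minutes earlier.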

As shown in fig. 4, an information processing method according to an embodiment of the present invention includes: the method comprises the steps of collecting offline network data, extracting at least one network parameter used for representing network characteristics from the offline network data, constructing a network model according to the at least one network parameter, measuring or simulating Voip call quality according to the network model, and determining a first jitter removal strategy (301) according to the network model. Specifically, in practical application, a large amount of existing network related network data are collected through different network types, and the network model is constructed through offline training, and the network model can determine the initial debouncing strategy. And acquiring the signal content of the call, taking the signal content of the call as a characteristic parameter for evaluating Voip call quality, and correcting the first de-jitter strategy according to the signal content of the call to obtain a second de-jitter strategy (302). In terms of the signal content of the call, it is determined whether the current frame is an important frame, the voice data content is an important frame, and important attention is needed, while the mute data content does not need important attention, and the processing for removing jitter is different for different contents, and in a single call, the network parameter setting in the first jitter removing strategy, such as a jitter removing parameter and a delay processing parameter, can be adjusted. Of course, after the first debounce strategy is modified by the characteristic parameters (such as the historical data of the call, the perceptual auditory result of the call, and the like) for evaluating the voice over ip call quality, the modified debounce strategy may be modified again according to the signal content of the call, so as to improve the accuracy of the debounce strategy. 
And obtaining a debounce parameter according to the current real-time network condition and the second debounce strategy, and setting the size of a buffer area for transmitting Voip call data according to the debounce parameter, so that the time delay of the Voip call is in accordance with expectation and tends to be reasonable (303). In practical application, the size of the de-jitter buffer is determined according to the de-jitter parameters obtained by the second de-jitter strategy, and finally, the data in the buffer is adjusted based on the size of the de-jitter buffer.
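The idea that voice frames and silence frames are treated differently can be illustrated as follows: when the buffer must shrink, silence frames are discarded first and voice frames are kept. The energy-threshold classifier is a deliberately crude stand-in for real voice activity detection, and the threshold value and function names are hypothetical.

```python
def is_speech(frame, energy_threshold=1000.0):
    # frame: sequence of PCM samples. Mean-square energy is used as a
    # crude voice activity detector (an assumption, not the patent's method).
    if not frame:
        return False
    energy = sum(s * s for s in frame) / len(frame)
    return energy >= energy_threshold


def shrink_buffer(frames, excess):
    """Drop up to `excess` frames, silence only, preserving frame order."""
    kept = []
    for frame in frames:
        if excess > 0 and not is_speech(frame):
            excess -= 1  # Discard this silence frame.
            continue
        kept.append(frame)
    return kept
```

If there are fewer silence frames than requested, the function drops only what it can and leaves all voice frames untouched.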

As shown in fig. 5, an information processing method according to an embodiment of the present invention includes the following steps. Offline network data is collected, at least one network parameter for representing network characteristics is extracted from the offline network data, a network model is constructed according to the at least one network parameter, VoIP call quality is measured or simulated according to the network model, and a first de-jitter strategy is determined according to the network model (401). In practical application, a large amount of network data from the existing network is collected across different network types, and the network model is constructed through offline training; the network model determines the initial de-jitter strategy. A perceptual auditory result of the call, which may also be called a traditional perceptual auditory evaluation parameter, is acquired and used as a characteristic parameter for evaluating VoIP call quality, and the first de-jitter strategy is corrected according to the perceptual auditory result of the call to obtain a second de-jitter strategy (402). Different perceptual auditory results call for different ways and magnitudes of de-jitter adjustment, and within a single call the network parameter settings in the first de-jitter strategy, such as the de-jitter parameter and the delay processing parameter, can be adjusted accordingly. Of course, after the first de-jitter strategy has been corrected with other characteristic parameters for evaluating VoIP call quality (such as the historical data of the call or the signal content of the call), the corrected strategy may be corrected again according to the perceptual auditory result of the call, so as to improve the accuracy of the de-jitter strategy.
And obtaining a debounce parameter according to the current real-time network condition and the second debounce strategy, and setting the size of a buffer area for transmitting Voip call data according to the debounce parameter, so that the time delay of the Voip call is in accordance with expectation and tends to be reasonable (403). In practical application, the size of the de-jitter buffer is determined according to the de-jitter parameters obtained by the second de-jitter strategy, and finally, the data in the buffer is adjusted based on the size of the de-jitter buffer.
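The statement that different perceptual auditory results lead to different ways and magnitudes of adjustment can be sketched with a score-dependent step size. The sketch assumes a MOS-like score between 1 and 5 and assumes that a poor score is caused by late packets, so that enlarging the target delay helps; neither assumption comes from the patent text, and the thresholds and step size are hypothetical.

```python
def adjustment_step_ms(quality_score, max_step_ms=40.0, good=4.0, bad=2.5):
    # No adjustment when perceived quality is already good.
    if quality_score >= good:
        return 0.0
    # Largest adjustment when perceived quality is poor.
    if quality_score <= bad:
        return max_step_ms
    # Linear interpolation between the two thresholds.
    return max_step_ms * (good - quality_score) / (good - bad)


def correct_with_perception(target_ms, quality_score):
    # Enlarge the target buffering delay by the score-dependent step.
    return target_ms + adjustment_step_ms(quality_score)
```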

In practical applications, in addition to the processing at the debounce end, in the whole Voip network call, different delay processing methods and parameters may be set at the sending end and the receiving end (or called playing end) according to different processing capabilities of the device, scheduling characteristics of application program threads, and the like, respectively, so as to continuously modify the first debounce policy to improve the accuracy of the debounce policy, as shown in the following embodiments.

As for a sending end in the whole Voip network call, in the information processing method according to the embodiment of the present invention, when collecting Voip call data of the current call, different processing capabilities of a terminal device and/or scheduling characteristics of an application serving as a Voip call medium are obtained, and the first debounce policy is modified according to the different processing capabilities of the terminal device and/or the scheduling characteristics of the application serving as the Voip call medium.

As for a receiving end (or called a playing end) in the entire Voip network call, in the information processing method according to the embodiment of the present invention, when playing Voip call data of the current call, different processing capabilities of a terminal device and/or a scheduling characteristic of an application serving as a Voip call medium are obtained, and the first debounce policy is modified according to the different processing capabilities of the terminal device and/or the scheduling characteristic of the application serving as the Voip call medium.

By adopting the embodiments, in practical application, the corresponding parameter representation network characteristics can be extracted through offline packet capturing, different network model parameters are established through a large amount of offline training, an initial debouncing algorithm and related parameters are determined according to the established network parameter model, and then the debouncing strategy and the related parameters are adjusted according to the historical data of the current call. Because the overall characteristics of the network in the whole call process are considered and the burstiness within a period of time is also considered in the modeling of the network model, the network characteristics can be estimated more accurately.

Regarding the debounce strategy, taking the system architecture shown in fig. 1 as an example, when the server 2 executes the debounce process, the following rules keep the debounce strategy working in the best state. Here JB_len refers to the current buffer data size, AD_up refers to the upper buffer limit, AD_dw refers to the lower buffer limit, and F1 to F4 are empirical tuning parameters. The specific contents are as follows:

First, when JB_len > AD_up:

When JB_len > AD_up × F1, if the current frame signal content is an important frame (such as a speech segment), the current buffer data is compressed; if the current frame is non-important data (such as mute data), the current frame is directly dropped. When JB_len > AD_up × F2 (where F1 > F2), if the current frame signal content is an important frame (such as a speech segment), no processing is performed on the current buffer data; if the current frame is non-important data (such as mute data), the current buffer data is compressed.

The compression amplitude is determined according to the sizes of F1 and F2, and the amplitude of each compression is smaller than the data length of the current frame.

The reason for this processing is that compressing or directly discarding the signal damages the call quality, and discarding a packet directly does more damage than compression. With a compression algorithm that works on a single packet, the amplitude of each compression is smaller than the data length of one frame, so compared with directly discarding the current frame, compression reduces the data length of the buffer more slowly, that is, it reduces the end-to-end delay more slowly. Therefore, the direct frame dropping method is adopted only when the data length of the buffer is very large and the current data is non-important data; if the data length of the buffer is very large and the current data is important data, the buffer length is adjusted by compression, which does less damage; if the data length of the buffer is larger than a certain threshold but the current frame is important data, nothing is done, so that the call quality of the voice segment is guaranteed to the maximum extent. The extra delay can be removed quickly once a non-important segment (such as silence) arrives, so that the end-to-end delay is reduced while the perceived quality of the call is guaranteed to the maximum extent.

Second, when JB_len < AD_dw:

When JB_len < AD_dw × F3, if the current frame is a non-important frame, the current frame is directly copied repeatedly, and the number of copies is determined according to the size of F3; if the current frame is an important frame, the data of the current buffer is expanded. When JB_len < AD_dw × F4 (where F3 < F4), the current buffer is expanded. The amplitude of each expansion is determined according to the sizes of F3 and F4.

The reason for this is that although expanding or directly copying data does damage the voice, the impact of this damage on the call experience is much smaller than the stuttering caused by an empty buffer. Therefore, when the length of the buffered data is found to be smaller than the lower adjustment limit, the principle is to respond quickly and adjust the amount of buffered data as soon as possible.

Third, when AD_up ≥ JB_len ≥ AD_dw:

At this time, the data in the buffer is directly decoded and then sent to the sound card device without any de-jitter processing.
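The three cases above can be sketched as a single decision function. This is a minimal illustration rather than the patented implementation: the names `Action`, `Limits` and `choose_action` are hypothetical, and the text does not say what happens when the buffer is outside the limits but inside neither scaled threshold, so the sketch assumes no adjustment there.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "decode and play without adjustment"
    COMPRESS = "compress buffered data"
    DROP = "drop the current frame"
    EXPAND = "expand buffered data"
    COPY = "repeat the current frame"


@dataclass
class Limits:
    ad_up: float  # AD_up, upper buffer limit
    ad_dw: float  # AD_dw, lower buffer limit
    f1: float     # scales AD_up, with F1 > F2
    f2: float
    f3: float     # scales AD_dw, with F3 < F4
    f4: float


def choose_action(jb_len: float, important: bool, lim: Limits) -> Action:
    """Pick a de-jitter action from the buffer length and frame importance."""
    if jb_len > lim.ad_up:
        if jb_len > lim.ad_up * lim.f1:
            # Buffer far too long: compress speech, drop silence.
            return Action.COMPRESS if important else Action.DROP
        if jb_len > lim.ad_up * lim.f2:
            # Moderately long: leave speech alone, compress silence.
            return Action.NONE if important else Action.COMPRESS
        return Action.NONE  # assumption: zone not specified in the text
    if jb_len < lim.ad_dw:
        if jb_len < lim.ad_dw * lim.f3:
            # Buffer nearly empty: expand speech, repeat silence frames.
            return Action.EXPAND if important else Action.COPY
        if jb_len < lim.ad_dw * lim.f4:
            return Action.EXPAND
        return Action.NONE  # assumption: zone not specified in the text
    # AD_dw <= JB_len <= AD_up: no de-jitter processing.
    return Action.NONE
```

With illustrative limits such as `Limits(ad_up=200, ad_dw=60, f1=1.5, f2=1.0, f3=0.5, f4=1.0)`, a buffer of 250 holding a speech frame is left untouched, while the same buffer holding a silent frame is compressed.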

In the adjustment of the de-jitter strategy described in the first and second parts above, whether expansion or compression is used also depends on the content of the signal at that moment and on the adjustment algorithm. For example, the expansion and compression algorithms are based on pitch periods, and music signals are not suitable for such algorithms, so if it is detected that the current signal is a music signal rather than a speech signal, the adjustment parameters (AD_up, AD_dw, F1 to F4) need to be adjusted appropriately. Meanwhile, if expansion or compression is performed too many times in a row, the audio may sound as if it is played fast or slow. Therefore, the adjustment of the de-jitter strategy in the first and second parts also needs to follow a duration adjustment strategy (such as specifying the maximum number of consecutive expansions or compressions), so as to ensure that no fast-play or slow-play effect can be heard in the final audio.
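The cap on consecutive expansions or compressions mentioned above could be enforced with a small counter. The class name `StretchLimiter`, the default cap of 3, and the choice to force exactly one unmodified frame after the cap is reached are all assumptions made for illustration.

```python
class StretchLimiter:
    """Caps the number of back-to-back expansion or compression operations."""

    def __init__(self, max_consecutive: int = 3):
        self.max_consecutive = max_consecutive  # hypothetical cap
        self.run = 0  # length of the current run of adjustments

    def allow(self, wants_adjust: bool) -> bool:
        """Return True if the requested adjustment may be applied to this frame."""
        if not wants_adjust:
            self.run = 0
            return False
        if self.run >= self.max_consecutive:
            # Cap reached: pass one frame through unmodified, then start over.
            self.run = 0
            return False
        self.run += 1
        return True
```

A caller would ask `allow(True)` each time the de-jitter logic wants to expand or compress, and skip the adjustment for that frame when it returns `False`.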

An information processing apparatus according to an embodiment of the present invention includes the following units. The acquisition unit is used for acquiring the offline network data and extracting at least one network parameter for representing the network characteristics from the offline network data. The strategy determination unit is used for constructing a network model according to the at least one network parameter and determining a first debouncing strategy according to the network model. The network model can determine the initial debouncing strategy and its relevant parameters, which include debouncing parameters and delay parameters. The strategy correction unit is used for correcting the first debouncing strategy according to characteristic parameters (such as historical data of the call, signal content of the call, a perceptual auditory result of the call and the like) for evaluating the Voip call quality to obtain a second debouncing strategy. The historical data of the call reflects the network characteristics of the call. The signal content of the call determines whether the current frame is an important frame: voice data is an important frame and needs to be focused on, mute data does not, and the de-jitter processing differs for different contents. Different perceptual auditory results lead to different ways and amplitudes of de-jitter adjustment. The buffer area adjusting unit is used for obtaining a jitter removing parameter according to the current real-time network condition and the second jitter removing strategy, and setting the size of a buffer area for transmitting Voip call data according to the jitter removing parameter, so that the delay of the Voip call meets expectations and tends to be reasonable.
In practical application, the size of the de-jitter buffer is determined according to the de-jitter parameters obtained from the second de-jitter strategy, and finally the data in the buffer is adjusted based on the size of the de-jitter buffer.

It should be noted here that the acquisition unit, the policy determination unit, and the policy modification unit in the above apparatus are not limited to any one location: some or all of these units may be located in the transmitting end, the receiving end, or the server.

In an implementation manner of the embodiment of the present invention, the policy modification unit is further configured to: acquire historical data of the call, and correct the first debouncing strategy according to the historical data of the call.

In an implementation manner of the embodiment of the present invention, the policy modification unit is further configured to: acquire the signal content of the call, and correct the first debouncing strategy according to the signal content of the call.

In an implementation manner of the embodiment of the present invention, the policy modification unit is further configured to: acquire a perceptual hearing result of the call, and correct the first debouncing strategy according to the perceptual hearing result.

In an implementation manner of an embodiment of the present invention, the apparatus further includes a call acquisition unit, used for acquiring Voip call data of the call. The policy modification unit is further configured to: when the collection of the Voip call data of the call is triggered, acquire different processing capacities of terminal equipment and/or scheduling characteristics of the application serving as the Voip call medium, and correct the first debounce strategy according to the different processing capacities of the terminal equipment and/or the scheduling characteristics of the application serving as the Voip call medium.

In an implementation manner of an embodiment of the present invention, the apparatus further includes a call playing unit, used for playing the Voip call data of the call. The policy modification unit is further configured to: when the playing of the Voip call data of the current call is triggered, acquire different processing capacities of terminal equipment and/or scheduling characteristics of the application serving as the Voip call medium, and correct the first jitter removal strategy according to the different processing capacities of the terminal equipment and/or the scheduling characteristics of the application serving as the Voip call medium.

An information processing system according to an embodiment of the present invention includes a sending end (or called acquisition end) 41, a debounce end 42, and a receiving end (or called playback end) 43. Wherein, the processing logic of the sending end (or called acquisition end) includes: acquiring offline network data, extracting at least one network parameter for representing network characteristics from the offline network data, and using the at least one network parameter to construct a network model, wherein the network model is used for determining a first jitter removal strategy in Voip call data transmission; acquiring different processing capacities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium when Voip call data of the call is acquired; and correcting the first jitter removal strategy according to different processing capacities of the terminal equipment and/or scheduling characteristics of the application serving as the Voip call medium.

The processing logic of the debounce end comprises: constructing a network model according to at least one network parameter, and determining a first debouncing strategy according to the network model, wherein the at least one network parameter is derived from parameters which are extracted from acquired offline network data and used for representing network characteristics; correcting the first jitter removal strategy according to the characteristic parameters for evaluating the Voice over Internet Protocol (Voip) call quality to obtain a second jitter removal strategy; and obtaining a debounce parameter according to the current real-time network condition and the second debounce strategy, and setting the size of a buffer area for transmitting Voip call data according to the debounce parameter, so that the delay of the Voip call meets expectations and tends to be reasonable.

In practical applications, the modifying the first debouncing policy according to the characteristic parameter for evaluating the Voip call quality includes: and acquiring historical data of the call, and correcting the first debouncing strategy according to the historical data of the call.

In practical applications, the modifying the first debouncing policy according to the characteristic parameter for evaluating the Voip call quality includes: and acquiring the signal content of the call, and correcting the first debouncing strategy according to the signal content of the call.

In practical applications, the modifying the first debouncing policy according to the characteristic parameter for evaluating the Voip call quality includes: and acquiring a perceptual hearing result of the call, and correcting the first debouncing strategy according to the perceptual hearing result.

The processing logic of the receiving end (or called playing end) includes: acquiring a first jitter removal strategy determined in Voip call data transmission, wherein the first jitter removal strategy is obtained according to a network model constructed by at least one network parameter, and the at least one network parameter is derived from a parameter extracted from acquired offline network data and used for representing network characteristics; when Voip call data of the call is played, acquiring different processing capacities of terminal equipment and/or scheduling characteristics of application serving as a Voip call medium; and correcting the first jitter removal strategy according to different processing capacities of the terminal equipment and/or scheduling characteristics of the application serving as the Voip call medium.

As shown in fig. 6, the information processing system includes a sending end (or called acquisition end) 41, a debounce end 42, and a receiving end (or called playback end) 43.

The sending end (or called acquisition end) 41 includes: the acquiring unit 411, configured to acquire offline network data, extract at least one network parameter used for characterizing network characteristics from the offline network data, and use the at least one network parameter to construct a network model, where the network model is used to determine a first debounce policy in transmission of Voip call data; the call acquisition unit 412, configured to acquire different processing capabilities of the terminal device and/or scheduling characteristics of an application serving as a Voip call medium when acquiring Voip call data of the call; and a first policy modification unit 413, configured to modify the first de-jitter policy according to different processing capabilities of the terminal device and/or a scheduling characteristic of an application serving as the Voip call medium.

The debounce end 42 includes: a policy determining unit 421, configured to construct a network model according to at least one network parameter, and determine a first debounce policy according to the network model, where the at least one network parameter is derived from a parameter extracted from acquired offline network data and used for characterizing a network feature; a second policy modification unit 422, configured to modify the first debounce policy according to a characteristic parameter used for evaluating Voip call quality, to obtain a second debounce policy; and the buffer adjusting unit 423, configured to obtain a debounce parameter according to the current real-time network condition and the second debounce policy, and set a buffer size for transmitting Voip call data according to the debounce parameter, so that the delay of the Voip call meets expectations and tends to be reasonable.

The receiving end (or called playback end) 43 includes: an obtaining unit 431, configured to obtain a first debounce policy determined in transmission of the Voip call data, where the first debounce policy is obtained according to a network model constructed by at least one network parameter, and the at least one network parameter is derived from a parameter extracted from acquired offline network data and used for representing a network feature; the call playing unit 432, configured to, when playing Voip call data of the current call, obtain different processing capabilities of the terminal device and/or a scheduling characteristic of an application serving as a Voip call medium; and a third policy modification unit 433, configured to modify the first debounce policy according to different processing capabilities of the terminal device and/or a scheduling characteristic of an application serving as the Voip call medium.

As for the processor used for data processing, the processing can be implemented by a microprocessor, a Central Processing Unit (CPU), a Digital Signal Processor (DSP), or a Field-Programmable Gate Array (FPGA). The storage medium contains operation instructions, which may be computer executable code, and the operation instructions implement the steps in the flow of the information processing method according to the above-described embodiment of the present invention.

Here, it should be noted that: the above description related to the terminal and the server items is similar to the above description of the method, and the description of the beneficial effects of the same method is omitted for brevity. For technical details not disclosed in the embodiments of the terminal and the server of the present invention, please refer to the description of the embodiments of the method flow of the present invention.

The embodiment of the invention is explained by taking a practical application scene as an example as follows:

In a Voip network call scene, the embodiment of the invention can serve as a scheme for end-to-end delay processing in the Voip call. A typical end-to-end Voip call includes the modules shown in fig. 7, and the end-to-end delay refers to the time difference between the moment speaker A starts speaking and the moment listener B hears the sound. The Voip call technology transmits data in the form of packets through an IP network, and due to the inherent characteristics of the IP network, the time used for transmitting each packet on the network is uncertain; this difference in transmission time is called jitter. A link with small jitter can be selected for transmission through reasonable routing scheduling; for the selected link, jitter can be handled by increasing the buffering delay. However, if the buffering delay is too large, the total end-to-end delay is increased, which affects the experience of real-time conversation; if the buffering delay is too small, the voice will stutter, which affects the call quality. The module that handles jitter is mainly the "de-jittering & decoding" module in fig. 7.

As can be seen from fig. 7, the end-to-end delay mainly includes: the buffering delay of the device (mainly the buffering delay of sound card acquisition and the buffering delay of sound card playing), the data buffering delay processed by each module of the Voip application program (mainly the delay generated by the debounce module), and the network transmission delay (uncontrollable). The embodiment of the invention can realize the reduction of end-to-end time delay in real-time communication, relates to each link from acquisition to playing, and comprises the following contents:

First, for the debounce module of the application:

a) collecting a large amount of current network related data according to different network types, performing off-line training, establishing a network model, and setting a time delay processing method and parameters according to different big data network models;

b) in a single call, adjusting the network parameter setting and the time delay processing parameter in a) according to the historical data of the call;

c) in a single call, adjusting the time delay processing parameters in b) according to the perception auditory result;

d) in a single call, adjusting the time delay processing parameters in b) according to the signal content;

Second, for the device:

Different delay processing methods and parameters are set according to the different processing capacities of the equipment, the scheduling characteristics of application program threads, and the like.

For the above application scenarios, most of the solutions in the prior art are de-jittering solutions for network transmission, and are specifically implemented by using a "de-jittering & decoding" module in fig. 7, and the implementation block diagrams are respectively shown in fig. 8 to 9.

As shown in fig. 8, the implementation flow of the first scheme includes: determining a network jitter parameter for representing the current network jitter situation; adjusting a delay parameter of the Jitter Buffer according to the current network jitter parameter; and carrying out delay processing on the data packets in the Jitter Buffer according to the adjusted delay parameter of the Jitter Buffer. Specifically, the parameter representing the current network jitter is determined as follows: PktComeThisTime records the number of 10 ms packets reaching the Jitter Buffer at one time; a number of PktComeThisTime values are recorded and their maximum is determined and marked as Pm; then a parameter J representing the network jitter is obtained by weighted averaging of a series of Pm values, and the size of the Jitter Buffer is adjusted according to J.
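The estimate used by this first prior-art scheme can be sketched as follows. The text only says that J comes from a weighted average of Pm values, so the sliding window length and the exponential weighting used here are assumptions, as are the class and parameter names.

```python
from collections import deque


class BurstJitterEstimator:
    """Tracks Pm (max packets per arrival) and smooths it into J."""

    def __init__(self, window: int = 50, alpha: float = 0.9):
        # Recent PktComeThisTime values; the window length is an assumption.
        self.history = deque(maxlen=window)
        self.alpha = alpha  # smoothing weight, also an assumption
        self.j = 0.0        # current jitter estimate J

    def update(self, pkt_come_this_time: int) -> float:
        """Record one arrival burst and return the updated estimate J."""
        self.history.append(pkt_come_this_time)
        pm = max(self.history)  # Pm: largest burst in the window
        self.j = self.alpha * self.j + (1.0 - self.alpha) * pm
        return self.j
```

Because J depends on a single quantity, this sketch also makes the criticism below concrete: nothing in the estimate distinguishes overall jitter from rare large bursts.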

As shown in fig. 9, in the implementation flow of the second scheme, first, at the receiving end, the network delay dn is predicted or estimated according to the historical data, and meanwhile, the packet loss rate of the receiving end is counted; then, according to the estimated network delay and the statistical packet loss rate, obtaining the size of the current ideal de-jitter buffer area based on the E-Model; finally, the buffer data is adjusted based on the buffer size.

The problems of the two schemes include:

1) Network estimation: the estimation of the network characteristics plays an important guiding role for a debounce algorithm. In the two prior-art schemes, the size of the debounce buffer is determined according to the network characteristics estimated from historical data of the current call. Although the estimation methods differ, they share the same defects: a single parameter is used, and the complexity of the network is not adequately modeled.

By adopting the embodiment of the invention, the corresponding parameter representation network characteristics are extracted through offline packet capturing, different network model parameters are established through a large amount of offline training, and an initial debouncing algorithm and related parameters are determined according to the established network parameter model; then, according to the historical data of the current call, the debounce algorithm and the relevant parameters are adjusted. Meanwhile, in the modeling of the network model, the overall characteristics of the network in the whole call process are considered, and the burstiness in a period of time is also considered. Thus, the network characteristics can be estimated more accurately.

2) Debounce algorithm: for the adjustment of the buffer data, the first scheme adjusts the buffer only according to the network estimate and does not consider the influence of different data contents on human auditory perception. For example, under certain conditions the scheme must discard buffer data to keep the delay down; it then ignores the signal type at that moment and discards the data directly whether it is speech or mute data, which is crude and gives a poor call experience. In the second scheme, although the E-Model is used for guidance, the complexity of the E-Model is too high for use within a single call, so its practicability is limited. Moreover, the de-jitter algorithms in both schemes adjust in units of "packets", which also limits flexibility.

By adopting the embodiment of the invention, the selection of the debouncing algorithm is determined according to the content of the signal at the adjusting moment and the traditional perceptual auditory evaluation parameters, so that the processing is more flexible, and the final effect on perceptual auditory is better.

3) In terms of acquisition and playing: the two technical schemes do not consider the influence of different acquisition and playing strategies and thread scheduling on jitter removal, but the embodiment of the invention fully considers the influence of the acquisition and playing strategies and the thread scheduling on jitter removal.

For the above application scenario, a general schematic diagram of the embodiment of the present invention is shown in fig. 10. The lower limit value AD_dw and the upper limit value AD_up of the current buffer size adjustment are determined according to the current network estimation. Then the adjustment mode and the adjustment amplitude of the data in the current buffer are determined according to the size JB_len of the data in the current buffer relative to AD_up and AD_dw, the current signal content, and the human ear perceptual auditory model. Meanwhile, during acquisition and playing, the acquisition and playing strategy is adjusted according to the performance of the equipment, so that data is sent at a more uniform rate and therefore arrives at the buffer at a more uniform rate, which lets the jitter removal module work in the best state. The specific implementation is as follows:

1) When JB_len > AD_up:

When JB_len > AD_up × F1, if the current frame signal content is an important frame (such as a speech segment), the current buffer data is compressed; if the current frame is non-important data (such as mute data), the current frame is directly dropped. When JB_len > AD_up × F2 (where F1 > F2), if the current frame signal content is an important frame (such as a speech segment), no processing is performed on the current buffer data; if the current frame is non-important data (such as mute data), the current buffer data is compressed.

The reason for this processing is that compressing or directly discarding the signal damages the call quality, and discarding a packet directly does more damage than compression. With a compression algorithm that works on a single packet, the amplitude of each compression is smaller than the data length of one frame, so compared with directly discarding the current frame, compression reduces the data length of the buffer more slowly, that is, it reduces the end-to-end delay more slowly. Therefore, the direct frame dropping method is adopted only when the data length of the buffer is very large and the current data is non-important data; if the data length of the buffer is very large and the current data is important data, the buffer length is adjusted by compression, which does less damage; if the data length of the buffer is larger than a certain threshold but the current frame is important data, nothing is done, so that the call quality of the voice segment is guaranteed to the maximum extent. The extra delay can wait and be removed quickly once a non-important segment (such as silence) arrives, so that the end-to-end delay is reduced while the perceived quality of the call is guaranteed to the maximum extent.

2) When JB_len < AD_dw:

3) When AD_up ≥ JB_len ≥ AD_dw:

In the adjustment algorithms of 1) and 2), whether expansion or compression is used also depends on the content of the signal at that moment and on the adjustment algorithm. For example, the expansion and compression algorithms are based on pitch periods, and music signals are not suitable for such algorithms, so if it is detected that the current signal is a music signal rather than a speech signal, the adjustment parameters (AD_up, AD_dw, F1 to F4) also need to be adjusted appropriately.

Meanwhile, if too much continuous expansion/compression is performed, the auditory perception has the effect of fast playing or slow playing, so the adjustment algorithms in 1) and 2) also need to make appropriate adjustment (such as specifying the maximum number of continuous expansion or compression) according to a duration adjustment strategy, so as to ensure that the final auditory perception cannot hear the effect of fast playing or slow playing.

In this scheme, step 1) is modeling through offline network characteristics: a large amount of live-network data is analyzed through offline packet capturing, parameters are extracted, and different network models are established.

For example, fig. 11 and 12 take "the time difference between the arrival of two successive packets" extracted from the offline data as one of the model characteristic parameters. Compared with fig. 12, the values in fig. 11 fluctuate over a wide range, which indicates that the overall jitter of that network is large, although large burst jitter is rare. In fig. 12 the overall jitter is small, but the burst jitter is large and frequent (the arrival time difference between two successive packets exceeds 1000 ms many times in the figure). The conventional jitter value, which can be calculated by the method in RFC 3550, represents the network jitter at the current moment, but this is often not enough, because the overall jitter of fig. 12 is small while its burst jitter is large. The two network models of fig. 11 and fig. 12 can be distinguished by computing, over the "time difference between the arrival of two successive packets", statistics such as the cumulative histogram, the variance, the smoothed envelope value over the whole call, and the number of bursts.
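To make the contrast concrete, the sketch below computes an RFC 3550 style running jitter next to a few whole-call statistics over the inter-arrival gaps. It assumes packets are sent at a fixed interval (20 ms here), so that the deviation of each gap from that interval stands in for the transit-time difference D used by RFC 3550; the function names, the 1000 ms burst threshold and the histogram bin edges are illustrative choices.

```python
from statistics import pvariance


def rfc3550_jitter(gaps_ms, send_interval_ms=20.0):
    """Running jitter J += (|D| - J) / 16, with D taken as gap minus send interval."""
    j = 0.0
    for gap in gaps_ms:
        d = abs(gap - send_interval_ms)
        j += (d - j) / 16.0
    return j


def gap_statistics(gaps_ms, burst_threshold_ms=1000.0,
                   bin_edges_ms=(50.0, 100.0, 500.0, 1000.0)):
    """Whole-call statistics that separate overall jitter from burst jitter."""
    n = len(gaps_ms)
    return {
        # Spread of the gaps over the whole call.
        "variance": pvariance(gaps_ms),
        # Number of gaps large enough to count as a burst.
        "burst_count": sum(g > burst_threshold_ms for g in gaps_ms),
        # Fraction of gaps at or below each bin edge.
        "cumulative_hist": [sum(g <= e for g in gaps_ms) / n
                            for e in bin_edges_ms],
    }
```

A trace that alternates between 5 ms and 35 ms gaps has steady jitter and no bursts, whereas a trace of 20 ms gaps with one 1500 ms gap has a burst count of 1 even though almost every gap is ideal.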

Besides the time difference of the arrival of the two packets, the number of continuous packet losses, the overall packet loss rate, the disorder length and the like can be used as modeling parameters for analysis.
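These additional modeling parameters can be derived from the packet sequence numbers in arrival order. The sketch below is one possible reading: "disorder length" is taken to mean how far a late packet lags behind the highest sequence number already seen, sequence numbers are assumed not to wrap around, and the function name is hypothetical.

```python
def loss_and_reorder_stats(arrived_seqs):
    """arrived_seqs: sequence numbers in arrival order (no wrap-around)."""
    first, last = min(arrived_seqs), max(arrived_seqs)
    received = set(arrived_seqs)
    expected = last - first + 1

    # Longest run of consecutive missing sequence numbers.
    longest = run = 0
    for seq in range(first, last + 1):
        run = 0 if seq in received else run + 1
        longest = max(longest, run)

    # Reorder depth: lag of a late packet behind the highest sequence seen.
    highest = arrived_seqs[0]
    max_disorder = 0
    for seq in arrived_seqs:
        if seq < highest:
            max_disorder = max(max_disorder, highest - seq)
        else:
            highest = seq

    return {
        "loss_rate": 1 - len(received) / expected,
        "max_consecutive_loss": longest,
        "max_disorder": max_disorder,
    }
```

For the arrival order 1, 2, 6, 5, 7, 10, packets 3, 4, 8 and 9 never arrive, so the longest loss run is 2, and packet 5 arrives one position late.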

Adjusting the de-jitter parameters according to the network parameters of the current call history: the de-jitter parameters AD_up and AD_dw are preliminarily determined according to the result of step 1); AD_up and AD_dw are then adjusted according to the history data of the current call.

For example, from analysis of a large amount of offline data, we find that different network types, such as 2g, 3g, 4g and wifi, characterize different broad trends in network characteristics; for instance, a 2g network is more prone than a 4g network to large jitter caused by network congestion. For 2g, relative to 4g, we can therefore set larger AD_up and AD_dw at initialization; then, according to the historical data of the current call and the network parameters analyzed in step 1), the sizes of AD_up and AD_dw and the parameters F1 to F4 are adjusted for the applicable network model. Even networks of the same type, such as wifi, differ in their characteristics. For example, for a network similar to fig. 6, that is, a network whose overall jitter is small but which has many large burst jitters, we can set smaller AD_up and AD_dw to keep the overall end-to-end delay small; but when JB_len < AD_dw, we can adjust F3 and F4 to make the expansion policy more aggressive (expanding by a larger amount or copying more data at a time) and quicker to respond, so as to resist large burst jitter better and faster.
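The two-stage setting of AD_up and AD_dw (initialization by network type, then adjustment from call history) might be sketched as follows. All numeric values, the table `INITIAL_THRESHOLDS_MS`, the function names and the clamped scaling rule are hypothetical placeholders; the source states only that 2g should start with larger thresholds than 4g and that the thresholds are later adjusted from call history.

```python
# Hypothetical initial (AD_up, AD_dw) pairs in milliseconds per network type.
INITIAL_THRESHOLDS_MS = {
    "2g": (400.0, 200.0),
    "3g": (300.0, 150.0),
    "4g": (200.0, 100.0),
    "wifi": (160.0, 80.0),
}
DEFAULT_THRESHOLDS_MS = (300.0, 150.0)  # fallback for unknown types (assumption)


def initial_thresholds(network_type):
    """Return the initial (AD_up, AD_dw) for a network type such as '2g' or 'wifi'."""
    return INITIAL_THRESHOLDS_MS.get(network_type.lower(), DEFAULT_THRESHOLDS_MS)


def adjust_for_history(ad_up, ad_dw, recent_jitter_ms, expected_jitter_ms):
    """Scale both thresholds by observed/expected jitter, clamped to [0.5, 2.0].

    The scaling rule is an illustrative assumption, not the patented method.
    """
    ratio = recent_jitter_ms / expected_jitter_ms if expected_jitter_ms > 0 else 1.0
    scale = min(2.0, max(0.5, ratio))
    return ad_up * scale, ad_dw * scale
```

For instance, `adjust_for_history(*initial_thresholds("4g"), recent_jitter_ms=40, expected_jitter_ms=20)` doubles both thresholds when the call has shown twice the jitter expected for that network model.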

Adjusting the de-jitter parameters according to the signal content: the de-jitter parameters (that is, AD_up, AD_dw and F1 to F4) are adjusted according to the content (music, voice, etc.) and the degree of importance (mute, non-mute, etc.) of the current signal. For example, for music signals, larger AD_up and AD_dw are used as far as possible on the same network. The overall principle is as follows: de-jitter processing is performed as little as possible on important frames; when the length of the buffer is larger than AD_up, the adjustment can be slowed down somewhat and deferred to non-important frames; when the length of the buffer is smaller than AD_dw, the buffer needs to be adjusted as soon as possible to avoid stuttering. De-jitter processing is performed only when necessary, while preserving the auditory perception quality as far as possible.
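The overall principle above can be expressed as a small decision function. This is a sketch under stated assumptions: the function name `dejitter_action`, the boolean `frame_is_important` flag and the string return values are hypothetical, and how a frame is classified as important (for example, non-mute) is left to the caller.

```python
def dejitter_action(jb_len, ad_up, ad_dw, frame_is_important):
    """Decide what to do with the current frame.

    jb_len: current buffer length; ad_up / ad_dw: upper and lower thresholds,
    all in the same unit. Returns 'expand', 'compress' or 'none'.
    """
    if jb_len < ad_dw:
        # Buffer is running low: act immediately, even on an important frame,
        # because stuttering is worse than a stretched frame.
        return "expand"
    if jb_len > ad_up:
        # Buffer is too long: this can wait, so defer to a non-important frame.
        return "none" if frame_is_important else "compress"
    # Buffer is within the target range: leave the signal untouched.
    return "none"
```

The asymmetry is deliberate: a buffer above AD_up only costs delay, so compression can wait for a silent or otherwise non-important frame, while a buffer below AD_dw risks an audible gap and is handled at once.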

Adjusting the de-jitter parameters based on auditory perception: when the signal is expanded, compressed, or adjusted in duration, the adjustment frequency is controlled so that no fast-playback or slow-playback effect is perceptible to the listener.

Self-adaptation of the acquisition/playback device: in fig. 10, the speed at which packets are sent is not uniform or regular, because devices differ in processing power and applications differ in scheduling characteristics. The de-jitter module, however, is designed on the assumption that packets are sent at a uniform or regular rate. The uniformity of the sending rate is mainly determined by the acquisition mode of the sound card and the scheduling characteristics of the threads. For example, if the application is driven to encode/transmit by sound card callbacks, the interval between two sound card callbacks on Android devices is often uneven, and the poorer the performance of the device, the more pronounced this is. In this case, the application can be driven to encode/transmit by either a sound card callback or a timer callback, chosen according to the performance of the device, so that the sending interval of the packets is more uniform. Similarly, at the playing end, the pace at which the application exchanges data with the buffer is made as uniform as possible, so that the de-jitter module can work in its optimal state and the end-to-end delay is minimized. As for differences in thread scheduling: on the same device, an audio and video call involves video acquisition, encoding and decoding in addition to audio, and because the processing capacity of a handheld device is limited, thread scheduling is less uniform during an audio and video call than during a pure audio call. In this case, after the thread scheduling method has been fully optimized, the parameters of the de-jitter algorithm can be increased appropriately under the same network conditions to reduce stuttering.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: a mobile storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a RAM, a magnetic or optical disk, or various other media that can store program code.

The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims

1. An information processing method, characterized in that the method comprises:

acquiring offline network data, and respectively extracting network parameters corresponding to different network types from the offline network data;

respectively constructing corresponding network models according to the network parameters of the network types, and determining a first debouncing strategy according to the network models;

correcting the first de-jitter strategy according to the characteristic parameters for evaluating Voice over Internet Protocol (Voip) call quality to obtain a corrected first de-jitter strategy;

in the Voip call, according to the processing capacity of terminal equipment and the scheduling characteristic of the application serving as the Voip call medium, continuously correcting the corrected first jitter removal strategy to obtain a second jitter removal strategy;

2. The method according to claim 1, wherein the modifying the first de-jitter strategy according to the characteristic parameter for evaluating Voip call quality comprises:

acquiring historical data of the call;

3. The method according to claim 1, wherein the modifying the first de-jitter strategy according to the characteristic parameter for evaluating Voip call quality comprises:

acquiring the signal content of the call;

4. The method according to claim 1, wherein the modifying the first de-jitter strategy according to the characteristic parameter for evaluating Voip call quality comprises:

acquiring a perceptual auditory result of the call;

5. The method according to claim 1, wherein the continuing to modify the modified first de-jitter strategy according to the processing capability of the terminal device and the scheduling characteristics of the application as the Voip call medium comprises:

acquiring the processing capacity of terminal equipment and/or the scheduling characteristic of application serving as a Voip call medium when Voip call data of the call is acquired;

and correcting the corrected first jitter removal strategy according to the processing capacity of the terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium.

6. The method according to claim 1, wherein the continuing to modify the modified first de-jitter strategy according to the processing capability of the terminal device and the scheduling characteristics of the application as the Voip call medium comprises:

when Voip call data of the call is played, acquiring the processing capacity of terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium;

7. An information processing apparatus characterized in that the apparatus comprises:

the acquisition unit is used for acquiring offline network data and respectively extracting network parameters corresponding to different network types from the offline network data;

the strategy determining unit is used for respectively constructing corresponding network models according to the network parameters of the network types and determining a first debouncing strategy according to the network models;

the strategy correction unit is used for correcting the first jitter removal strategy according to the characteristic parameters for evaluating Voice over Internet Protocol (Voip) call quality to obtain a corrected first jitter removal strategy;

8. The apparatus of claim 7, wherein the policy modification unit is further configured to:

acquiring historical data of the call;

9. The apparatus of claim 7, wherein the policy modification unit is further configured to:

acquiring the signal content of the call;

10. The apparatus of claim 7, wherein the policy modification unit is further configured to:

acquiring a perceptual auditory result of the call;

11. The apparatus of claim 7, further comprising:

the call acquisition unit is used for acquiring Voip call data of the call;

the policy modification unit is further configured to:

when the collection of Voip call data of the call is triggered, acquiring the processing capacity of terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium;

12. The apparatus of claim 7, further comprising:

the call playing unit is used for playing Voip call data of the call;

the policy modification unit is further configured to:

when the Voip call data of the call is triggered to be played, acquiring the processing capacity of terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium;

13. An information processing method, characterized in that the method comprises:

acquiring offline network data, and respectively extracting network parameters corresponding to different network types from the offline network data, wherein the network parameters of different network types are used for constructing corresponding network models, and the network models are used for determining a first jitter removal strategy for transmitting Voice over Internet Protocol (Voip) call data;

14. A transmitting end of information processing, the transmitting end comprising:

the collecting unit is used for collecting offline network data and respectively extracting network parameters corresponding to different network types from the offline network data, wherein the network parameters of different network types are used for constructing corresponding network models, and the network models are used for determining a first jitter removal strategy for transmitting Voip call data;

the first strategy correction unit is used for correcting the first de-jitter strategy according to the characteristic parameters for evaluating Voice over Internet Protocol (Voip) call quality to obtain a corrected first de-jitter strategy;

the call acquisition unit is used for acquiring the Voip call data of the call, and acquiring the processing capacity of terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium;

and the second strategy correcting unit is used for correcting the corrected first jitter removing strategy according to the processing capacity of the terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium.

15. An information processing method, characterized in that the method comprises:

constructing a corresponding network model according to network parameters corresponding to different network types, and determining a first debouncing strategy according to the network model, wherein the at least one network parameter is extracted from acquired offline network data;

in the Voip call, the corrected first jitter removal strategy is continuously corrected according to the processing capacity of terminal equipment and the scheduling characteristic of the application serving as the Voip call medium to obtain a second jitter removal strategy;

16. The method according to claim 15, wherein the modifying the first de-jitter strategy according to the characteristic parameter for evaluating the Voip call quality comprises:

acquiring historical data of the call;

17. The method according to claim 15, wherein the modifying the first de-jitter strategy according to the characteristic parameter for evaluating the Voip call quality comprises:

acquiring the signal content of the call;

18. The method according to claim 15, wherein the modifying the first de-jitter strategy according to the characteristic parameter for evaluating the Voip call quality comprises:

acquiring a perceptual auditory result of the call;

19. A dejittering end for information processing, characterized in that the dejittering end comprises:

the strategy determining unit is used for constructing corresponding network models according to network parameters corresponding to different network types, and determining a first debouncing strategy according to the network models, wherein the at least one network parameter is extracted from acquired offline network data;

the third strategy correction unit is used for correcting the first jitter removal strategy according to the characteristic parameters for evaluating Voice over Internet Protocol (Voip) call quality to obtain a corrected first jitter removal strategy;

20. The dejittering end according to claim 19, wherein the third strategy modification unit is further configured to:

acquiring historical data of the call;

21. The dejittering end according to claim 19, wherein the third strategy modification unit is further configured to:

acquiring the signal content of the call;

22. The dejittering end according to claim 19, wherein the third strategy modification unit is further configured to:

acquiring a perceptual auditory result of the call;

23. An information processing method, characterized in that the method comprises:

acquiring a first jitter removal strategy determined for transmitting Voice over Internet Protocol (Voip) call data, wherein the first jitter removal strategy is obtained according to corresponding network models constructed by network parameters corresponding to different network types, and the at least one network parameter is extracted from acquired offline network data;

obtaining a corrected first jitter removal strategy obtained by correcting the first jitter removal strategy according to characteristic parameters for evaluating Voice over Internet Protocol (Voip) call quality;

24. A receiving end for information processing, the receiving end comprising:

the acquisition unit is used for acquiring a first jitter removal strategy determined for transmitting Voice over Internet Protocol (Voip) call data, wherein the first jitter removal strategy is obtained according to corresponding network models constructed by network parameters corresponding to different network types, and the at least one network parameter is extracted from acquired offline network data;

the call playing unit is used for acquiring the processing capacity of the terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium when playing the Voip call data of the call;

and the third strategy correcting unit is used for correcting the corrected first jitter removing strategy according to the processing capacity of the terminal equipment and/or the scheduling characteristic of the application serving as the Voip call medium.

25. A computer-readable storage medium storing executable instructions for implementing the information processing method according to any one of claims 1 to 6, 13, 15 to 18, and 23 when executed by a processor.