CN117692366A - Call processing method and device, electronic equipment and storage medium - Google Patents

Publication number
CN117692366A
Authority
CN
China
Prior art keywords
length
voice data
delay
call
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211078726.8A
Other languages
Chinese (zh)
Inventor
史彦斌
邹文波
杨砚
杨峰
吴海英
蒋宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Xiaofei Finance Co Ltd
Original Assignee
Mashang Xiaofei Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Xiaofei Finance Co Ltd
Priority to CN202211078726.8A
Publication of CN117692366A

Abstract

The disclosure provides a call processing method, a call processing device, electronic equipment and a storage medium, wherein the call processing method comprises the following steps: acquiring a plurality of voice data packets transmitted in a historical reference period corresponding to the current talk time; calculating the transmission delay time length of each voice data packet to obtain delay distribution data corresponding to the plurality of voice data packets; determining an estimated delay time length corresponding to a preset estimated delay hit rate, and calculating a buffer length corresponding to the estimated delay time length; dynamically adjusting the call buffer to a length matched with the buffer length; the call buffer is used for buffering the voice data packets received in the current call process. In this embodiment, a call buffer whose length is dynamically adjustable according to the delay distribution of the voice data packets is provided, so that the number of voice data packets buffered in the call buffer can meet the call quality requirement.

Description

Call processing method and device, electronic equipment and storage medium
Technical Field
The disclosure relates to the field of communication technologies, and in particular, to a call processing method, a call processing device, an electronic device and a storage medium.
Background
At present, with the continuous development of Internet technology, network bandwidth keeps increasing and real-time voice technology is widely used in daily life. With voice services carried over the Internet, users can access the Internet through various networks such as 2G/3G/4G/Wi-Fi and communicate over the network.
However, as the network environments that users access become more and more complex, interference factors in the network, such as delay and jitter, can greatly affect voice quality, so voice quality is receiving more and more attention. How to improve voice quality during a call is therefore a problem to be solved.
Disclosure of Invention
The disclosure provides a call processing method, a call processing device, electronic equipment and a storage medium, which are used for solving the problem that the existing call processing mode is easily affected by delay or jitter.
In a first aspect, the present disclosure provides a call processing method, including the steps of:
acquiring a plurality of voice data packets transmitted in a historical reference period corresponding to the current conversation time;
calculating the transmission delay time length of each voice data packet, and obtaining delay distribution data corresponding to the voice data packets according to the transmission delay time length of each voice data packet;
determining an estimated delay time length corresponding to a preset estimated delay hit rate according to the delay distribution data, and calculating a buffer length corresponding to the estimated delay time length;
dynamically adjusting a call buffer area to be a length matched with the buffer length; the call buffer area is used for buffering the voice data packet received in the current call process.
In a second aspect, the present disclosure provides a call processing apparatus, including:
the acquisition module is suitable for acquiring a plurality of voice data packets transmitted in a historical reference period corresponding to the current conversation time;
the calculation module is suitable for calculating the transmission delay time length of each voice data packet and obtaining delay distribution data corresponding to the voice data packets according to the transmission delay time length of each voice data packet;
the determining module is suitable for determining estimated delay time length corresponding to a preset estimated delay hit rate according to the delay distribution data, and calculating the buffer memory length corresponding to the estimated delay time length;
the adjusting module is suitable for dynamically adjusting the call buffer area to be the length matched with the buffer length; the call buffer area is used for buffering the voice data packet received in the current call process.
In a third aspect, the present disclosure provides an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores one or more computer programs executable by the at least one processor, one or more of the computer programs being executable by the at least one processor to enable the at least one processor to perform the above-described method.
In a fourth aspect, the present disclosure provides a computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor/processing core implements the above-described method.
According to the embodiment provided by the disclosure, the estimated delay time length corresponding to the preset estimated delay hit rate can be determined according to the delay distribution data corresponding to the voice data packets, so that the buffer memory length corresponding to the estimated delay time length is calculated, and the call buffer memory area is dynamically adjusted to be matched with the buffer memory length. Therefore, in this embodiment, a call buffer area with a buffer length dynamically adjustable according to the delay distribution of the voice data packets is provided, and since the length of the call buffer area dynamically changes according to the delay distribution data of the voice data packets in the historical reference period, the number of the voice data packets buffered in the call buffer area can be ensured to meet the call quality requirement. For example, under the condition of low call delay, the length of a call buffer area is reduced, so that the real-time performance of the call is improved; under the condition of higher call delay, the length of a call buffer area is increased, so that the problem of call quality caused by packet loss is avoided. Therefore, the method for dynamically adjusting the length of the call buffer area can give consideration to the reliability and the instantaneity of the call.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings are included to provide a further understanding of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. The above and other features and advantages will become more readily apparent to those skilled in the art by describing in detail exemplary embodiments with reference to the attached drawings, in which:
fig. 1 is a flowchart of a call processing method according to an embodiment of the present disclosure;
fig. 2 is a flowchart of a call processing method according to another embodiment of the present disclosure;
fig. 3 is a block diagram of a call processing apparatus according to an embodiment of the present disclosure;
fig. 4 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
For a better understanding of the technical solutions of the present disclosure, exemplary embodiments of the present disclosure will be described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding, and they should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Embodiments of the disclosure and features of embodiments may be combined with each other without conflict.
As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and the present disclosure, and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The call processing method according to the embodiment of the present disclosure may be performed by an electronic device such as a terminal device or a server. The terminal device may be a vehicle-mounted device, a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a wearable device, or the like; the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The method may in particular be implemented by a processor calling a computer program stored in a memory.
In the related art, a common way to reduce the effect of network jitter and improve voice quality is to add a jitter buffer at the receiving end, that is, to place a fixed-length buffer before decoding and playback. The buffer holds voice data packets for a certain time and eliminates network jitter at the cost of some added delay. This algorithm is simple to implement and has low complexity. However, in the course of research and practice on the related art, the inventors found that a fixed-length jitter buffer has two drawbacks. If the buffer is smaller than the jitter, data is lost, which distorts the voice and reduces voice quality. For example, with a small buffer, packets with a slightly longer delay are lost because they cannot be buffered, so the voice data is transmitted incompletely and voice quality is poor. If the buffer is set larger, a large call delay is introduced, which also degrades call quality and can even disrupt normal real-time calls. For example, with a large buffer, packets with a slightly longer delay can be buffered reliably, but the receiving end only receives the voice content after the buffer has filled, so the voice heard at the receiving end lags noticeably and the real-time performance of the call suffers. To solve these problems, the present application proposes a call processing method that dynamically adjusts the size of the buffer: the buffer length is adjusted according to delay distribution data, which avoids the problems above.
Fig. 1 is a flowchart of a call processing method according to an embodiment of the present disclosure. Referring to fig. 1, the method includes:
step S110: a plurality of voice data packets transmitted in a historical reference period corresponding to a current talk time are acquired.
Step S110 may be performed periodically, for example once every preset time period; alternatively, step S110 may be performed at the initial stage of each call process. In short, the present application does not limit the triggering method or the triggering period of step S110.
Wherein, the historical reference period corresponding to the current talk time refers to: a preset period of time that is located before the current talk time and that can provide a reference for the current talk process. The longer the duration of the historical reference period, the higher the reference accuracy; the closer the historical reference period is to the current talk time, the higher the reference value. The duration of the historical reference period can be flexibly set by a person skilled in the art according to service requirements. In the historical reference period, a plurality of voice data packets are transmitted, and correspondingly, in the step, a plurality of voice data packets transmitted in the historical reference period corresponding to the current conversation time are acquired.
Step S120: calculating the transmission delay time length of each voice data packet, and obtaining delay distribution data corresponding to the plurality of voice data packets according to the transmission delay time length of each voice data packet.
Wherein the transmission delay time of each voice data packet is determined according to the interval between the receiving time and the transmitting time of the voice data packet. The transmission time of the voice data packet can be obtained based on the time stamp information contained therein. After the transmission delay time length of each voice data packet is obtained, the distribution situation of the transmission delay time length of each voice data packet is counted, and delay distribution data corresponding to a plurality of voice data packets is obtained.
The delay distribution data is used for reflecting the distribution condition of the transmission delay time length of a plurality of voice data packets. For example, the delay profile data is used to reflect the occurrence probabilities of various transmission delay durations. The delay profile data may be generated in various ways by those skilled in the art, for example, by a function operation method, a curve drawing method, etc., and the present invention is not limited to the generation method and specific form of the delay profile data.
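As a minimal sketch of step S120, the following Python snippet derives per-packet transmission delays from timestamps and summarizes them as an empirical histogram. The `VoicePacket` type, its field names, the 10 ms bin width and the sample values are hypothetical, and the sketch assumes that sender and receiver clocks are synchronized; the disclosure does not prescribe any particular form for the delay distribution data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoicePacket:
    send_ts_ms: float  # sending time read from the packet's timestamp information
    recv_ts_ms: float  # time at which the receiver got the packet


def transmission_delays(packets):
    """Transmission delay of each packet: receiving time minus sending time (ms)."""
    return [p.recv_ts_ms - p.send_ts_ms for p in packets]


def delay_histogram(delays, bin_ms=10.0):
    """Map each bin start (ms) to the fraction of packets whose delay falls in that bin."""
    if not delays:
        return {}
    counts = {}
    for d in delays:
        start = (d // bin_ms) * bin_ms
        counts[start] = counts.get(start, 0) + 1
    return {start: n / len(delays) for start, n in sorted(counts.items())}


# Hypothetical packets from a historical reference period.
packets = [VoicePacket(0, 42), VoicePacket(20, 65), VoicePacket(40, 88), VoicePacket(60, 131)]
delays = transmission_delays(packets)   # [42, 45, 48, 71]
histogram = delay_histogram(delays)     # {40.0: 0.75, 70.0: 0.25}
```

A histogram is only one possible representation; a fitted curve would serve the same purpose.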
Step S130: determining the estimated delay time length corresponding to the preset estimated delay hit rate according to the delay distribution data, and calculating the buffer length corresponding to the estimated delay time length.
Wherein, the delay hit rate means: probability that a voice packet can be effectively buffered by the talk buffer. The delay hit rate is determined mainly by the following two factors: on one hand, the delay hit rate is influenced by the transmission delay time length of the voice data packet, and the voice data packet with longer transmission delay time length is more likely to be lost due to insufficient buffer length of a call buffer area; on the other hand, the delay hit rate is affected by the buffer length of the call buffer, and the shorter the buffer length of the call buffer is, the more easily a large number of voice data packets are lost. It can be seen that the delay hit rate is determined by the transmission delay time of the voice data packet and the buffer length of the call buffer. Since the transmission delay time length of each voice data packet is different, the delay hit rate is a continuously variable value.
As the name suggests, the estimated delay hit rate is an estimated value of the delay hit rate. Specifically, the estimated delay hit rate can be set flexibly according to service requirements. Once the estimated delay hit rate is preset, the estimated delay time length corresponding to it can be determined according to the delay distribution data. The delay distribution data reflects the probability distribution of the transmission delay time lengths of the voice data packets in the historical reference period. Therefore, assuming that the transmission delay of voice data packets in the subsequent call process is the same as in the historical reference period, the estimated delay time length corresponding to the estimated delay hit rate can be determined from the delay distribution data. The estimated delay time length refers to the estimated value of the transmission delay time length of a voice data packet under the condition that the probability that the packet can be effectively buffered by the call buffer equals the estimated delay hit rate. Accordingly, the buffer length corresponding to the estimated delay time length can be calculated. The larger the estimated delay time length, the larger the buffer length; the smaller the estimated delay time length, the smaller the buffer length.
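One way to read step S130 when the delay distribution data is kept as raw samples is as an empirical quantile: the estimated delay time length is the smallest observed delay that covers at least the preset share of packets. The sketch below is an illustration under that assumption, not the method of the disclosure; the function name and the sample delays are hypothetical.

```python
import math


def estimated_delay_empirical(delays, hit_rate):
    """Smallest observed delay d such that at least `hit_rate` of the samples are <= d."""
    if not delays:
        raise ValueError("no delay samples in the historical reference period")
    if not 0.0 < hit_rate <= 1.0:
        raise ValueError("hit_rate must be in (0, 1]")
    ordered = sorted(delays)
    k = math.ceil(hit_rate * len(ordered))  # number of samples that must be covered
    return ordered[k - 1]


history = [42, 45, 48, 71]                               # delays in ms
t_relaxed = estimated_delay_empirical(history, 0.5)      # covers half of the packets
t_strict = estimated_delay_empirical(history, 0.95)      # covers (almost) all packets
```

A higher hit rate never yields a smaller estimated delay, which matches the statement that a larger estimated delay time length leads to a larger buffer length.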
Step S140: dynamically adjusting the call buffer area to be a length matched with the buffer length; the call buffer area is used for buffering the voice data packet received in the current call process.
The call buffer area is used for buffering the voice data packets received in the current call process, and the voice data packets in the call buffer area are sequentially subjected to subsequent processing under the condition that the residual buffer space of the call buffer area is not larger than a preset lower limit threshold value. The preset lower threshold may be flexibly set, for example, may be set to 0. The subsequent processing performed on the voice data packet may include: transmitting the voice data packet to a receiving end for processing by the receiving end; or, decoding, playing, etc. are performed with respect to the voice data packet.
In the embodiment provided by the disclosure, the estimated delay time length corresponding to the preset estimated delay hit rate can be determined according to the delay distribution data corresponding to the plurality of voice data packets, so that the buffer memory length corresponding to the estimated delay time length is calculated, and the call buffer memory area is dynamically adjusted to be a length matched with the buffer memory length. Therefore, in this embodiment, a call buffer area with a buffer length dynamically adjustable according to the delay distribution of the voice data packets is provided, and since the length of the call buffer area dynamically changes according to the delay distribution data of the voice data packets in the historical reference period, the number of the voice data packets buffered in the call buffer area can be ensured to meet the call quality requirement.
Fig. 2 is a flowchart of a call processing method according to another embodiment of the present disclosure. Referring to fig. 2, the method includes:
step S210: a plurality of voice data packets transmitted in a historical reference period corresponding to a current talk time are acquired.
Wherein the historical reference period corresponding to the current talk time may be determined by at least one of:
in a first implementation, the first N time periods corresponding to the current talk time are taken as historical reference periods; the length of the time period is determined according to the processing performance of the equipment and/or the network transmission speed; wherein N is a natural number. In this manner, the time period is set in advance, and for example, the time period may be set to various lengths of 10 minutes, 1 hour, one day, one week, or the like. Accordingly, at least one time period before the current talk time is taken as a historical reference period. The larger the value of N is, the longer the length of the historical reference period is, and the higher the calculation accuracy is, but the calculation amount carried by the server is also increased, so that the duration and the number of the time periods can be comprehensively set according to factors such as the processing performance of the equipment. In addition, in the case where N is greater than 1, different weights may be set for different time periods according to the order of the respective time periods included in the history reference period. For example, the closer the time period from the current talk time is, the higher the weight of the time period; the more distant time periods from the current talk time are weighted lower.
In a second implementation, the talk time of the M call processes preceding the current call process is taken as the historical reference period, where M is a natural number. In this manner, statistics are collected per call process rather than per time period. As in the first implementation, the larger the value of M, the longer the historical reference period and the higher the calculation accuracy, but the computational load on the server also increases. In addition, when M is greater than 1, different weights may be set for different call processes according to the order of the call processes included in the historical reference period. For example, the closer a call process is to the current talk time, the higher its weight; the farther a call process is from the current talk time, the lower its weight.
Step S220: calculating the transmission delay time length of each voice data packet, and obtaining delay distribution data corresponding to the plurality of voice data packets according to the transmission delay time length of each voice data packet.
When calculating the transmission delay time of each voice data packet, firstly, acquiring time stamp information in each voice data packet, and determining the transmission time of the voice data packet according to the time stamp information; and then, obtaining the transmission delay time of each voice data packet according to the difference value between the receiving time and the sending time of each voice data packet.
In this embodiment, the delay distribution data is a delay probability distribution curve. Accordingly, the delay distribution data corresponding to the plurality of voice data packets is obtained from the transmission delay time length of each voice data packet as follows: first, a first delay parameter and a second delay parameter corresponding to the plurality of voice data packets are calculated according to the transmission delay time length of each voice data packet; then, a delay probability distribution curve corresponding to the plurality of voice data packets is generated according to the first delay parameter and the second delay parameter. The first delay parameter (also called the mean parameter) reflects the average of the transmission delay time lengths of the voice data packets and is specifically the delay mean (delay average value). The second delay parameter (also called the deviation parameter) reflects how far the transmission delay time lengths of the voice data packets deviate from the average and is specifically the delay variance or the delay standard deviation. Accordingly, the delay probability distribution curve may be a normal distribution curve.
For example, in one implementation, a delay average value and a delay standard deviation of each voice data packet are calculated according to a transmission delay duration of each voice data packet. Accordingly, a normal distribution curve for reflecting the delay distribution probability can be drawn based on the delay average value and the delay standard deviation.
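The calculation described above can be sketched with Python's standard `statistics` module. The function name and sample delays are hypothetical, and the choice of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the text does not say which one is intended.

```python
from statistics import NormalDist, fmean, pstdev


def fit_delay_distribution(delays):
    """Build a normal curve from the delay average value and delay standard deviation."""
    return NormalDist(mu=fmean(delays), sigma=pstdev(delays))


curve = fit_delay_distribution([40.0, 50.0, 60.0])  # delays in ms
share_below_60 = curve.cdf(60.0)  # modeled probability that a delay is at most 60 ms
```

Note that `cdf` requires a non-zero standard deviation, so a reference period in which every packet has the same delay would need separate handling.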
In addition, in the case where the historical reference period includes the first N time periods (N greater than 1) corresponding to the current talk time, the delay period average value and the delay period standard deviation of the voice data packets may also be calculated separately for each time period included in the historical reference period. Then, different weights are set for the time periods according to how far each time period is from the current talk time. Accordingly, the delay average value and the delay standard deviation of the plurality of voice data packets in the historical reference period are calculated based on the weight of each time period and on the delay period average value and the delay period standard deviation of the voice data packets in each time period.
In this aspect, the delay average value and the delay standard deviation of the plurality of voice data packets in the history reference period are calculated from the delay period average value and the delay period standard deviation of the plurality of voice data packets in each time period, and the delay average value and the delay standard deviation obtained finally are not equal to the delay average value and the delay standard deviation of the plurality of voice data packets in the history reference period calculated directly because the weights of the respective time periods are different. For example, it is assumed that the historical reference period includes three time periods in total, and the time sequence includes a first time period, a second time period and a third time period, wherein the first time period includes 5 voice data packets, the second time period includes 6 voice data packets, and the third time period includes 4 voice data packets. Correspondingly, firstly, calculating a corresponding delay period average value and a delay period standard deviation aiming at the voice data packet of each time period; then, according to the weight of each time period, carrying out weighted operation on the delay period average value of each time period to obtain the delay average value of the historical reference period; and carrying out weighted operation on the standard deviation of the delay period of each time period according to the weight of each time period to obtain the standard deviation of the delay of the historical reference period. The weight of each time period may be set according to various factors such as the time sequence of the time period, the number of data packets, and the like.
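The weighted operation described above can be sketched as a weighted average of the per-period statistics. The weights, the per-period values and the function name below are hypothetical, and normalizing by the sum of the weights is an assumption; the disclosure leaves the exact weighting scheme open.

```python
def weighted_period_stats(period_stats, weights):
    """Combine per-period (mean, std) pairs into one (mean, std) using period weights."""
    if len(period_stats) != len(weights) or not weights:
        raise ValueError("need one weight per time period")
    total = sum(weights)
    mean = sum(w * m for w, (m, _) in zip(weights, period_stats)) / total
    std = sum(w * s for w, (_, s) in zip(weights, period_stats)) / total
    return mean, std


# (delay period average value, delay period standard deviation) in ms, oldest period first.
stats = [(40.0, 5.0), (50.0, 10.0), (60.0, 15.0)]
weights = [1, 2, 3]  # more recent periods count more
delay_mean, delay_std = weighted_period_stats(stats, weights)
```

With these weights the result leans toward the most recent period, so it differs from statistics computed directly over all packets in the reference period.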
Step S230: determining the estimated delay time length corresponding to the preset estimated delay hit rate according to the delay distribution data.
The estimated delay hit rate is an estimated value of the delay hit rate. Specifically, it can be set flexibly according to service requirements. For example, the estimated delay hit rate may be set based on the call service type and/or the real-time sensitivity. The call service type represents the importance of the call service: the more important the call service, the higher the corresponding estimated delay hit rate. Specifically, the call service types may be divided into a first service type, a second service type, and a third service type, where the estimated delay hit rate of the first service type is greater than that of the second service type, and that of the second service type is greater than that of the third service type. For example, the first service type is related to payment; because users have high real-time requirements for payment, the estimated delay hit rate is set high to prevent packet loss. The second service type is related to calls; the real-time requirement of a call is lower than that of the payment service, so the estimated delay hit rate is set slightly lower. The third service type covers entertainment services such as news and on-demand content; its real-time requirement is the lowest, so the estimated delay hit rate is set lowest. The real-time sensitivity represents how sensitive a call user is to call delay and real-time performance: the higher the real-time sensitivity, the higher the user's real-time requirement and, correspondingly, the higher the estimated delay hit rate that is set. The real-time sensitivity can be set according to the service type. In this embodiment, the delay distribution data is a normal distribution curve that reflects the delay distribution probability.
Assuming that the transmission delay time length is represented by the variable t and the normal distribution curve is $F_1(t)$, the estimated delay hit rate b may be determined by:
$$b = \int_{-\infty}^{t_1} F_1(t)\,dt$$
where $t_1$ is the estimated delay time length. Therefore, when the normal distribution curve is known, the estimated delay hit rate b and the estimated delay time length $t_1$ have a definite functional relation, so that once the estimated delay hit rate is set, the value of the estimated delay time length $t_1$ can also be calculated.
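Under the assumption that b is the cumulative probability of the normal curve up to t1 (the probability that a packet's delay does not exceed t1), solving for t1 is an inverse-CDF lookup. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; the function name, the clamp at zero and the numeric inputs are hypothetical.

```python
from statistics import NormalDist


def estimated_delay_from_normal(mean_ms, std_ms, hit_rate):
    """Delay t1 such that the normal curve puts probability `hit_rate` on delays <= t1."""
    if not 0.0 < hit_rate < 1.0:
        raise ValueError("hit_rate must be strictly between 0 and 1")
    t1 = NormalDist(mu=mean_ms, sigma=std_ms).inv_cdf(hit_rate)
    return max(t1, 0.0)  # a delay cannot be negative, so clamp the model output


t1 = estimated_delay_from_normal(mean_ms=50.0, std_ms=10.0, hit_rate=0.95)
# t1 is roughly 66.4 ms: about 1.645 standard deviations above the mean
```

Raising the hit rate moves t1 further into the tail of the curve, which is why stricter services end up with longer buffers.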
Step S240: calculating the buffer length corresponding to the estimated delay time length.
Specifically, a voice coding mode of the current call is obtained; determining the voice coding length in unit time according to the voice coding mode; and obtaining the buffer memory length corresponding to the estimated delay time length according to the product of the speech coding length in unit time and the estimated delay time length.
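The product described above can be sketched as follows. The codec table is illustrative: it lists only the nominal payload rates of G.711 (64 kbit/s) and G.729 (8 kbit/s), and the function name, the byte unit and the rounding up to whole bytes are assumptions that are not taken from the disclosure.

```python
import math

# Nominal encoded length per second for a few codecs, in bytes (illustrative table).
BYTES_PER_SECOND = {
    "G.711": 8000,  # 64 kbit/s
    "G.729": 1000,  # 8 kbit/s
}


def buffer_length_bytes(codec, estimated_delay_ms):
    """Buffer length = encoded length per unit time x estimated delay time length."""
    rate = BYTES_PER_SECOND[codec]
    return math.ceil(rate * estimated_delay_ms / 1000)


g711_len = buffer_length_bytes("G.711", 66)  # 8000 bytes/s over 66 ms
g729_len = buffer_length_bytes("G.729", 66)  # 1000 bytes/s over 66 ms
```

The same estimated delay therefore maps to different buffer lengths depending on the codec in use, which is why the voice coding mode of the current call is looked up first.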
Step S250: dynamically adjusting the call buffer area to be a length matched with the buffer length; the call buffer area is used for buffering the voice data packet received in the current call process.
The call buffer buffers the voice data packets received during the current call. When the remaining space of the call buffer is not larger than a preset lower threshold, the voice data packets in the call buffer are passed on, in order, for subsequent processing. The preset lower threshold can be set flexibly, for example to 0.
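A minimal sketch of this release rule follows. It assumes that capacity is counted in packets and that all buffered packets are released at once when the threshold is reached; both are simplifications made for illustration, and the class name `CallBuffer` is hypothetical.

```python
from collections import deque


class CallBuffer:
    """Holds received packets until the remaining space reaches the lower threshold."""

    def __init__(self, capacity: int, lower_threshold: int = 0):
        self.capacity = capacity                # assumed to be counted in packets
        self.lower_threshold = lower_threshold
        self._packets = deque()

    def remaining(self) -> int:
        return self.capacity - len(self._packets)

    def push(self, packet: bytes) -> list:
        """Buffer one packet and return the packets released for processing, oldest first."""
        self._packets.append(packet)
        released = []
        if self.remaining() <= self.lower_threshold:
            while self._packets:
                released.append(self._packets.popleft())
        return released


buf = CallBuffer(capacity=3)
first = buf.push(b"a")    # buffer not yet full, nothing released
second = buf.push(b"b")   # still nothing released
third = buf.push(b"c")    # remaining space is 0, so all three are released in order
```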
Specifically, the current length of the call buffer is obtained, and it is judged whether the absolute value of the difference between the current length and the buffer length is larger than a preset adjustment threshold. If so, the call buffer is dynamically shrunk to a length matching the buffer length when the current length is larger than the buffer length, and dynamically expanded to a length matching the buffer length when the current length is smaller than the buffer length. The preset adjustment threshold serves to avoid frequent adjustment of the call buffer, and its specific value can be set according to service requirements. The call buffer can be implemented as a buffer queue; correspondingly, the dynamic expansion and shrinking of the call buffer can be implemented by adjusting the number of queue elements contained in the buffer queue.
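The threshold check and queue resizing can be sketched as below, assuming the buffer queue is a bounded `collections.deque` whose `maxlen` plays the role of the current length. The function name `resize_call_buffer` is hypothetical. Because `maxlen` is read-only, the sketch builds a new queue; when shrinking, a bounded deque keeps only the most recent elements, which a real gateway might handle differently.

```python
from collections import deque


def resize_call_buffer(queue: deque, target_length: int, adjust_threshold: int) -> deque:
    """Return a queue whose capacity matches target_length, or the same queue
    if the difference does not exceed the adjustment threshold."""
    current_length = queue.maxlen  # assumes a bounded deque
    if abs(current_length - target_length) <= adjust_threshold:
        return queue  # difference too small: avoid frequent adjustment
    # Expanding or shrinking both amount to rebuilding with the new capacity.
    return deque(queue, maxlen=target_length)


q = deque([1, 2, 3], maxlen=10)
unchanged = resize_call_buffer(q, target_length=12, adjust_threshold=5)
expanded = resize_call_buffer(q, target_length=20, adjust_threshold=5)
shrunk = resize_call_buffer(q, target_length=2, adjust_threshold=5)
```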
The method in the embodiment can be applied to various network devices such as voice media gateways and the like; the voice data packet is transmitted to the voice media gateway by the transmitting end through a voice transmission mode based on IP; and the call buffer is arranged in the voice media gateway.
As can be seen from the above description, in the above embodiment, the call buffer with a buffer length dynamically adjustable according to the delay distribution of the voice data packets is provided, and since the length of the call buffer dynamically changes according to the delay distribution data of the voice data packets in the historical reference period, the number of the voice data packets buffered in the call buffer can be ensured to meet the call quality requirement. In addition, the method can accurately estimate the estimated delay time length corresponding to the estimated delay hit rate through the normal distribution curve, and can more accurately calculate the buffer length by combining the voice coding length in unit time corresponding to the voice coding method of the current call.
In the following, the implementation details of the above embodiment are described by way of a specific example to aid understanding.
The call process in this example is implemented based on VoIP (Voice over Internet Protocol) technology and RTP (Real-time Transport Protocol). VoIP is a voice call technology that carries voice calls and multimedia conferences over the Internet Protocol, that is, over the Internet. RTP is a network transport protocol that provides end-to-end delivery services for data with real-time characteristics, such as interactive audio and video or simulation data, over multicast or unicast network services. Applications typically run RTP on top of UDP to make use of its multiplexing and checksum services; both protocols contribute parts of the transport-layer functionality. RTP may also be used with other suitable underlying network or transport protocols. If the underlying network supports multicast distribution, RTP can deliver data to multiple destinations.
During a call, voice jitter, defined as the variation in the delay of received data packets, typically occurs. At the transmitting end, data packets are sent as a continuous stream at uniform intervals. Network congestion, improper queuing, configuration errors, and similar causes can disturb this steady stream, so that the delay between packets becomes unstable. In VoIP technology, voice jitter therefore refers to the variation in the delay with which voice data packets are received, and it affects voice quality and the transmission of voice data. Voice network communication is now a common mode of communication: the voice exchanged by the two parties is encoded from an analog signal into a digital signal and transmitted over an IP network to the opposite end, which decodes and plays it on receipt. Such IP-based voice calls are called VoIP. VoIP voice transmission uses RTP, a protocol for transmitting data in real time, to carry the encoded voice data. While network-based VoIP voice transmission is convenient, it also suffers from network congestion, unstable transmission lines, network delay, and similar problems, which degrade voice call quality.
To address the degradation of call quality caused by the transmission line, in the related art the voice data packets transmitted from the opposite end are not used directly; they are first stored in a call buffer and processed only after the call buffer is full, so that smooth, normal voice is output. Although a call buffer alleviates voice jitter, it has a drawback: network jitter is random, and the delay is not a fixed number of milliseconds. If the call buffer is set too small and the voice delay is large, voice quality is still poor; if the call buffer is set too large, the delay of subsequent voice processing becomes severe.
In this example, a curve is fitted to the delays observed in historical calls to obtain a normal distribution curve of the historical voice delay, and the size of the current call buffer is then estimated according to the estimated delay hit rate. This example is implemented as follows:
First, a statistical period is set.
The statistical period may be represented by a time T and may be adjusted dynamically, for example to the last 10 minutes, 1 hour, 1 day, or 1 week. The longer the statistical period, the larger the data volume and the better the fit of the historical curve, but also the larger the computational load on the server, so the period can be adjusted dynamically according to machine performance.
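One way to keep only the samples that fall inside the statistical period T is a sliding window, sketched below. The class name `DelayWindow` and the use of receive time in seconds are assumptions made for illustration.

```python
from collections import deque


class DelayWindow:
    """Keeps delay samples whose receive time lies within the last period_s seconds."""

    def __init__(self, period_s: float):
        self.period_s = period_s
        self._samples = deque()  # (receive_time_s, delay_ms), oldest first

    def add(self, receive_time_s: float, delay_ms: float) -> None:
        self._samples.append((receive_time_s, delay_ms))
        cutoff = receive_time_s - self.period_s
        # Drop samples that have fallen out of the statistical period.
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def delays(self) -> list:
        return [delay for _, delay in self._samples]


window = DelayWindow(period_s=600.0)  # the last 10 minutes
window.add(0.0, 10.0)
window.add(300.0, 20.0)
window.add(700.0, 30.0)  # the sample received at t = 0 s is now older than 600 s
```

A shorter period keeps the window small and cheap to process; a longer one gives the curve fit more data, matching the trade-off described above.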
Then, the transmission delay time length of each voice data packet in the statistical period is calculated.
VoIP transmits voice over RTP. Each RTP voice data packet carries a timestamp; by correlating this timestamp with RTCP, the sending time of the current voice data packet can be calculated, and subtracting the sending time from the receiving time yields the delay duration of the current voice data packet.
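A sketch of this calculation follows. It assumes that an RTCP sender report has supplied a pair of values describing the same instant (a wall-clock time and an RTP timestamp), that the RTP clock runs at 8000 Hz as is usual for narrowband audio, and that sender and receiver clocks are synchronized. The function names are hypothetical.

```python
def rtp_send_time(rtp_ts: int, sr_rtp_ts: int, sr_wallclock_s: float,
                  clock_rate_hz: int = 8000) -> float:
    """Map a packet's RTP timestamp to wall-clock send time using a sender-report pair."""
    # RTP timestamps are 32-bit and wrap around; take the signed difference.
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr_wallclock_s + diff / clock_rate_hz


def transmission_delay_ms(receive_time_s: float, rtp_ts: int, sr_rtp_ts: int,
                          sr_wallclock_s: float, clock_rate_hz: int = 8000) -> float:
    """Delay = receive time minus send time, in milliseconds."""
    send_time_s = rtp_send_time(rtp_ts, sr_rtp_ts, sr_wallclock_s, clock_rate_hz)
    return (receive_time_s - send_time_s) * 1000.0


# Hypothetical values: the sender report maps RTP timestamp 8000 to wall-clock 100.0 s.
send_s = rtp_send_time(rtp_ts=16000, sr_rtp_ts=8000, sr_wallclock_s=100.0)
delay = transmission_delay_ms(101.05, rtp_ts=16000, sr_rtp_ts=8000, sr_wallclock_s=100.0)
```

If the two clocks are not synchronized, the result contains an unknown constant offset, which shifts the mean of the delay distribution but not its variance.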
Calculate the delay mean E1 of the transmission delay durations of all voice data packets within the time T:
$$E_1 = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where n represents the number of voice data packets within the time T;
x1 represents the delay time of the 1 st received voice data packet in the T time;
x2 represents the delay time of the 2 nd received voice data packet in the T time;
……
xn represents the delay time of the nth received voice packet within the T time;
Calculate the variance D1 of the delay durations of the voice data packets within the current time T:
$$D_1 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - E_1\right)^2$$
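The mean and variance can be computed with the standard library, as in the sketch below. It assumes the population form of the variance (division by n); the function name `delay_statistics` and the sample delays are illustrative.

```python
from statistics import fmean, pvariance


def delay_statistics(delays_ms):
    """Return (E1, D1): the mean and population variance of the delay samples."""
    e1 = fmean(delays_ms)
    d1 = pvariance(delays_ms, mu=e1)  # population variance: divides by n
    return e1, d1


e1, d1 = delay_statistics([10.0, 20.0, 30.0])
```

For these three samples the mean is 20 ms and the variance is 200/3, roughly 66.7 ms squared.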
The normal distribution curve with delay mean E1 and delay variance D1 corresponds to the following function:
$$F_1(t) = \frac{1}{\sqrt{2\pi D_1}}\exp\left(-\frac{(t - E_1)^2}{2 D_1}\right)$$
The estimated delay hit rate b, whose value may range from 0.5 to 0.997, then satisfies:
$$b = \int_{-\infty}^{t_1} F_1(t)\,dt$$
The estimated delay duration t1 is calculated from the above formula.
For example, in an alternative calculation, given the average delay E1 and the maximum delay DMax in the sample data, the difference between DMax and E1 is divided into ten thousand equal parts (the number of parts can be adjusted according to the required accuracy). The sum of E1 and each cumulative increment is taken as a candidate value of t1 and substituted into the above formula to obtain the corresponding value of b, which establishes a mapping between b and t1. Based on this mapping, once the value of the estimated delay hit rate b is set, the corresponding value of t1, namely the estimated delay duration for the current call, can be obtained.
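The table-based calculation can be sketched as follows, again assuming that b is the cumulative probability of the fitted normal curve up to t1. The function names `build_hit_rate_table` and `lookup_delay` and the sample figures are hypothetical. The lookup returns the first grid point whose hit rate reaches the requested value, and falls back to DMax when the requested value lies beyond the table.

```python
from bisect import bisect_left
from statistics import NormalDist


def build_hit_rate_table(e1: float, d1: float, d_max: float, parts: int = 10_000):
    """Evaluate b on a grid of t1 values from E1 to DMax, split into `parts` steps."""
    dist = NormalDist(mu=e1, sigma=d1 ** 0.5)
    step = (d_max - e1) / parts
    t_values = [e1 + i * step for i in range(parts + 1)]
    b_values = [dist.cdf(t) for t in t_values]  # non-decreasing in t
    return b_values, t_values


def lookup_delay(b_values, t_values, hit_rate: float) -> float:
    """Return the smallest tabulated t1 whose hit rate is at least hit_rate."""
    index = bisect_left(b_values, hit_rate)
    index = min(index, len(t_values) - 1)  # clamp to DMax
    return t_values[index]


# Hypothetical figures: E1 = 40 ms, D1 = 100 ms^2, DMax = 80 ms.
b_table, t_table = build_hit_rate_table(40.0, 100.0, 80.0)
t1_at_half = lookup_delay(b_table, t_table, 0.5)
t1_at_one_sigma = lookup_delay(b_table, t_table, 0.8413)
```

The grid resolution bounds the error: with ten thousand parts over a 40 ms range, t1 is resolved to 0.004 ms.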
The value of b can be set flexibly according to the service type of the call. In essence, when b = 0.5, t1 = E1; when b = 0.997, t1 is approximately E1 + 3x, where x is the delay standard deviation of the n voice data packets.
When a shorter call buffer is desired, b = 0.5 can be set, which reduces the buffering overhead; when a longer call buffer is desired, b = 0.997 can be set, which ensures that network jitter is handled properly in most cases. The estimated buffering time t1 for received voice data packets is obtained from the above calculation.
The buffer size for the voice data packets is obtained as the product of the time t1 and the voice coding length per unit time of the current call. In addition, the delay data of the current voice call is recorded and the statistics used for voice delay estimation are updated, so that subsequent calls are recalculated with the latest data. In this way, voice jitter is estimated in real time and the call buffer is set to a suitable length.
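One way to update the statistics as new delay data is recorded, without storing every sample, is Welford's online algorithm. The disclosure does not name a specific update method, so this is a swapped-in technique shown only as a sketch; the class name `RunningDelayStats` is hypothetical, and the variance is the population form.

```python
class RunningDelayStats:
    """Incrementally maintained mean and population variance (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, delay_ms: float) -> None:
        self.n += 1
        delta = delay_ms - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (delay_ms - self.mean)

    @property
    def variance(self) -> float:
        return self._m2 / self.n if self.n else 0.0


stats = RunningDelayStats()
for sample in (10.0, 20.0, 30.0):
    stats.update(sample)
```

This variant never discards old samples; combining it with a bounded statistical period would require removing samples as they expire.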
For example, in a call center system, an agent terminal communicates with a user terminal through a voice media gateway. The agent terminal accesses the voice media gateway using VoIP technology, while the user accesses the voice media gateway through an operator line using a mobile phone, a fixed-line terminal, or the like. During a call, the agent's voice is transmitted to the voice media gateway in RTP voice data packets. To process these packets, the voice media gateway sets up a dynamically adjustable call buffer in memory in advance, and starts forwarding the buffered voice data packets to the operator line once the buffer is full. When the call buffer is set too small, factors such as network jitter cause gaps in which no voice is sent to the customer, and the voice call quality is poor. When the call buffer is set too large, the media gateway buffers a large block of voice data even under normal network conditions and forwards it to the user only after a long delay, so the voice call delay is severe. This example uses a dynamic voice delay estimation method, which avoids the drawbacks of a fixed delay and a fixed buffer size: it collects statistics on historical network delay information, estimates the delay values likely to occur in actual use of the agent network, and sets the buffer size according to the estimated delay value, thereby improving voice quality. In addition, parameters such as the statistical period and the estimated delay hit rate can be configured, which makes real-time adjustment to actual service conditions convenient.
It will be appreciated that the above method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; for reasons of space, this is not described in detail in the present disclosure. It will be appreciated by those skilled in the art that, in the above methods of the embodiments, the particular order of execution of the steps should be determined by their function and possible inherent logic.
In addition, the present disclosure further provides a call processing apparatus, an electronic device, and a computer-readable storage medium, each of which may be used to implement any call processing method provided by the present disclosure. For the corresponding technical solutions and descriptions, refer to the corresponding descriptions in the method section; they are not repeated here.
Fig. 3 is a block diagram of a call processing apparatus according to an embodiment of the present disclosure.
Referring to fig. 3, an embodiment of the present disclosure provides a call processing apparatus 30, the call processing apparatus 30 including:
an obtaining module 31, adapted to obtain a plurality of voice data packets transmitted in a historical reference period corresponding to a current talk time;
a calculating module 32, adapted to calculate a transmission delay time length of each voice data packet, and obtain delay distribution data corresponding to the plurality of voice data packets according to the transmission delay time length of each voice data packet;
The determining module 33 is adapted to determine a predicted delay duration corresponding to a preset predicted delay hit rate according to the delay distribution data, and calculate a cache length corresponding to the predicted delay duration;
the adjusting module 34 is adapted to dynamically adjust the call buffer to a length matching the buffer length; the call buffer area is used for buffering the voice data packet received in the current call process.
In an alternative implementation, the historical reference period corresponding to the current talk time is determined in one of the following ways:
taking the first N time periods corresponding to the current conversation time as the historical reference time periods; the length of the time period is determined according to the processing performance of the equipment and/or the network transmission speed; wherein N is a natural number; or,
taking the conversation time corresponding to the previous M conversation processes corresponding to the current conversation process as the history reference period; wherein M is a natural number.
In an alternative implementation, the delay distribution data includes a delay probability distribution curve; the calculation module is specifically adapted to:
calculating a first delay parameter and a second delay parameter corresponding to the voice data packets according to the transmission delay time of each voice data packet;
And generating delay probability distribution curves corresponding to the voice data packets according to the first delay parameters and the second delay parameters.
In an alternative implementation, the first delay parameter includes a delay mean (average value); the second delay parameter includes a delay variance or a delay standard deviation; and the delay probability distribution curve includes a normal distribution curve.
In an alternative implementation, the estimated delay hit rate is set according to the call service type and/or real-time sensitivity;
the device is a voice media gateway; the voice data packet is transmitted to the voice media gateway by a transmitting end in a voice transmission mode based on IP; and the call buffer area is arranged in the voice media gateway.
In an alternative implementation, the determining module is specifically adapted to:
acquiring a voice coding mode of a current call;
determining the voice coding length in unit time according to the voice coding mode;
and obtaining the buffer memory length corresponding to the estimated delay time length according to the product of the speech coding length in the unit time and the estimated delay time length.
In an alternative implementation, the adjustment module is specifically adapted to:
acquiring the current length of the call buffer area, and judging whether the absolute value of the difference between the current length and the buffer length is larger than a preset adjustment threshold value;
if so, dynamically shrinking the call buffer area to a length matched with the buffer length under the condition that the current length is larger than the buffer length; and under the condition that the current length is smaller than the buffer length, dynamically expanding the call buffer area to a length matched with the buffer length.
In an alternative implementation, the computing module is specifically adapted to:
acquiring time stamp information in each voice data packet, and determining the sending time of the voice data packet according to the time stamp information;
and obtaining the transmission delay time of each voice data packet according to the difference value between the receiving time and the sending time of each voice data packet.
Fig. 4 is a block diagram of an electronic device according to an embodiment of the present disclosure.
Referring to fig. 4, an embodiment of the present disclosure provides an electronic device including: at least one processor 501; at least one memory 502, and one or more I/O interfaces 503, coupled between the processor 501 and the memory 502; the memory 502 stores one or more computer programs executable by the at least one processor 501, and the one or more computer programs are executed by the at least one processor 501 to perform the call processing method described above.
The disclosed embodiments also provide a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor/processing core, implements the call processing method described above. The computer readable storage medium may be a volatile or nonvolatile computer readable storage medium.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which, when executed in a processor of an electronic device, performs the call processing method described above.
Those of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor, such as a central processing unit, digital signal processor, or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer-readable storage media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable program instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media include, but are not limited to, Random Access Memory (RAM), Read Only Memory (ROM), Erasable Programmable Read Only Memory (EPROM), Static Random Access Memory (SRAM), flash memory or other memory technology, portable compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embody computer readable program instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and may include any information delivery media.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state setting data, or source or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of computer readable program instructions, which can execute the computer readable program instructions.
The computer program product described herein may be embodied in hardware, software, or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Example embodiments have been disclosed herein, and although specific terms are employed, they are used and should be interpreted in a generic and descriptive sense only and not for purpose of limitation. In some instances, it will be apparent to one skilled in the art that features, characteristics, and/or elements described in connection with a particular embodiment may be used alone or in combination with other embodiments unless explicitly stated otherwise. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure as set forth in the appended claims.

Claims (10)

1. A call processing method, comprising:
acquiring a plurality of voice data packets transmitted in a historical reference period corresponding to the current conversation time;
calculating the transmission delay time length of each voice data packet, and obtaining delay distribution data corresponding to the voice data packets according to the transmission delay time length of each voice data packet;
determining a predicted delay time length corresponding to a preset predicted delay hit rate according to the delay distribution data, and calculating a cache length corresponding to the predicted delay time length;
Adjusting the call buffer area to a length matched with the buffer length; the call buffer area is used for buffering the voice data packet received in the current call process.
2. The method of claim 1, wherein the historical reference period corresponding to the current talk time is determined in one of the following ways:
taking the first N time periods corresponding to the current conversation time as the historical reference time periods; the length of the time period is determined according to the processing performance of the equipment and/or the network transmission speed; wherein N is a natural number; or,
taking the conversation time corresponding to the previous M conversation processes corresponding to the current conversation process as the history reference period; wherein M is a natural number.
3. The method of claim 1, wherein the delay distribution data comprises a delay probability distribution curve; and obtaining the delay distribution data corresponding to the plurality of voice data packets according to the transmission delay time length of each voice data packet comprises the following steps:
calculating a first delay parameter and a second delay parameter corresponding to the voice data packets according to the transmission delay time of each voice data packet;
And generating delay probability distribution curves corresponding to the voice data packets according to the first delay parameters and the second delay parameters.
4. The method according to claim 1, wherein the method is applied to a voice media gateway;
the voice data packet is transmitted to the voice media gateway by a transmitting end through a voice transmission mode based on IP; the call buffer area is arranged in the voice media gateway.
5. The method of claim 1, wherein calculating the cache length corresponding to the estimated delay time length comprises:
acquiring a voice coding mode of a current call;
determining the voice coding length in unit time according to the voice coding mode;
and obtaining the buffer memory length corresponding to the estimated delay time length according to the product of the speech coding length in the unit time and the estimated delay time length.
6. The method according to any one of claims 1-5, wherein dynamically adjusting the call buffer to a length matching the buffer length comprises:
acquiring the current length of the call buffer area, and judging whether the absolute value of the difference between the current length and the buffer length is larger than a preset adjustment threshold value;
If so, dynamically shrinking the call buffer area to a length matched with the buffer length under the condition that the current length is larger than the buffer length; and under the condition that the current length is smaller than the buffer length, dynamically expanding the call buffer area to a length matched with the buffer length.
7. The method of any one of claims 1-5, wherein calculating the transmission delay duration of each voice data packet comprises:
acquiring time stamp information in each voice data packet, and determining the sending time of the voice data packet according to the time stamp information;
and obtaining the transmission delay time of each voice data packet according to the difference value between the receiving time and the sending time of each voice data packet.
8. A call processing apparatus, comprising:
the acquisition module is suitable for acquiring a plurality of voice data packets transmitted in a historical reference period corresponding to the current conversation time;
the calculation module is suitable for calculating the transmission delay time length of each voice data packet and obtaining delay distribution data corresponding to the voice data packets according to the transmission delay time length of each voice data packet;
the determining module is suitable for determining estimated delay time length corresponding to a preset estimated delay hit rate according to the delay distribution data, and calculating the buffer memory length corresponding to the estimated delay time length;
The adjusting module is suitable for dynamically adjusting the call buffer area to be the length matched with the buffer length; the call buffer area is used for buffering the voice data packet received in the current call process.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores one or more computer programs executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
10. A computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the method according to any of claims 1-7.
CN202211078726.8A 2022-09-05 2022-09-05 Call processing method and device, electronic equipment and storage medium Pending CN117692366A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211078726.8A CN117692366A (en) 2022-09-05 2022-09-05 Call processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117692366A (en) 2024-03-12

Family ID=90137797

Family Applications (1)

Application Number Status Publication Title Priority Date Filing Date
CN202211078726.8A Pending CN117692366A (en) Call processing method and device, electronic equipment and storage medium 2022-09-05 2022-09-05

Country Status (1)

Country Link
CN (1) CN117692366A (en)

Similar Documents

Publication Title
KR100902456B1 (en) Method and apparatus for managing end-to-end voice over internet protocol media latency
WO2017148260A1 (en) Voice code sending method and apparatus
US8605620B2 (en) System for transmitting high quality speech signals on a voice over internet protocol network
US7379466B2 (en) In band signal detection and presentation for IP phone
US7817625B2 (en) Method of transmitting data in a communication system
KR101121212B1 (en) Method of transmitting data in a communication system
US10492085B2 (en) Real-time transport protocol congestion control techniques in video telephony
US8306015B2 (en) Technique for identifying RTP based traffic in core routing switches
US7450593B2 (en) Clock difference compensation for a network
TW201316814A (en) Methods for transmitting and receiving a digital signal, transmitter and receiver
US20240114177A1 (en) System and method for capturing and distributing a live audio stream of a live event in real-time
CN113573003B (en) Audio and video real-time communication method, device and equipment based on weak network
JP2004535115A (en) Dynamic latency management for IP telephony
US20060095612A1 (en) System and method for implementing a demand paging jitter buffer algorithm
CN117692366A (en) Call processing method and device, electronic equipment and storage medium
CN112153322B (en) Data distribution method, device, equipment and storage medium
US7525914B2 (en) Method for down-speeding in an IP communication network
CN107113357B (en) Improved method and apparatus relating to speech quality estimation
EP2053765B1 (en) Apparatus and method for playout scheduling in voice over internet protocol (VoIP) System
Muyambo De-Jitter Control Methods in Ad-Hoc Networks
US9118743B2 (en) Media rendering control
KR20190118714A (en) Method for managing qos and receiving apparatus for executing the same
Ramos Robust and reliable multimedia transmission over the Internet
Kos et al. VoIP over Ethernet–Theoretical Analysis and Simulation
KR20170085712A (en) Method and apparatus for processing of packet

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination