CN108737349B

CN108737349B - Voice data packet processing method and device

Info

Publication number: CN108737349B
Application number: CN201710272196.3A
Authority: CN
Inventors: 程岳
Original assignee: Datang Mobile Communications Equipment Co Ltd
Current assignee: Datang Mobile Communications Equipment Co Ltd
Priority date: 2017-04-24
Filing date: 2017-04-24
Publication date: 2020-08-28
Anticipated expiration: 2037-04-24
Also published as: CN108737349A

Abstract

The invention discloses a method and a device for processing a voice data packet. The data packet type of a received first voice data packet is determined, together with the TS_STRIDE value of the header-compressed message of the first voice data packet corresponding to that type. Then, according to a set number of TS_SCALED values of the header-compressed message and its TS_STRIDE value, the timestamp (TS) value of the header-compressed message is determined successively within the set number range until a determined TS value restores a correct first voice data packet. In other words, the correct TS value is found through multiple attempts, which improves the accuracy of restoring the voice data packet from the determined TS value.

Description

Voice data packet processing method and device

Technical Field

The present invention relates to the field of wireless communication technologies, and in particular, to a method and an apparatus for processing a voice data packet.

Background

In a Long Term Evolution (LTE) mobile communication network, voice communication uses Voice over Internet Protocol (VoIP) technology; that is, voice data packets are transmitted over the wireless link in Internet Protocol (IP) data packet format.

When a voice data packet is transmitted using VoIP, it is first compression-coded with an Adaptive Multi-Rate (AMR) scheme and then encapsulated into an IP data packet for transmission. Because the AMR-coded voice is carried over the Real-time Transport Protocol (RTP) and the User Datagram Protocol (UDP), the packet header is large relative to the voice payload: for example, the header of an Internet Protocol Version 4 (IPv4) voice data packet occupies 40 bytes, and the header of an Internet Protocol Version 6 (IPv6) voice data packet occupies 60 bytes. This header overhead wastes wireless air interface resources and results in low bandwidth utilization.

The LTE system can effectively improve air interface bandwidth utilization by compressing the message header of the voice data packet. When the Robust Header Compression (ROHC) scheme is used, a proportionally scaled-down timestamp (TS_SCALED) value is sent instead of the original timestamp (TS) value, so that the message header can be compressed to 1 to 3 bytes and the bandwidth utilization can exceed 90 percent. The decompressor restores the original TS value from a negotiated fixed timestamp multiple (TS_STRIDE) and a saved offset relative to TS_STRIDE, which reduces the number of bits to be transmitted and thereby achieves compression.

Generally, voice data packets in an LTE system are transmitted in Unacknowledged Mode (UM) of the access network protocol. Because the quality of the air interface is unstable, voice data packets transmitted in UM are often lost on the wireless link. If a packet is lost at a switch between the voice silent period and the voice activation period, the TS increment in the RTP header of the voice data packet is not uniform, so the TS value is restored incorrectly when the current voice data packet is decompressed. The incorrectly restored current packet in turn causes subsequent voice data packets to be restored incorrectly; the error propagates, and the accuracy of restoring voice data packets decreases.

Disclosure of Invention

The invention aims to provide a method and a device for processing a voice data packet so as to improve the accuracy of restoring the voice data packet.

The purpose of the invention is realized by the following technical scheme:

In a first aspect, a method for processing a voice data packet is provided. The data packet type of a received first voice data packet and the TS_STRIDE value of the header-compressed message of the first voice data packet corresponding to that type are determined. According to a set number of TS_SCALED values of the header-compressed message and its TS_STRIDE value, the timestamp (TS) value of the header-compressed message is determined successively within the set number range until a determined TS value restores a correct first voice data packet.

The data packet types comprise a wideband voice data packet and a narrowband voice data packet, and the TS_STRIDE value of a wideband voice data packet header-compressed message differs from that of a narrowband voice data packet header-compressed message.

In the embodiment of the invention, a set number of TS_SCALED values and the TS_STRIDE value of the first voice data packet header-compressed message are used to determine TS values one after another within the set number range, and the first voice data packet is restored with each determined TS value in turn until a correct first voice data packet is obtained. That is, the correct TS value is determined through multiple attempts, which improves the accuracy of restoring the voice data packet from the determined TS value.

Preferably, the set number of TS_SCALED values of the first voice data packet header-compressed message is determined as follows:

determining the TS_SCALED value of the header-compressed message of a second voice data packet, where the second voice data packet is a voice data packet that was received before the first voice data packet and was correctly restored; and determining at least one TS_SCALED value of the first voice data packet header-compressed message according to the formula $\mathrm{TS\_SCALED}_i = \mathrm{TS\_SCALED}_{i-n} + n + p$, where $\mathrm{TS\_SCALED}_i$ is the TS_SCALED value of the first voice data packet header-compressed message, $\mathrm{TS\_SCALED}_{i-n}$ is the TS_SCALED value of the second voice data packet header-compressed message, p is an integer from 0 to 8 determined according to the voice sampling period, i and n are positive integers with n smaller than i, and n is the difference between the receiving sequence number of the first voice data packet and that of the second voice data packet.

In the embodiment of the invention, starting from the TS_SCALED value of a correctly restored voice data packet, the TS_SCALED value of the first voice data packet is determined by successive attempts within a number range determined by one voice sampling period. When voice data packets are transmitted normally, one of the TS_SCALED values of the first voice data packet header-compressed message determined by the formula $\mathrm{TS\_SCALED}_i = \mathrm{TS\_SCALED}_{i-n} + n + p$ is correct. Once the TS_SCALED value is correctly determined, the correct TS value can be determined, which improves the accuracy of restoring the voice data packet from the determined TS value.

Preferably, successively determining, within the set number range, the TS value of the first voice data packet header-compressed message until a determined TS value restores a correct first voice data packet includes:

setting the initial value of p to 0 and incrementing p by 1 step by step, determining the TS value of the first voice data packet header-compressed message at each step, until a determined TS value restores a correct first voice data packet.

Preferably, that the determined TS value can be restored to obtain the correct first voice data packet means:

the first voice data packet restored with the determined TS value passes both the CRC check and the UDP check.

In the embodiment of the invention, because the number of check bits used in the CRC check is limited, the CRC check is prone to false passes: even if the restored first voice data packet passes the CRC check, it may still be wrong. The UDP check is therefore performed in addition to the CRC check.

Preferably, the first voice data packet is a voice data packet received when a voice activation period and a voice silence period are switched.

In a second aspect, a device for processing voice data packets is provided, the device comprising a determining unit and a processing unit, wherein:

The determining unit is used for determining the data packet type of a received first voice data packet and the TS_STRIDE value of the first voice data packet header-compressed message corresponding to that type, where the data packet types comprise a wideband voice data packet and a narrowband voice data packet, and the TS_STRIDE value of a wideband voice data packet header-compressed message differs from that of a narrowband voice data packet header-compressed message.

The processing unit is used for successively determining, within the set number range, the timestamp (TS) value of the first voice data packet header-compressed message according to a set number of TS_SCALED values of the first voice data packet header-compressed message and the TS_STRIDE value determined by the determining unit, until a determined TS value restores a correct first voice data packet.

Preferably, the determining unit is configured to determine the set number of TS_SCALED values of the first voice data packet header-compressed message in the following manner:

Preferably, the processing unit sequentially determines the TS values of the first voice data packet header compressed packets within the set number range in the following manner until the determined TS values can be restored to obtain correct first voice data packets:

Preferably, the step of determining, by the processing unit, that the TS value can be restored to obtain the correct first voice data packet includes:

Preferably, the first voice data packet processed by the processing unit is a voice data packet received when a voice activation period and a voice silence period are switched.

Drawings

Fig. 1 is a schematic diagram of a working flow of a conventional dual-mode base station;

Fig. 2 is a flowchart of a method for processing a voice data packet according to an embodiment of the present invention;

Fig. 3 is a flowchart of a method for determining the TS_SCALED value of a first voice data packet header-compressed message according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of the voice data packets received by a voice data packet receiving end according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of restored voice data packets according to an embodiment of the present invention;

Fig. 6 is a flowchart of voice data packet processing according to an embodiment of the present invention;

Fig. 7 shows a processing apparatus for voice data packets according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be described in detail below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only some embodiments, and not all embodiments, of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In the LTE system, when a voice data packet is transmitted in IP data packet format, the voice message format is as shown in fig. 1. The voice message comprises a message header and a payload; the message header comprises an IP header part, a UDP header part, and an RTP header part, and the payload is the voice to be transmitted. As can be seen from fig. 1, the message header occupies 40 or 60 bytes when a voice data packet is transmitted in IP data packet format, so the header overhead is considerable. To reduce the header overhead and improve air interface bandwidth utilization, the LTE system compresses the message header of the voice data packet. In the following embodiments of the present invention, only header compression using the ROHC scheme is described in detail.

When the ROHC scheme is used to compress an RTP message header, a proportionally scaled-down TS_SCALED value is sent instead of the original TS value, and the message header of a voice data packet can be compressed to 1 to 3 bytes. The packet type with the highest compression ratio is the UO-0 packet in bidirectional optimistic mode (O mode). When a change in the TS_SCALED value has to be conveyed, a larger compressed packet is needed; this can be done by sending a UOR-2 packet carrying an extension, which is longer. For example, suppose 3 packets are transmitted consecutively, the TS difference between packets SN1 and SN2 is 320, and the TS difference between packets SN2 and SN3 is 2560. UO-0 packets are used to transmit the first and second packets, and the TS_SCALED value of the second packet is 2. When packet SN3 is transmitted, the TS_SCALED value becomes 2 + 8 = 10, so the value must be transmitted using Window-based Least Significant Bit (WLSB) encoding relative to the TS_SCALED reference value of 2; only 3 bytes are then needed to indicate the change in the TS difference. This gives better compression efficiency and bandwidth utilization when voice silent periods and voice activation periods switch frequently. Generally, only after the same TS difference has appeared N consecutive times (N > 3) is a change of the TS_STRIDE value signalled by a UOR-2 packet carrying an extension.

In an LTE system, during the voice activation period of a conversation a voice data packet is transmitted every 20 ms, and during the voice silent period, when the conversation pauses, a voice silence data packet is transmitted every 160 ms; to save bandwidth, some sampling points (usually the first 7) are dropped for voice silence data packets. During the voice activation period, a voice silent period packet is occasionally inserted before switching back to the voice activation period. In such a switch between the voice silent period and the voice activation period, the number of consecutive occurrences of the new TS difference does not exceed N (N > 3), so the TS_STRIDE value is not updated, and the sender of the voice data packets reflects the change in the TS difference only by modifying the TS_SCALED value. If a packet is lost on the air interface at this moment, the TS increment in the RTP header is not uniform, yet the receiver still delivers a TS_SCALED value incremented by 1 per packet. That TS_SCALED value is incorrect, so the TS value is restored incorrectly when the current voice data packet is decompressed. The incorrectly restored current packet also causes errors in restoring subsequent packets, so the error propagates and the restoration accuracy of voice data packets decreases.

When a voice silent period packet is occasionally inserted in the voice activation period before switching back, the receiver's delivery of a TS_SCALED value incremented by 1 is incorrect, which causes voice data packets to be restored incorrectly and the error to propagate. To improve the restoration accuracy of voice data packets, the embodiment of the invention provides a method for processing a voice data packet.

Fig. 2 is a flowchart of a processing method of a voice data packet according to an embodiment of the present invention, and as shown in fig. 2, the processing method includes:

S101: Determine the data packet type of the received first voice data packet and the TS_STRIDE value of the first voice data packet header-compressed message corresponding to that type.

It should be noted that the first voice data packet is a voice data packet received when a voice activation period and a voice silence period are mutually converted, and the voice activation period and the voice silence period mutually converted includes a conversion from a voice activation period to a voice silence period or a conversion from a voice silence period to a voice activation period. During the conversation of the user, the conversation of the user is a voice activation period when the conversation is continuous, the conversation pause of the user is a voice silent period, and the conversion between the voice activation period and the voice silent period can also be understood as the conversion between the conversation continuity and the conversation pause during the conversation of the user.

In the embodiment of the present invention, compressing the voice message with the ROHC scheme essentially means compressing the TS value of the RTP header part of the voice message header. The RFC3095 protocol gives the relationship between the TS value and the compressed TS value (i.e., TS_SCALED) as the following formula:

$\mathrm{TS} = \mathrm{TS\_SCALED} \times \mathrm{TS\_STRIDE} + \mathrm{TS\_OFFSET}$, where TS_OFFSET is a fixed value.

In the embodiment of the invention, when the voice data packet receiving end restores a voice data packet, the TS value needs to be determined, and the key to determining the TS value is to determine the TS_SCALED value and the TS_STRIDE value.

In the embodiment of the invention, the TS_STRIDE value of the header-compressed message of the received first voice data packet is determined by determining whether the first voice data packet is a wideband voice data packet or a narrowband voice data packet. Specifically, the received first voice data packet is parsed to obtain its payload, and the data packet type is determined from the size of the payload and the Frame Type (FT) field in the payload. If the first voice data packet is determined to be a wideband voice data packet, the TS_STRIDE value corresponding to wideband voice data packets is used; for example, the TS_STRIDE value of the first voice data packet header-compressed message may be determined to be 320. If the first voice data packet is determined to be a narrowband voice data packet, the TS_STRIDE value corresponding to narrowband voice data packets is used; for example, the TS_STRIDE value may be determined to be 160.

S102: According to a set number of TS_SCALED values of the first voice data packet header-compressed message and its TS_STRIDE value, successively determine the TS value of the first voice data packet header-compressed message within the set number range until a determined TS value restores a correct first voice data packet.

In the embodiment of the invention, once the TS_STRIDE value of the first voice data packet header-compressed message has been determined, the key to determining its TS value is to determine its TS_SCALED value.

In the embodiment of the present invention, the set number of TS_SCALED values of the first voice data packet header-compressed message may be determined in the manner shown in fig. 3:

S201: Determine the TS_SCALED value of the header-compressed message of a second voice data packet, where the second voice data packet is a voice data packet that was received before the first voice data packet and was correctly restored.

In the embodiment of the present invention, the voice data packet receiving end receives voice data packets in sequence. The second voice data packet may be the voice data packet immediately preceding the first voice data packet, or may be an earlier voice data packet, N packets (N greater than or equal to 1) before the first voice data packet. If the second voice data packet immediately precedes the first voice data packet, that is, no voice data packet has been lost, the first voice data packet may be restored using the prior art or using the voice data packet processing method provided by the present invention. The following description focuses on the case where the second voice data packet is such an earlier packet.

S202: Determine at least one TS_SCALED value of the first voice data packet header-compressed message according to the formula $\mathrm{TS\_SCALED}_i = \mathrm{TS\_SCALED}_{i-n} + n + p$.

Here, $\mathrm{TS\_SCALED}_i$ is the TS_SCALED value of the first voice data packet header-compressed message, $\mathrm{TS\_SCALED}_{i-n}$ is the TS_SCALED value of the second voice data packet header-compressed message, p is an integer determined according to the voice sampling period, i and n are positive integers with n smaller than i, and n is the difference between the receiving sequence number of the first voice data packet and that of the second voice data packet.

In the embodiment of the present invention, the maximum value of p may be determined to be 8 according to the voice sampling period. Assuming that the second voice data packet is two packets before the first voice data packet, the difference between the receiving sequence number of the first voice data packet and that of the second voice data packet is 2, and the formula for calculating the TS_SCALED value of the first voice data packet header-compressed message becomes:

$\mathrm{TS\_SCALED}_i = \mathrm{TS\_SCALED}_{i-2} + 2 + p$. The value $\mathrm{TS\_SCALED}_{i-2}$ can be taken directly from the second voice data packet, which has been correctly restored, so 9 values of $\mathrm{TS\_SCALED}_i$ (for p from 0 to 8) can be calculated with the above formula.
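The candidate set described above can be sketched in a few lines. This is only an illustration: the function name `ts_scaled_candidates` and its parameters are hypothetical, and the reference value 215694 is borrowed from the worked example later in this description.

```python
from typing import List


def ts_scaled_candidates(ref_ts_scaled: int, n: int, max_p: int = 8) -> List[int]:
    """Return the candidates TS_SCALED_i = TS_SCALED_(i-n) + n + p for p = 0..max_p.

    ref_ts_scaled: TS_SCALED of the last correctly restored packet (hypothetical name)
    n:             sequence-number gap between that packet and the current one
    """
    return [ref_ts_scaled + n + p for p in range(max_p + 1)]


# Reference packet two positions earlier, so n = 2.
candidates = ts_scaled_candidates(215694, 2)
print(len(candidates))   # 9
print(candidates[0])     # 215696 (p = 0)
print(candidates[-1])    # 215704 (p = 8)
```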

In the embodiment of the invention, a plurality of $\mathrm{TS\_SCALED}_i$ values can be obtained with the above formula. According to the RFC3095 protocol, the TS value and the TS_SCALED and TS_STRIDE values satisfy the following relation:

$\mathrm{TS} = \mathrm{TS\_SCALED} \times \mathrm{TS\_STRIDE} + \mathrm{TS\_OFFSET}$, where TS_OFFSET is a fixed value. The TS value can therefore be determined once the TS_SCALED and TS_STRIDE values are determined.
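As a minimal sketch of this relation, the function below rebuilds a TS value from its three components. The name `restore_ts` is hypothetical; the 32-bit wrap is included because the RTP timestamp field is 32 bits wide. The sample numbers are the ones used for packet 0x8636 in the worked example of this description.

```python
def restore_ts(ts_scaled: int, ts_stride: int, ts_offset: int) -> int:
    """TS = TS_SCALED * TS_STRIDE + TS_OFFSET, wrapped to the 32-bit RTP timestamp range."""
    return (ts_scaled * ts_stride + ts_offset) & 0xFFFFFFFF


ts = restore_ts(ts_scaled=215694, ts_stride=320, ts_offset=276)
print(hex(ts))  # 0x41d3294
```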

In the embodiment of the invention, a plurality of $\mathrm{TS\_SCALED}_i$ values, and hence a plurality of TS values, can be determined according to the formula $\mathrm{TS\_SCALED}_i = \mathrm{TS\_SCALED}_{i-2} + 2 + p$. The invention determines the TS values of the first voice data packet header-compressed message successively within the set number range until a determined TS value restores a correct first voice data packet. In a possible implementation, the initial value of p is set to 0 and p is incremented by 1 step by step, the TS value of the first voice data packet header-compressed message being determined at each step, until a determined TS value restores a correct first voice data packet.

In the embodiment of the present invention, that the determined TS value can be restored to obtain the correct first voice data packet means that the first voice data packet restored with the determined TS value passes a verification check.

In one embodiment, this means that the first voice data packet restored with the determined TS value passes the CRC check.

In another embodiment, this means that the first voice data packet restored with the determined TS value passes the CRC check and, in addition, passes the UDP check.

The following description mainly takes as an example the case in which the first voice data packet restored with the determined TS value must pass both the CRC check and the UDP check.

It should be noted that, in the embodiment of the present invention, the first voice data packet is a voice data packet received when the voice activation period and the voice silence period are switched.

The following describes the voice packet processing method provided by the present invention in detail by using a specific example.

It should be noted that, at present, voice data packets transmitted based on the VoIP technology are generally encapsulated as IPv6 data packets for transmission, and hereinafter, the encapsulation is all described as IPv6 data packets.

Assume that a voice data packet receiving end receives 7 voice data packets in the range 0x8633 to 0x863a, as shown in fig. 4. Voice data packets with a length of 66 are packets of the voice activation period, voice data packets with lengths of 12 and 17 are packets of the voice silent period, and UO-0 and UOR-2 denote different data packet formats. Voice data packet 0x8637 is lost at a switch between the voice silent period and the voice activation period.

Fig. 5 shows the voice data packets obtained when the receiving end restores the received voice data packets of fig. 4 according to the prior art. The specific restoring process is as follows:

As can be seen from fig. 4, no packet is lost between 0x8633 and 0x8634, the SN increases by 1 and TS_SCALED increases by 1, so the TS_STRIDE value, the TS_OFFSET value, and the TS_SCALED1 value of voice data packet 0x8633 can be calculated from the two voice data packets 0x8633 and 0x8634 as follows:

TS_STRIDE＝0x041d24d4-0x041d2394＝0x140＝320；

TS_OFFSET＝0x041d24d4％TS_STRIDE＝0x114＝276；

TS_SCALED1 = TS1 / TS_STRIDE = 215683 (integer quotient), where TS1 denotes the TS value of packet 0x8633;

in addition, as can be seen from fig. 5, the SN value of voice data packet 0x8633 is SN1 = 0x0a43;

further, the voice data packets 0x8633 and 0x8634 can be recovered by the correctly recovered values.
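The arithmetic above can be checked with a short sketch. The helper name `derive_ts_params` is hypothetical, and the code simply takes the two timestamp values that appear in the equations above, using the later one as the dividend exactly as those equations do; 32-bit timestamp wraparound is ignored for simplicity.

```python
from typing import Tuple


def derive_ts_params(ts_prev: int, ts_curr: int) -> Tuple[int, int, int]:
    """Derive (TS_STRIDE, TS_OFFSET, TS_SCALED) from two consecutive timestamps.

    Assumes the two packets are adjacent (TS_SCALED differs by exactly 1)
    and that the timestamp did not wrap between them.
    """
    ts_stride = ts_curr - ts_prev      # one stride separates adjacent packets
    ts_offset = ts_curr % ts_stride    # constant remainder
    ts_scaled = ts_curr // ts_stride   # integer quotient
    return ts_stride, ts_offset, ts_scaled


stride, offset, scaled = derive_ts_params(0x041D2394, 0x041D24D4)
print(stride, offset, scaled)  # 320 276 215683
```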

For the third received voice data packet 0x8635, whose format is UOR-2, the TS_SCALED value can be determined as TS_SCALED3 = 215686, so the TS value of voice data packet 0x8635 can be calculated as TS3 = 215686 × 320 + 276 = 0x41d2894, and voice data packet 0x8635 is correctly restored.

For the fourth received voice data packet 0x8636, whose format is UOR-2, the TS_SCALED value can be determined as TS_SCALED4 = 215694, so the TS value of voice data packet 0x8636 can be calculated as TS4 = 215694 × 320 + 276 = 0x41d3294, and voice data packet 0x8636 is correctly restored.

Since voice data packet 0x8637 is lost and the receiver does not know what type of packet it was, according to the prior art the receiver restores the sixth voice data packet 0x8638, received in UO-0 format, on the basis of the fourth voice data packet 0x8636, and calculates the TS value of voice data packet 0x8638 as TS6 = (215694 + 2) × 320 + 276 = 0x41d3514. In practice, the fifth voice data packet 0x8637 may have been a UOR-2 packet that updated the TS_SCALED value to something other than (215694 + 2), so the TS value restored in this way may be wrong. The restored voice data packet 0x8638 is then wrong as well, which either causes subsequent voice data packets to fail the check and be dropped, or lets an erroneous packet pass the CRC check and update the reference value, so that still more subsequent voice data packets fail the CRC check and are dropped.

As shown in fig. 6, assume that the voice data packet receiving end receives, in the voice silent period, a second voice data packet a(i-2) in UOR-2 format and delivers it normally, and then receives a first voice data packet a(i) in UO-0 format. The receiving end calculates the SN value of the current voice data packet using the Least Significant Bit (LSB) algorithm and determines that voice data packet a(i-1) has been lost.
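The loss detection step can be illustrated with a sketch of LSB decoding as specified in RFC 3095, where the decoder picks the unique value in the interpretation interval [v_ref - p, v_ref + 2^k - 1 - p] whose k low bits match the received bits. Everything specific here is an assumption for illustration: the function name, the 4-bit SN field, the interval offset (called `interval_offset` to avoid confusion with the p of this description), and the sample SN values. The 16-bit SN wraparound is ignored.

```python
def lsb_decode(lsbs: int, k: int, v_ref: int, interval_offset: int) -> int:
    """Decode k received low-order bits against reference value v_ref.

    Returns the value in [v_ref - interval_offset, v_ref - interval_offset + 2**k - 1]
    whose k least significant bits equal `lsbs`.
    """
    lower = v_ref - interval_offset
    mask = (1 << k) - 1
    candidate = (lower & ~mask) | (lsbs & mask)
    if candidate < lower:
        candidate += 1 << k
    return candidate


# Hypothetical values: last delivered SN is 0x0a46, the new packet carries SN bits 0x8.
sn_ref = 0x0A46
sn_now = lsb_decode(lsbs=0x8, k=4, v_ref=sn_ref, interval_offset=1)
n = sn_now - sn_ref
print(hex(sn_now), n)  # 0xa48 2, so one packet in between was lost
```

A gap of n = 2 corresponds to the situation described above, where a(i-2) is the reference and a(i-1) is missing.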

After voice data packet a(i-1) is judged to be lost, the first voice data packet is parsed to obtain its payload. Once the RTP data packet format has been determined to be byte-aligned or efficiency-first, the first voice data packet is determined to be a wideband voice data packet or a narrowband voice data packet according to the length of its payload and the Frame Type (FT) field in the payload.

Specifically, if the FT field and the payload length of the first voice data packet match the wideband voice data packet in the following table, the wideband voice data packet is determined, and if the FT field and the payload length of the first voice data packet match the narrowband voice data packet in the following table, the narrowband voice data packet is determined.

Table 1 is a table showing the correspondence between FT and payload length of the wideband/narrowband voice packet when the RTP packet format is efficiency-first, and table 2 is a table showing the correspondence between FT and payload length of the wideband/narrowband voice packet when the RTP packet format is byte-aligned.

Table 1 is as follows:

table 2 is as follows:

If the first voice data packet is determined to be a wideband voice data packet from the correspondence in the tables, the TS_STRIDE value for the first voice data packet is set to 320; if it is determined to be a narrowband voice data packet, the TS_STRIDE value is set to 160.

Fig. 6 is a detailed illustration of the recovery of the data packet a (i) by the data packet a (i-2).

According to the formula $\mathrm{TS\_SCALED}_i = \mathrm{TS\_SCALED}_{i-2} + 2 + p$, the initial value of p is set to 0 and the message header of the first voice data packet a(i) is restored. A CRC checksum is then computed over the restored header and compared with the CRC checksum carried in the first voice data packet a(i). If they are equal, a UDP check is further performed on the restored voice data packet, and if that check also passes, the first voice data packet is determined to be correctly restored. If the restored first voice data packet fails the CRC check, or passes the CRC check but fails the UDP check, p is incremented by 1 and the above steps are repeated until a first voice data packet that passes both the CRC check and the UDP check is determined.
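The retry loop just described might look like the sketch below. It is not the patented implementation: `search_ts` and the `passes_checks` callback are hypothetical names, and the callback stands in for restoring the header and running the CRC and UDP checks, which are not reproduced here. The numeric values are those of the worked example in this description.

```python
from typing import Callable, Optional, Tuple


def search_ts(ref_ts_scaled: int, n: int, ts_stride: int, ts_offset: int,
              passes_checks: Callable[[int], bool],
              max_p: int = 8) -> Optional[Tuple[int, int]]:
    """Try p = 0..max_p and return (p, TS) for the first TS that passes the checks."""
    for p in range(max_p + 1):
        ts_scaled = ref_ts_scaled + n + p
        ts = (ts_scaled * ts_stride + ts_offset) & 0xFFFFFFFF
        if passes_checks(ts):      # stand-in for CRC check followed by UDP check
            return p, ts
    return None                    # no candidate restored a valid packet


# Stand-in validator: pretend only TS = 0x41d38d4 yields a packet that passes both checks.
result = search_ts(215694, 2, 320, 276, lambda ts: ts == 0x41D38D4)
print(result)  # (3, 69023956), i.e. p = 3 and TS = 0x41d38d4
```

Returning `None` when every candidate fails leaves the fallback policy (for example, discarding the packet) to the caller, since the description does not specify it.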

For example, for the above example of the missing voice data packet 0x8637, the process of restoring the voice data packet 0x8638 by the voice data packet processing method provided by the embodiment of the present invention is as follows:

First, the voice payload of voice data packet 0x8638 is parsed: 111101000110. For the efficiency-first format, the FT field is 1000 = 0x8, so the packet can be determined to be a wideband voice data packet, and the TS_STRIDE value of voice data packet 0x8638 can therefore be determined to be 320.

Next, the TS value of the voice packet 0x8638 is calculated one by one with the initial value of p set to 0.

When p is 0, the calculated TS value is TS6 = (215694 + 2) × 320 + 276 = 0x41d3514. The calculated CRC checksum is 1, which differs from the CRC checksum 4 carried in voice data packet 0x8638, so the CRC check fails when p is 0.

Adding 1 to p, the calculated TS value is TS6 = (215694 + 2 + 1) × 320 + 276 = 0x41d3654. The calculated CRC checksum is 7, which differs from the CRC checksum 4 carried in voice data packet 0x8638, so the CRC check fails when p is 1.

Adding 1 to p again, the calculated TS value is TS6 = (215694 + 2 + 2) × 320 + 276 = 0x41d3794. The calculated CRC checksum is 1, which differs from the CRC checksum 4 carried in voice data packet 0x8638, so the CRC check fails when p is 2.

Adding 1 to p again, the calculated TS value is TS6 = (215694 + 2 + 3) × 320 + 276 = 0x41d38d4. The calculated CRC checksum is 4, which is the same as the CRC checksum 4 carried in voice data packet 0x8638, so the CRC check succeeds when p is 3.
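The four candidate TS values above can be reproduced with a few lines of arithmetic. The constant names are illustrative only; in particular, naming the constant 276 `TS_OFFSET` is an assumption of this sketch, and the CRC computation itself is not reproduced.

```python
PREV_TS_SCALED = 215694  # TS_SCALED of the last correctly restored packet
N = 2                    # sequence-number gap to that packet
TS_STRIDE = 320          # wideband voice data packet
TS_OFFSET = 276          # constant term used in the example (name assumed)

# Candidate TS values for p = 0, 1, 2, 3.
candidates = [
    (PREV_TS_SCALED + N + p) * TS_STRIDE + TS_OFFSET for p in range(4)
]

for p, ts in enumerate(candidates):
    print(f"p={p}: TS={ts:#x}")
# p=0: TS=0x41d3514
# p=1: TS=0x41d3654
# p=2: TS=0x41d3794
# p=3: TS=0x41d38d4
```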

Finally, the UDP checksum is calculated as follows:

The checksum field in the restored UDP packet header is cleared. If the length of the restored voice data packet is an odd number of bytes, it is zero-padded to an even number of bytes. The restored voice data packet is then summed in two-byte units to obtain a four-byte value. If the upper two bytes of the four-byte value are not zero, the upper two bytes and the lower two bytes are added to obtain a new four-byte value, and this step is repeated until the upper two bytes are 0. The lower two bytes of the resulting value are then inverted bit by bit, giving a UDP checksum of 0x2d0e. The calculated UDP checksum is compared with the UDP checksum carried in voice data packet 0x8638 and found to be the same. Therefore, in the above example, the TS value determined when p is 3 correctly restores voice data packet 0x8638, and on this basis the receiving end can correctly restore the 7th voice data packet and the 8th voice data packet.
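The checksum steps above correspond to the standard 16-bit one's-complement Internet checksum. A minimal sketch is given below; the function name `internet_checksum` is hypothetical, and the caller is assumed to pass the bytes to be covered with the checksum field already cleared. The bytes of packet 0x8638 are not available here, so the value 0x2d0e is not reproduced.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of `data`.

    Assumes the checksum field inside `data` has already been zeroed.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input to an even number of bytes
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add big-endian 16-bit words
    while total >> 16:
        # Fold the carry (upper part) back into the lower 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF  # bitwise inversion of the lower two bytes
```

Note that the standard UDP checksum also covers a pseudo-header built from the IP addresses, protocol number, and UDP length; in this sketch the caller would have to prepend those bytes to `data`.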

Based on the method for processing a voice data packet provided in the foregoing embodiment, an embodiment of the present invention provides an apparatus for processing a voice data packet. Fig. 7 is a schematic structural diagram of the apparatus for processing a voice data packet provided in the embodiment of the present invention. As shown in Fig. 7, the apparatus includes a determining unit 101 and a processing unit 102, where:

The determining unit 101 is configured to determine the data packet type of a received first voice data packet and the TS_STRIDE value of the first voice data packet header compressed message corresponding to the type of the first voice data packet, where the data packet type includes a wideband voice data packet and a narrowband voice data packet, and the TS_STRIDE value of a wideband voice data packet header compressed message is different from the TS_STRIDE value of a narrowband voice data packet header compressed message.

The processing unit 102 is configured to sequentially determine, within the range of a set number, the TS value of the first voice data packet header compressed message according to the set number of TS_SCALED values of the first voice data packet header compressed message and the TS_STRIDE value of the first voice data packet header compressed message determined by the determining unit 101, until a determined TS value allows the correct first voice data packet to be restored.

In this embodiment of the present invention, the determining unit 101 is further configured to determine the set number of TS_SCALED values of the first voice data packet header compressed message in the following manner:

determining the TS_SCALED value of a second voice data packet header compressed message, wherein the second voice data packet is a voice data packet that was received before the first voice data packet and was correctly restored;

according to the formula TS_SCALED_i = TS_SCALED_{i-n} + n + p, determining at least one TS_SCALED value of the first voice data packet header compressed message.

Wherein TS_SCALED_i is the TS_SCALED value of the first voice data packet header compressed message, TS_SCALED_{i-n} is the TS_SCALED value of the second voice data packet header compressed message, p is an integer from 0 to 8 determined according to the voice sampling period, i and n are positive integers, n is smaller than i, and n is the difference between the receiving sequence number of the first voice data packet and the receiving sequence number of the second voice data packet.

Specifically, the processing unit 102 sequentially determines the TS value of the first voice data packet header compressed message within the range of the set number in the following manner, until a determined TS value allows the correct first voice data packet to be restored:

setting the initial value of p to 0 and incrementing p by 1 at each step, determining the TS value of the first voice data packet header compressed message in turn until a determined TS value allows the correct first voice data packet to be restored.

Specifically, the fact that the TS value determined by the processing unit 102 can be restored to obtain the correct first voice data packet means:

In this embodiment of the present invention, the first voice data packet processed by the processing unit 102 is a voice data packet received when the voice activation period and the voice silence period are switched.

In the embodiment of the present invention, a set number of TS_SCALED values of the first voice data packet header compressed message and the TS_STRIDE value of the first voice data packet header compressed message are determined; TS values are determined successively, within the range of the set number, from these TS_SCALED values; and the first voice data packet is restored trial by trial according to the determined TS values until a determined TS value allows the correct first voice data packet to be restored. That is, the correct TS value is determined through multiple attempts, which improves the accuracy of restoring the voice data packet according to the determined TS value.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for processing voice data packets, comprising:

determining the data packet type of a received first voice data packet and a time stamp multiple TS_STRIDE value of a compressed message of a first voice data packet header corresponding to the first voice data packet type;

the data packet types comprise a broadband voice data packet and a narrowband voice data packet, and the TS_STRIDE value of a broadband voice data packet header compressed message is different from the TS_STRIDE value of a narrowband voice data packet header compressed message;

according to the time stamp proportion TS_SCALED value of the first voice data packet header compressed message with the set quantity and the TS_STRIDE value of the first voice data packet header compressed message, the time stamp TS value of the first voice data packet header compressed message is sequentially determined within the set quantity range until the determined TS value can be restored to obtain a correct first voice data packet;

the TS_SCALED value of the compressed message of the first voice data packet header of the set number is determined in the following manner:

determining a TS_SCALED value of a compressed message of a second voice data packet header, wherein the second voice data packet is a voice data packet which is received before the first voice data packet is received and is correctly restored;

according to the formula TS_SCALED_i = TS_SCALED_{i-n} + n + p, determining at least one TS_SCALED value of the first voice data packet header compressed message;

wherein TS_SCALED_i is the TS_SCALED value of the first voice data packet header compressed message, TS_SCALED_{i-n} is the TS_SCALED value of the compressed message of the second voice data packet header, p is an integer from 0 to 8 determined according to the voice sampling period, i and n are positive integers, n is smaller than i, and n is the difference between the receiving sequence number of the first voice data packet and the receiving sequence number of the second voice data packet.

2. The method of claim 1, wherein the sequentially determining the TS values of the compressed packets in the first voice packet header within the set number range until the determined TS values can be restored to obtain the correct first voice packet includes:

3. The method of claim 2, wherein the determination that the TS value can be restored to obtain the correct first voice packet is:

4. The method of any of claims 1-3, wherein the first voice data packet is a voice data packet received while a voice active period and a voice silence period are switched.

5. An apparatus for processing voice data packets, comprising:

the determining unit is used for determining the type of a received data packet of a first voice data packet and a time stamp multiple TS_STRIDE value of a first voice data packet header compressed message corresponding to the type of the first voice data packet, wherein the type of the data packet comprises a broadband voice data packet and a narrowband voice data packet, and the TS_STRIDE value of the broadband voice data packet header compressed message is different from the TS_STRIDE value of the narrowband voice data packet header compressed message;

the processing unit is used for sequentially determining the time stamp TS values of the first voice data packet header compressed messages within the range of the set number according to the time stamp proportion TS_SCALED values of the first voice data packet header compressed messages of the set number and the TS_STRIDE value of the first voice data packet header compressed messages determined by the determining unit until the determined TS values can be restored to obtain correct first voice data packets;

the determining unit is configured to determine the TS_SCALED value of the set number of first voice data packet header compressed messages by using the following method:

according to the formula TS_SCALED_i = TS_SCALED_{i-n} + n + p, determining at least one TS_SCALED value of the first voice data packet header compressed message;

6. The apparatus according to claim 5, wherein the processing unit sequentially determines the TS values of the first voice packet header compact packets within the set number range in the following manner until the determined TS values can be restored to obtain correct first voice packets:

7. The apparatus of claim 6, wherein the determination by the processing unit that the TS value can be restored to obtain the correct first voice packet is:

8. The apparatus of any of claims 5-7, wherein the first voice data packet processed by the processing unit is a voice data packet received while a voice active period and a voice silence period are switched.