CN105187167B

CN105187167B - Voice data communication method and device

Info

Publication number: CN105187167B
Application number: CN201510633375.6A
Authority: CN
Inventors: 周芳; 刘丽; 成家雄; 李博; 同鑫; 高盛
Original assignee: Guangzhou Baiguoyuan Information Technology Co Ltd
Current assignee: Bigo Technology Singapore Pte Ltd
Priority date: 2015-09-28
Filing date: 2015-09-28
Publication date: 2018-11-06
Anticipated expiration: 2035-09-28
Also published as: CN105187167A

Abstract

An embodiment of the invention discloses a voice data communication method, comprising: while two parties are in a real-time voice call, the voice data sender detects the network quality of both parties and adjusts the speech frame encoding sample rate and the speech frame packetization format according to the network quality; when the network quality is below a predetermined quality threshold, the sender reduces the speech frame encoding sample rate and generates a voice packet in a first packetization format, where a voice packet in the first packetization format comprises one voice packet header and at least two speech frames; and the sender sends the voice packet in the first packetization format to the voice data recipient. A related device is also disclosed. The invention addresses the prior-art limitation of relying on a change of encoding mode alone while keeping speech undistorted. It also reduces the transmission overhead of a voice packet other than the voice data itself, greatly reduces transmission delay, effectively reduces the number of playback stutters, and ensures smooth voice playback.

Description

Voice data communication method and device

Technical field

The present invention relates to the field of communications, and in particular to a voice data communication method and device.

Background

With the rapid development of the Internet and communication technology, the mobile Internet has become an indispensable part of people's lives. In recent years, Internet telephony has been widely used and increasingly favored because of its low cost. However, compared with traditional carrier telephony, the call quality of Internet telephony is still limited by the Internet itself.

At present, domestic networks are still being developed and improved. Each major carrier operates its own mobile network, and regional constraints add to this, so network quality is inevitably uneven. To effectively address the voice transmission problems caused by the instability of the Internet, the sender detects the current network conditions in real time and automatically adjusts the voice sending bit rate according to the current network quality, so that voice data transmission is optimized and playback at the recipient stays smooth.

The traditional way to adjust the bit rate is to change the speech encoding: during a call, once network quality becomes poor, the encoding sample rate is reduced, and it is restored once the network recovers. This does effectively reduce the voice bit rate, but it is not sufficient to ensure a smooth call over a severely degraded network, such as a bandwidth-limited 2G network or a congested network. Moreover, reducing the encoding sample rate inevitably affects voice quality, so changing the encoding mode alone has limits if the speech is to remain undistorted.

Summary of the invention

The technical problem to be solved by the embodiments of the present invention is to provide a voice data communication method and a voice data communication device that address the prior-art limitation of changing the encoding mode alone while keeping speech undistorted.

In a first aspect, an embodiment of the present invention provides a voice data communication method, comprising:

when two parties are in a real-time voice call, detecting, by a voice data sender, the network quality of both parties, and adjusting the speech frame encoding sample rate and the speech frame packetization format according to the network quality;

when the network quality is below a predetermined quality threshold, reducing the speech frame encoding sample rate and generating a voice packet in a first packetization format, wherein the voice packet in the first packetization format comprises one voice packet header and at least two speech frames; and

sending, by the voice data sender, the voice packet in the first packetization format to a voice data recipient.

With reference to the first aspect, in a first possible implementation, detecting the network quality of both parties comprises:

detecting the network types of the voice data sender and the voice data recipient; and/or

assessing the state of the communication link between the voice data sender and the voice data recipient.

With reference to the first aspect, in a second possible implementation, after the voice data sender sends the voice packet in the first packetization format to the voice data recipient, the method further comprises:

when the network quality is not below the predetermined quality threshold, increasing the speech frame encoding sample rate and generating a voice packet in a second packetization format, wherein the voice packet in the second packetization format comprises one voice packet header and at least one speech frame; and

sending, by the voice data sender, the voice packet in the second packetization format to the voice data recipient.

With reference to the first aspect, the first possible implementation of the first aspect, or the second possible implementation of the first aspect, in a third possible implementation, different speech frame encoding sample rates are associated with different speech frame packetization formats, and generating the voice packet in the first packetization format comprises:

determining, according to the reduced speech frame encoding sample rate, the associated speech frame packetization format, wherein the associated format specifies packing N speech frames per packet and N is a natural number greater than or equal to 2; and

generating the voice packet in the first packetization format according to the determined speech frame packetization format, wherein the voice packet in the first packetization format comprises one voice packet header and N speech frames.

With reference to the third possible implementation of the first aspect, in a fourth possible implementation, adjusting the speech frame encoding sample rate and the speech frame packetization format according to the network quality comprises:

initializing the speech frame encoding sample rate, and presetting the number of speech frames packed per packet in the speech frame packetization format, according to the detected network types of the voice data sender and the voice data recipient.

In a second aspect, an embodiment of the present invention provides a voice data communication device, the voice data communication device being a voice data sender and comprising:

a detection module, configured to detect the network quality of both parties when two parties are in a real-time voice call;

an adjustment module, configured to adjust the speech frame encoding sample rate and the speech frame packetization format according to the network quality;

a first generation module, configured to, when the network quality is below a predetermined quality threshold, reduce the speech frame encoding sample rate and generate a voice packet in a first packetization format, wherein the voice packet in the first packetization format comprises one voice packet header and at least two speech frames; and

a voice sending module, configured to send the voice packet in the first packetization format to a voice data recipient.

With reference to the second aspect, in a first possible implementation, the detection module comprises:

a detection unit, configured to detect the network types of the voice data sender and the voice data recipient; and/or

an assessment unit, configured to assess the state of the communication link between the voice data sender and the voice data recipient.

With reference to the second aspect, in a second possible implementation, the device further comprises:

a second generation module, configured to, after the voice sending module sends the voice packet in the first packetization format to the voice data recipient and when the network quality is not below the predetermined quality threshold, increase the speech frame encoding sample rate and generate a voice packet in a second packetization format, wherein the voice packet in the second packetization format comprises one voice packet header and at least one speech frame;

wherein the voice sending module is further configured to send the voice packet in the second packetization format to the voice data recipient.

With reference to the second aspect, the first possible implementation of the second aspect, or the second possible implementation of the second aspect, in a third possible implementation, different speech frame encoding sample rates are associated with different speech frame packetization formats, and the first generation module comprises:

an analysis unit, configured to determine, according to the reduced speech frame encoding sample rate, the associated speech frame packetization format, wherein the associated format specifies packing N speech frames per packet and N is a natural number greater than or equal to 2; and

a voice packet generation unit, configured to generate the voice packet in the first packetization format according to the determined speech frame packetization format, wherein the voice packet in the first packetization format comprises one voice packet header and N speech frames.

With reference to the third possible implementation of the second aspect, in a fourth possible implementation, the adjustment module comprises:

an initialization unit, configured to initialize the speech frame encoding sample rate according to the detected network types of the voice data sender and the voice data recipient; and

a speech frame presetting unit, configured to preset the number of speech frames packed per packet in the speech frame packetization format according to the detected network types of the voice data sender and the voice data recipient.

By implementing the embodiments of the present invention, the voice data sender detects the network quality of both parties; when the network quality is below the predetermined quality threshold, it reduces the speech frame encoding sample rate and generates a voice packet in the first packetization format, which comprises one voice packet header and at least two speech frames; and it sends that voice packet to the voice data recipient. Combining a reduced encoding sample rate with multi-frame packetization addresses the prior-art limitation of changing the encoding mode alone while keeping speech undistorted. The encoding and packetization strategy can be adjusted in real time so that voice is transmitted at the most suitable bit rate. This reduces the transmission overhead of a voice packet other than the voice data itself, greatly reduces transmission delay, effectively reduces the number of playback stutters, and ensures smooth voice playback.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow diagram of the voice data communication method provided by the present invention;

Fig. 2 is a schematic diagram of the single-speech-frame packetization format in the prior art;

Fig. 3 is a schematic diagram of the speech frame packetization format provided by an embodiment of the present invention;

Fig. 4 is a flow diagram of another embodiment of the voice data communication method provided by the present invention;

Fig. 5 is a structural schematic diagram of the voice data communication device provided by an embodiment of the present invention;

Fig. 6 is a structural schematic diagram of the detection module provided by an embodiment of the present invention;

Fig. 7 is a structural schematic diagram of another embodiment of the voice data communication device provided by the present invention;

Fig. 8 is a structural schematic diagram of the first generation module provided by an embodiment of the present invention;

Fig. 9 is a structural schematic diagram of the adjustment module provided by an embodiment of the present invention;

Fig. 10 is a structural schematic diagram of another embodiment of the voice data communication device provided by the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

It should be noted that the terms used in the embodiments of the present invention serve only to describe specific embodiments and are not intended to limit the invention. The singular forms "a", "said", and "the" used in the embodiments of the present invention and in the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used herein refers to and includes any or all possible combinations of one or more of the associated listed items.

Referring to Fig. 1, which is a flow diagram of the voice data communication method provided by the present invention, the method includes:

Step S100: When two parties are in a real-time voice call, the voice data sender detects the network quality of both parties and adjusts the speech frame encoding sample rate and the speech frame packetization format according to the network quality;

Specifically, users can hold real-time voice communication over the Internet, for example a telephone chat between user A and user B, or a conference call among multiple users. The voice data sender detects the network quality of both parties in real time and can then automatically adjust the speech frame encoding sample rate and the speech frame packetization format based on the detected network quality. When the voice data sender captures voice data and sends it to the voice data recipient, it samples at the adjusted speech frame encoding sample rate and generates voice packets according to the adjusted speech frame packetization format.

Further, in the embodiments of the present invention, detecting the network quality of both parties may include: detecting the network types of the voice data sender and the voice data recipient; and/or assessing the state of the communication link between the voice data sender and the voice data recipient. For example, when the network type of at least one party is detected to be 2G, the network quality can be judged to be below the predetermined quality threshold; likewise, when the communication link between the two parties is assessed to be congested, the network quality can be judged to be below the predetermined quality threshold; and so on.
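The two detection signals in the paragraph above can be combined into a single poor-network decision. The following is a minimal sketch only: the function name `is_below_quality_threshold`, the set of network-type strings, and the boolean congestion flag are assumptions made for illustration, since the text gives 2G and congestion only as examples.

```python
# Network types treated as poor on their own. The text names only 2G as an
# example; this set is an assumption for illustration.
POOR_NETWORK_TYPES = {"2g"}


def is_below_quality_threshold(sender_type: str, receiver_type: str,
                               link_congested: bool) -> bool:
    """Return True when the network quality counts as below the threshold.

    Quality is judged poor if at least one party is on a poor network type,
    or if the link between the two parties is assessed as congested.
    """
    either_poor_type = any(t.lower() in POOR_NETWORK_TYPES
                           for t in (sender_type, receiver_type))
    return either_poor_type or link_congested
```

For instance, a Wi-Fi sender talking to a 2G recipient would be classified as poor even when the link itself is not reported as congested.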

It should be noted that the embodiments of the present invention adjust the speech frame packetization format by setting different packetization strategies, which allows the transmission bit rate to be adjusted more effectively. Different speech frame packetization formats may specify that a voice packet contains different numbers of speech frames.

Step S102: When the network quality is below the predetermined quality threshold, reduce the speech frame encoding sample rate and generate a voice packet in the first packetization format;

Specifically, a voice packet in the first packetization format comprises one voice packet header and at least two speech frames. The two parties may preset a quality threshold that indicates whether the current network is congested or its bandwidth has dropped. When the network quality is found to be below the predetermined quality threshold, the speech frame encoding sample rate can be reduced and a voice packet in the first packetization format generated.
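The threshold comparison in this step can be sketched as follows. The numeric quality score, the `EncodingPlan` type, and the two operating points (16 kHz with one frame per packet, 8 kHz with two) are hypothetical; the text does not give concrete values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingPlan:
    sample_rate_hz: int      # speech frame encoding sample rate
    frames_per_packet: int   # speech frames sharing one packet header


# Hypothetical operating points, chosen only for illustration.
NORMAL_PLAN = EncodingPlan(sample_rate_hz=16000, frames_per_packet=1)
DEGRADED_PLAN = EncodingPlan(sample_rate_hz=8000, frames_per_packet=2)


def choose_plan(network_quality: float, quality_threshold: float) -> EncodingPlan:
    """Pick the degraded plan only when quality is strictly below the threshold."""
    if network_quality < quality_threshold:
        return DEGRADED_PLAN
    return NORMAL_PLAN
```

A quality equal to the threshold keeps the normal plan, matching the "not less than" wording used for the recovery case.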

It should be noted that Fig. 2 shows the principle of the prior-art single-frame packetization format. The voice packet header accounts for the largest share of the whole packet, whereas only the speech frame is effective voice data. When single-frame voice packets are transmitted, the headers therefore consume most of the network bandwidth, and on a congested or low-bandwidth network this easily leads to packet loss or very large transmission delay, which causes stuttering at the playback end. Fig. 3 shows the principle of the speech frame packetization format provided by an embodiment of the present invention, in which multiple speech frames share one voice packet header. Fig. 3 takes a packet containing 3 speech frames as an example; compared with transmitting each speech frame separately, this reduces the network overhead of packet headers and saves transmission bandwidth.
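The saving from sharing one header can be made concrete with a little arithmetic. The byte sizes below (a 40-byte header and 20-byte speech frames) are hypothetical values chosen for illustration; the text does not state actual sizes.

```python
def header_overhead(header_bytes: int, frame_bytes: int,
                    frames_per_packet: int) -> float:
    """Fraction of the transmitted bytes that is header rather than speech."""
    payload = frame_bytes * frames_per_packet
    return header_bytes / (header_bytes + payload)


# Assumed sizes: 40-byte header, 20-byte encoded speech frame.
single = header_overhead(40, 20, 1)   # one frame per packet: 40 / 60
triple = header_overhead(40, 20, 3)   # three frames per packet: 40 / 100
```

Under these assumed sizes the header share falls from about two thirds of each packet to 40 percent when three frames share one header.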

Step S104: The voice data sender sends the voice packet in the first packetization format to the voice data recipient.

Specifically, the two parties may negotiate the speech frame packetization formats in advance. When the voice data sender sends a voice packet in the first packetization format, the voice data recipient therefore knows the format, can parse the voice packet successfully to obtain the voice data, and plays it to the user.
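Because both sides agree on the layout in advance, the recipient can split a multi-frame packet back into individual frames. The sketch below assumes a hypothetical layout (a 16-bit sequence number and an 8-bit frame count in the header, then each frame prefixed by a 16-bit length); the text does not define the header fields, so every field here is an illustration only.

```python
import struct
from typing import List, Tuple

# Hypothetical header: sequence number (uint16) and frame count (uint8),
# in network byte order. These fields are assumptions for illustration.
_HEADER = struct.Struct("!HB")
_FRAME_LEN = struct.Struct("!H")


def pack_voice_packet(seq: int, frames: List[bytes]) -> bytes:
    """Build one packet: a single header followed by length-prefixed frames."""
    parts = [_HEADER.pack(seq, len(frames))]
    for frame in frames:
        parts.append(_FRAME_LEN.pack(len(frame)))
        parts.append(frame)
    return b"".join(parts)


def unpack_voice_packet(packet: bytes) -> Tuple[int, List[bytes]]:
    """Recover the sequence number and the individual speech frames."""
    seq, count = _HEADER.unpack_from(packet, 0)
    offset = _HEADER.size
    frames = []
    for _ in range(count):
        (length,) = _FRAME_LEN.unpack_from(packet, offset)
        offset += _FRAME_LEN.size
        frames.append(packet[offset:offset + length])
        offset += length
    return seq, frames
```

Carrying the frame count in the header is one possible design; it lets the same parser handle packets with any number of frames without a separate signalling step.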

With this embodiment of the present invention, the voice data sender detects the network quality of both parties; when the network quality is below the predetermined quality threshold, it reduces the speech frame encoding sample rate and generates a voice packet in the first packetization format, which comprises one voice packet header and at least two speech frames; and it sends that voice packet to the voice data recipient. Combining a reduced encoding sample rate with multi-frame packetization addresses the prior-art limitation of changing the encoding mode alone while keeping speech undistorted. The encoding and packetization strategy can be adjusted in real time so that voice is transmitted at the most suitable bit rate. This reduces the transmission overhead of a voice packet other than the voice data itself, greatly reduces transmission delay, effectively reduces the number of playback stutters, and ensures smooth voice playback.

Still further, in the embodiments of the present invention, different speech frame encoding sample rates may be associated with different speech frame packetization formats. Fig. 4 is a flow diagram of another embodiment of the voice data communication method provided by the present invention, which includes:

Step S400: When two parties are in a real-time voice call, the voice data sender detects the network quality of both parties and adjusts the speech frame encoding sample rate and the speech frame packetization format according to the network quality;

Step S402: When the network quality is below the predetermined quality threshold, reduce the speech frame encoding sample rate;

Specifically, for the implementation of steps S400 and S402, refer to the corresponding description in the embodiment above; it is not repeated here.

Step S404: According to the reduced speech frame encoding sample rate, determine the associated speech frame packetization format;

Specifically, the voice data sender may preset multiple speech frame encoding sample rates, each associated with a different speech frame packetization format, where each format specifies packing a different number of speech frames. The packetization format associated with the reduced speech frame encoding sample rate can then be looked up. That format specifies packing N speech frames per packet, where N is a natural number greater than or equal to 2;
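The preset association can be pictured as a lookup table keyed by sample rate. The rates and frame counts below are hypothetical, since the text gives no concrete pairs; only the constraint that a reduced rate maps to N of at least 2 comes from the description.

```python
# Hypothetical association table: encoding sample rate (Hz) -> frames per packet.
FRAMES_PER_PACKET_BY_RATE = {
    16000: 1,
    12000: 2,
    8000: 3,
}


def frames_for_reduced_rate(sample_rate_hz: int) -> int:
    """Look up N for a reduced sample rate; N must be at least 2."""
    n = FRAMES_PER_PACKET_BY_RATE[sample_rate_hz]
    if n < 2:
        raise ValueError("a reduced sample rate must map to N >= 2 frames")
    return n
```

With this table, reducing the rate to 8000 Hz would select a format that packs three speech frames behind one header.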

Step S406: According to the determined speech frame packetization format, generate a voice packet in the first packetization format;

Specifically, the voice packet in the first packetization format comprises one voice packet header and the N speech frames described above.

Step S408: The voice data sender sends the voice packet in the first packetization format to the voice data recipient;

Step S4010: When the network quality is not below the predetermined quality threshold, increase the speech frame encoding sample rate and generate a voice packet in a second packetization format;

Specifically, the embodiments of the present invention detect the network quality of both parties in real time. Once the network quality has recovered, that is, when it is not below the predetermined quality threshold, the speech frame encoding sample rate can be increased to restore voice quality, and a voice packet in a second packetization format is generated according to the increased sample rate. A voice packet in the second packetization format comprises one voice packet header and at least one speech frame.
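Switching formats during a call implies that the sender groups encoded frames by the current frames-per-packet setting and can change that setting when the network recovers. The class below is a hedged sketch of such grouping; the name `FrameGrouper` and its methods are hypothetical and not taken from the text.

```python
from typing import List, Optional


class FrameGrouper:
    """Buffers encoded speech frames and releases them in groups of N."""

    def __init__(self, frames_per_packet: int) -> None:
        self.frames_per_packet = frames_per_packet
        self._pending: List[bytes] = []

    def set_frames_per_packet(self, n: int) -> None:
        """Switch packet format, e.g. back to fewer frames after recovery."""
        if n < 1:
            raise ValueError("a packet carries at least one speech frame")
        self.frames_per_packet = n

    def push(self, frame: bytes) -> Optional[List[bytes]]:
        """Add a frame; return a full group once enough frames are buffered."""
        self._pending.append(frame)
        if len(self._pending) >= self.frames_per_packet:
            group, self._pending = self._pending, []
            return group
        return None
```

In this sketch a grouper created with two frames per packet holds the first frame and releases both on the second; after `set_frames_per_packet(1)` every pushed frame is released immediately, which corresponds to a packet with one header and one speech frame.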

Step S4012: The voice data sender sends the voice packet in the second packetization format to the voice data recipient.

Still further, in the embodiments of the present invention, adjusting the speech frame encoding sample rate and the speech frame packetization format according to the network quality may specifically include: according to the detected network types of the voice data sender and the voice data recipient, initializing the speech frame encoding sample rate and presetting the number of speech frames packed per packet in the speech frame packetization format.

Specifically, different speech frame encoding sample rates can be initialized for the different network types detected for the two parties, and the number of speech frames packed per packet can be preset based on, for example, test experience. This avoids frequent adjustment of the speech frame encoding sample rate and the speech frame packetization format on mobile networks that are prone to congestion or are bandwidth-limited, and further saves data traffic.
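Initialization by network type can be sketched as a table of starting points. All values below are hypothetical, and the rule of starting from the more conservative of the two parties' settings is an assumption of this sketch; the text says only that such values may be derived from test experience.

```python
from typing import Tuple

# Hypothetical starting points per network type:
# (encoding sample rate in Hz, speech frames per packet).
INITIAL_SETTINGS = {
    "wifi": (16000, 1),
    "4g": (16000, 1),
    "3g": (12000, 2),
    "2g": (8000, 3),
}


def initial_settings(sender_type: str, receiver_type: str) -> Tuple[int, int]:
    """Start from the more conservative setting of the two parties."""
    candidates = [INITIAL_SETTINGS[t.lower()]
                  for t in (sender_type, receiver_type)]
    # A lower sample rate is treated as more conservative; ties keep the first.
    return min(candidates, key=lambda s: s[0])
```

Starting a call between a Wi-Fi sender and a 2G recipient at the 2G setting avoids an immediate downward adjustment once the call begins.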

The method of the embodiments of the present invention has been described above. To better implement the above solutions, related devices that cooperate in implementing them are provided below.

Fig. 5 is a structural schematic diagram of the voice data communication device provided by an embodiment of the present invention. The voice data communication device 50 is the voice data sender of the method embodiments, and may include a detection module 500, an adjustment module 502, a first generation module 504, and a voice sending module 506, wherein:

the detection module 500 is configured to detect the network quality of both parties when two parties are in a real-time voice call;

the adjustment module 502 is configured to adjust the speech frame encoding sample rate and the speech frame packetization format according to the network quality;

the first generation module 504 is configured to, when the network quality is below the predetermined quality threshold, reduce the speech frame encoding sample rate and generate a voice packet in the first packetization format, wherein the voice packet in the first packetization format comprises one voice packet header and at least two speech frames; and

the voice sending module 506 is configured to send the voice packet in the first packetization format to the voice data recipient.

Specifically, Fig. 6 is a structural schematic diagram of the detection module provided by an embodiment of the present invention. The detection module 500 may include a detection unit 5000 and/or an assessment unit 5002; Fig. 6 illustrates the case in which both units are included, wherein:

the detection unit 5000 is configured to detect the network types of the voice data sender and the voice data recipient; and

the assessment unit 5002 is configured to assess the state of the communication link between the voice data sender and the voice data recipient.

Further, Fig. 7 is a structural schematic diagram of another embodiment of the voice data communication device provided by the present invention. In addition to the detection module 500, the adjustment module 502, the first generation module 504, and the voice sending module 506, the voice data communication device 50 may also include a second generation module 508, configured to, after the voice sending module 506 sends the voice packet in the first packetization format to the voice data recipient and when the network quality is not below the predetermined quality threshold, increase the speech frame encoding sample rate and generate a voice packet in a second packetization format, wherein the voice packet in the second packetization format comprises one voice packet header and at least one speech frame;

the voice sending module 506 is further configured to send the voice packet in the second packetization format to the voice data recipient.

Still further, in the embodiments of the present invention, different speech frame encoding sample rates are associated with different speech frame packetization formats. Fig. 8 is a structural schematic diagram of the first generation module provided by an embodiment of the present invention. The first generation module 504 may include an analysis unit 5040 and a voice packet generation unit 5042, wherein:

the analysis unit 5040 is configured to determine, according to the reduced speech frame encoding sample rate, the associated speech frame packetization format, wherein the associated format specifies packing N speech frames per packet and N is a natural number greater than or equal to 2; and

the voice packet generation unit 5042 is configured to generate the voice packet in the first packetization format according to the determined speech frame packetization format, wherein the voice packet in the first packetization format comprises one voice packet header and N speech frames.

Still further, Fig. 9 is a structural schematic diagram of the adjustment module provided by an embodiment of the present invention. The adjustment module 502 may include an initialization unit 5020 and a speech frame presetting unit 5022, wherein:

the initialization unit 5020 is configured to initialize the speech frame encoding sample rate according to the detected network types of the voice data sender and the voice data recipient; and

the speech frame presetting unit 5022 is configured to preset the number of speech frames packed per packet in the speech frame packetization format according to the detected network types of the voice data sender and the voice data recipient.

The voice data communication device 50 of the embodiments of the present invention may be, for example, a tablet computer, a personal digital assistant, a smart mobile terminal, or other user equipment capable of Internet voice calls.

It can be understood that the functions of the functional modules of the voice data communication device 50 of this embodiment can be implemented according to the methods in the method embodiments above; details are not repeated here.

Refer to Fig. 10, which is a structural schematic diagram of another embodiment of the voice data communication device provided by the present invention. As shown in Fig. 10, the voice data communication device 100 may include at least one processor 1001 (for example a CPU), at least one network interface 1004, a user interface 1003, a memory 1005, at least one communication bus 1002, and a display screen 1006. The communication bus 1002 implements connection and communication among these components. The user interface 1003 may include a touch screen, a keyboard, a mouse, or the like. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, for example at least one magnetic disk memory; in the embodiments of the present invention the memory 1005 includes flash memory. Optionally, the memory 1005 may also be at least one storage system located remotely from the processor 1001. As shown in Fig. 10, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a voice data communication program.

The processor 1001 may be configured to call the voice data communication program stored in the memory 1005 and perform the following operations:

when two parties are in a real-time voice call, detecting the network quality of both parties through the network interface 1004, and adjusting the speech frame encoding sample rate and the speech frame packetization format according to the network quality; when the network quality is below the predetermined quality threshold, reducing the speech frame encoding sample rate and generating a voice packet in the first packetization format, wherein the voice packet in the first packetization format comprises one voice packet header and at least two speech frames; and sending the voice packet in the first packetization format to the voice data recipient through the network interface 1004.

Specifically, the detecting, by the processor 1001, of the network quality of both parties may include:

detecting the network types of the voice data sender and the voice data recipient through the network interface 1004; and/or

assessing the state of the communication link between the voice data sender and the voice data recipient through the network interface 1004.

Specifically, after the voice packet in the first packetization format is sent to the voice data recipient, the processor 1001 may further perform:

when the network quality is not below the predetermined quality threshold, increasing the speech frame encoding sample rate and generating a voice packet in a second packetization format, wherein the voice packet in the second packetization format comprises one voice packet header and at least one speech frame; and sending the voice packet in the second packetization format to the voice data recipient through the network interface 1004.

Specifically, different speech frame coding sample rates are associated with different speech frame packing formats. The generation of the voice packet in the first packing format by the processor 1001 may include:

determining, according to the reduced speech frame coding sample rate, the correspondingly associated speech frame packing format, where the associated speech frame packing format indicates that N speech frames are packed into each packet and N is a natural number greater than or equal to 2; and generating the voice packet in the first packing format according to the determined associated speech frame packing format, wherein the voice packet in the first packing format includes one voice packet header and N speech frames.
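The association between sample rate and packing format, and the layout of one header followed by N speech frames, can be sketched as below. The byte layout (a 3-byte header carrying the sample rate and frame count, and a 2-byte length prefix per frame), the rate-to-N table, and the function names are hypothetical; the embodiment does not define a wire format.

```python
import struct

# Assumed association between coding sample rate (Hz) and frames per packet.
FRAMES_PER_PACKET = {16000: 1, 8000: 2}

# Hypothetical header: sample rate (uint16) and frame count (uint8), big-endian.
HEADER = struct.Struct("!HB")
FRAME_LEN = struct.Struct("!H")  # hypothetical per-frame length prefix


def build_packet(sample_rate_hz: int, frames: list) -> bytes:
    """Pack N encoded speech frames behind a single voice packet header."""
    n = FRAMES_PER_PACKET[sample_rate_hz]
    if len(frames) != n:
        raise ValueError(f"expected {n} frames, got {len(frames)}")
    body = b"".join(FRAME_LEN.pack(len(f)) + f for f in frames)
    return HEADER.pack(sample_rate_hz, n) + body


def parse_packet(packet: bytes):
    """Recover the sample rate and the list of speech frames from a packet."""
    rate, n = HEADER.unpack_from(packet, 0)
    offset = HEADER.size
    frames = []
    for _ in range(n):
        (length,) = FRAME_LEN.unpack_from(packet, offset)
        offset += FRAME_LEN.size
        frames.append(packet[offset:offset + length])
        offset += length
    return rate, frames
```

In this sketch the receiver learns N from the header itself, so the sender can switch packing formats mid-call without separate signalling; that is a design choice of the illustration rather than something stated in the embodiment.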

Specifically, the adjustment of the speech frame coding sample rate and the speech frame packing format by the processor 1001 according to the network quality may include:

It should be noted that the functions of the functional modules of the voice data communication device 100 in the embodiments of the present invention may be implemented according to the methods in the foregoing method embodiments. For the specific implementation process, refer to the related descriptions of the foregoing method embodiments, which are not repeated here.

In summary, by implementing the embodiments of the present invention, the voice data sender detects the network quality of the two communicating parties; when the network quality is lower than the preset quality threshold, it reduces the speech frame coding sample rate and generates a voice packet in the first packing format, wherein the voice packet in the first packing format includes one voice packet header and at least two speech frames; and it sends the voice packet in the first packing format to the voice data receiver. By combining a reduced coding sample rate with the packing of multiple speech frames into one packet, the embodiments address the prior-art limitation on changing the coding mode while keeping the voice undistorted. The coding and packing strategy can be adjusted in real time so that voice is transmitted at the most suitable bit rate. The transmission overhead of a voice packet other than the voice data itself is reduced, transmission delay is greatly reduced, the number of stalls during voice playback is effectively lowered, and the fluency of voice playback is ensured.
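The reduction in non-voice overhead can be illustrated with simple arithmetic. The sketch below assumes a 40-byte per-packet header (a 20-byte IPv4 header, an 8-byte UDP header, and a 12-byte RTP header, which is a common voice-over-IP stack that the embodiments do not themselves specify) and a hypothetical 40-byte encoded speech frame.

```python
def overhead_ratio(header_bytes: int, frame_bytes: int, frames_per_packet: int) -> float:
    """Fraction of each packet that is header rather than voice data."""
    payload = frame_bytes * frames_per_packet
    return header_bytes / (header_bytes + payload)


# Assumed sizes: 40-byte IPv4/UDP/RTP header, 40-byte encoded speech frame.
single = overhead_ratio(40, 40, 1)   # one frame per header   -> 0.5
bundled = overhead_ratio(40, 40, 3)  # three frames per header -> 0.25
```

With these assumed sizes, packing three frames behind one header halves the share of each packet spent on headers, from 50% to 25%.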

A person of ordinary skill in the art will understand that all or part of the processes in the foregoing method embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.

What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of the claims of the present invention. A person skilled in the art will understand that all or part of the processes implementing the foregoing embodiments, and equivalent variations made in accordance with the claims of the present invention, still fall within the scope covered by the invention.

Claims

1. A voice data communication method, characterized in that it comprises:

when two parties to a voice communication are engaged in a real-time call, detecting, by a voice data sender, the network quality of the two communicating parties, and adjusting a speech frame coding sample rate and a speech frame packing format according to the network quality;

when the network quality is lower than a preset quality threshold, reducing the speech frame coding sample rate and generating a voice packet in a first packing format, wherein the voice packet in the first packing format comprises one voice packet header and at least two speech frames;

sending, by the voice data sender, the voice packet in the first packing format to a voice data receiver;

wherein different speech frame coding sample rates are associated with different speech frame packing formats, and generating the voice packet in the first packing format comprises:

determining, according to the reduced speech frame coding sample rate, the correspondingly associated speech frame packing format, the associated speech frame packing format indicating that N speech frames are packed into each packet, N being a natural number greater than or equal to 2;

generating the voice packet in the first packing format according to the determined associated speech frame packing format, wherein the voice packet in the first packing format comprises one voice packet header and N speech frames.

2. The method according to claim 1, characterized in that detecting the network quality of the two communicating parties comprises:

3. The method according to claim 1, characterized in that, after the voice data sender sends the voice packet in the first packing format to the voice data receiver, the method further comprises:

when the network quality is not lower than the preset quality threshold, increasing the speech frame coding sample rate and generating a voice packet in a second packing format, wherein the voice packet in the second packing format comprises one voice packet header and at least one speech frame;

4. The method according to claim 1, characterized in that adjusting the speech frame coding sample rate and the speech frame packing format according to the network quality comprises:

according to the detected network types of the voice data sender and the voice data receiver, initializing the speech frame coding sample rate and presetting the number of speech frames packed into each packet in the speech frame packing format.

5. A voice data communication device, characterized in that the voice data communication device is a voice data sender and comprises:

a first generation module, configured to, when the network quality is lower than a preset quality threshold, reduce the speech frame coding sample rate and generate a voice packet in a first packing format, wherein the voice packet in the first packing format comprises one voice packet header and at least two speech frames;

a voice sending module, configured to send the voice packet in the first packing format to a voice data receiver;

wherein different speech frame coding sample rates are associated with different speech frame packing formats, and the first generation module comprises:

an analysis unit, configured to determine, according to the reduced speech frame coding sample rate, the correspondingly associated speech frame packing format, the associated speech frame packing format indicating that N speech frames are packed into each packet, N being a natural number greater than or equal to 2;

a voice packet generation unit, configured to generate the voice packet in the first packing format according to the determined associated speech frame packing format, wherein the voice packet in the first packing format comprises one voice packet header and N speech frames.

6. The device according to claim 5, characterized in that the detection module comprises:

a detection unit, configured to detect the network types of the voice data sender and the voice data receiver; and/or

an evaluation unit, configured to evaluate the state of the communication link between the voice data sender and the voice data receiver.

7. The device according to claim 5, characterized in that it further comprises:

a second generation module, configured to, after the voice sending module sends the voice packet in the first packing format to the voice data receiver and when the network quality is not lower than the preset quality threshold, increase the speech frame coding sample rate and generate a voice packet in a second packing format, wherein the voice packet in the second packing format comprises one voice packet header and at least one speech frame;

8. The device according to claim 5, characterized in that the adjustment module comprises:

an initialization unit, configured to initialize the speech frame coding sample rate according to the detected network types of the voice data sender and the voice data receiver;

a speech frame presetting unit, configured to preset, according to the detected network types of the voice data sender and the voice data receiver, the number of speech frames packed into each packet in the speech frame packing format.