CN109218083B - Voice data transmission method and device - Google Patents

Publication number
CN109218083B
CN109218083B CN201810981340.5A CN201810981340A CN109218083B CN 109218083 B CN109218083 B CN 109218083B CN 201810981340 A CN201810981340 A CN 201810981340A CN 109218083 B CN109218083 B CN 109218083B
Authority
CN
China
Prior art keywords
packet loss
coding scheme
voice data
neural network
loss rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810981340.5A
Other languages
Chinese (zh)
Other versions
CN109218083A (en)
Inventor
王轶男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou lieyou Information Technology Co.,Ltd.
Original Assignee
Guangzhou Lieyou Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Lieyou Information Technology Co ltd
Priority to CN201810981340.5A
Publication of CN109218083A
Application granted
Publication of CN109218083B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/145 Network analysis or design involving simulating, designing, planning or modelling of a network
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/0001 Systems modifying transmission characteristics according to link quality, e.g. power backoff
    • H04L1/0009 Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the channel coding
    • H04L1/0011 Systems modifying transmission characteristics according to link quality, e.g. power backoff by adapting the channel coding applied to payload information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0056 Systems characterized by the type of code used
    • H04L1/0057 Block codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/14 Network analysis or design
    • H04L41/147 Network analysis or design for predicting network behaviour
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0823 Errors, e.g. transmission errors
    • H04L43/0829 Packet loss
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring

Abstract

A voice data transmission method and device are disclosed. The method comprises the following steps: constructing a neural network prediction model for predicting the network packet loss rate; receiving transmission request information including voice data to be transmitted; obtaining pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through the neural network prediction model according to the historical packet loss data; selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library, and coding the voice data to be transmitted according to the coding scheme to obtain coded data; and sending the coded data and the identification of the coding scheme to receiving equipment, so that the receiving equipment selects a corresponding decoding scheme according to the identification of the coding scheme and decodes the coded data to obtain decoded data, thereby completing the transmission of the voice data. Implementing the method and device effectively saves transmission bandwidth while ensuring communication quality, improves voice data transmission efficiency, and thereby improves voice data transmission performance.

Description

Voice data transmission method and device
Technical Field
The invention relates to the technical field of data communication, in particular to a voice data transmission method and device.
Background
Nowadays, as the mobile internet becomes increasingly widespread, real-time voice communication applications are increasingly popular, but packet loss in real-time voice communication is inevitable due to network conditions and related factors. To address the poor communication quality caused by packet loss, conventional voice data transmission methods either insert a filler packet (such as a silence packet, a noise packet, or a repeat of the previous packet) to repair the loss when an audio packet is lost, or encode the voice data with a fixed interleaving coding technique or a fixed forward error correction technique to reduce the packet loss rate. In practice, however, these existing methods tend to increase the signal transmission bandwidth, and their transmission efficiency and transmission performance are poor.
Disclosure of Invention
In view of the above problems, the present invention provides a method and an apparatus for transmitting voice data, which can effectively save transmission bandwidth, improve voice data transmission efficiency, and further improve voice data transmission performance on the premise of ensuring communication quality.
In order to achieve the purpose, the invention adopts the following technical scheme:
the first aspect of the present invention discloses a voice data transmission method, which includes:
constructing a neural network prediction model for predicting the network packet loss rate;
receiving transmission request information including voice data to be transmitted;
obtaining pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through the neural network prediction model according to the historical packet loss data;
selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library, and coding the voice data to be transmitted according to the coding scheme to obtain coded data;
and sending the coded data and the identification of the coding scheme to receiving equipment, so that the receiving equipment selects a corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and further finishing the transmission of voice data.
As an optional implementation manner, in the first aspect of the present invention, the neural network prediction model is a multi-layer feedforward neural network, and the constructing the neural network prediction model for predicting the network packet loss ratio includes:
establishing an initial neural network model through an error back propagation algorithm;
and performing error back-propagation training on the initial neural network model using a selected network without packet loss and a selected network with packet loss, to obtain a neural network prediction model for predicting the network packet loss rate.
As an optional implementation manner, in the first aspect of the present invention, the pre-stored historical packet loss data includes historical prediction data obtained by the error back propagation training and all network packet loss rates obtained by the neural network prediction model before the pre-stored historical packet loss data is obtained.
As an optional implementation manner, in the first aspect of the present invention, before the receiving transmission request information including voice data to be transmitted, the method further includes:
after a sending device establishes communication connection with a receiving device, sending a coding and decoding corresponding scheme to the receiving device, wherein the coding and decoding corresponding scheme comprises a plurality of coding scheme identifications and a decoding scheme corresponding to each coding scheme identification.
As an optional implementation manner, in the first aspect of the present invention, the selecting, from a pre-stored coding scheme library, a coding scheme matching the current packet loss rate includes:
judging whether the current packet loss rate is greater than a preset first packet loss threshold value or not;
if the current packet loss rate is judged to be greater than the first packet loss threshold, selecting, from a pre-stored coding scheme library, a forward error correction coding scheme whose redundancy depth is a preset first depth value as the coding scheme;
if the current packet loss rate is judged to be not greater than the first packet loss threshold, judging whether the current packet loss rate is greater than a preset second packet loss threshold, wherein the first packet loss threshold is greater than the second packet loss threshold;
if the current packet loss rate is judged to be greater than the second packet loss threshold, selecting, from the pre-stored coding scheme library, a forward error correction coding scheme whose redundancy depth is a preset second depth value as the coding scheme;
and if the current packet loss rate is judged to be not greater than the second packet loss threshold, selecting, from the pre-stored coding scheme library, a forward error correction coding scheme whose redundancy depth is a preset third depth value as the coding scheme.
A second aspect of the present invention discloses a voice data transmission apparatus, including:
the building module is used for building a neural network prediction model for predicting the network packet loss rate;
the receiving module is used for receiving transmission request information comprising voice data to be transmitted;
the prediction module is used for acquiring pre-stored historical packet loss data and predicting the current packet loss rate of the current network through the neural network prediction model according to the historical packet loss data;
the scheme selection module is used for selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library;
the coding module is used for coding the voice data to be transmitted according to the coding scheme to obtain coded data;
and the sending module is used for sending the coded data and the identification of the coding scheme to receiving equipment so that the receiving equipment selects a corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and further the transmission of voice data is completed.
As an alternative implementation, in the second aspect of the present invention, the neural network prediction model is a multi-layer feedforward neural network, and the building module includes:
the first submodule is used for establishing an initial neural network model through an error back propagation algorithm;
and the second sub-module is used for carrying out error back-propagation training on the initial neural network model through the selected packet loss-free network and the selected packet loss network to obtain a neural network prediction model for predicting the network packet loss rate.
As an optional implementation manner, in the second aspect of the present invention, the scheme selection module includes:
a third sub-module, configured to determine whether the current packet loss rate is greater than a preset first packet loss threshold;
a fourth sub-module, configured to select, when the third sub-module determines that the current packet loss rate is greater than the preset first packet loss threshold, a forward error correction coding scheme whose redundancy depth is a preset first depth value from a pre-stored coding scheme library as the coding scheme; and, when the third sub-module determines that the current packet loss rate is not greater than the preset first packet loss threshold, to determine whether the current packet loss rate is greater than a preset second packet loss threshold, wherein the preset first packet loss threshold is greater than the preset second packet loss threshold;
a fifth sub-module, configured to select, when the fourth sub-module determines that the current packet loss rate is greater than the preset second packet loss threshold, a forward error correction coding scheme whose redundancy depth is a preset second depth value from the pre-stored coding scheme library as the coding scheme; and, when the fourth sub-module determines that the current packet loss rate is not greater than the preset second packet loss threshold, to select a forward error correction coding scheme whose redundancy depth is a preset third depth value from the pre-stored coding scheme library as the coding scheme.
In a third aspect, the present invention discloses a computer device, which includes a memory for storing a computer program and a processor for executing the computer program to make the computer device execute part or all of the voice data transmission method disclosed in the first aspect.
A fourth aspect of the present invention discloses a computer-readable storage medium storing the computer program for use in the computer apparatus of the third aspect.
According to the voice data transmission method and the voice data transmission device, a neural network prediction model for predicting the network packet loss rate is constructed; when receiving transmission request information including voice data to be transmitted, acquiring pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through a neural network prediction model according to the historical packet loss data; then selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library, and further coding the voice data to be transmitted according to the coding scheme to obtain coded data; and finally, the coded data and the identification of the coding scheme are sent to a receiving device, so that the receiving device selects the corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and then the transmission of the voice data is completed.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described below, and it should be understood that the following drawings only illustrate some embodiments of the present invention, and therefore should not be considered as limiting the scope of the present invention.
Fig. 1 is a flowchart illustrating a voice data transmission method according to an embodiment of the present invention;
Fig. 2 is a flowchart illustrating a voice data transmission method according to a second embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a voice data transmission apparatus according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a voice data transmission apparatus according to a fourth embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Aiming at the problems in the prior art, the invention provides a voice data transmission method and a voice data transmission device; firstly, constructing a neural network prediction model for predicting the network packet loss rate; when receiving transmission request information including voice data to be transmitted, acquiring pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through a neural network prediction model according to the historical packet loss data; then selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library, and further coding the voice data to be transmitted according to the coding scheme to obtain coded data; and finally, the coded data and the identification of the coding scheme are sent to a receiving device, so that the receiving device selects the corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and then the transmission of the voice data is completed. Also, the techniques may be implemented in associated software or hardware, as described below by way of example.
Example 1
Referring to fig. 1, fig. 1 is a flowchart illustrating a voice data transmission method according to an embodiment of the present invention. As shown in fig. 1, the voice data transmission method may include the following steps:
S101, constructing a neural network prediction model for predicting the network packet loss rate.
In this embodiment, the neural network prediction model is a multi-layer feedforward neural network trained by an error back propagation algorithm. The multi-layer feedforward neural network can train and modify the weight and the threshold value through known training samples.
In this embodiment, the constructed initial neural network model is trained in two stages. The first stage is offline training: the initial neural network model runs predictions on a network without packet loss to obtain first-stage prediction data, the first-stage prediction data is then used to train the initial neural network model, and the trained model serves as the initial prediction model. The second stage is online training: the initial prediction model runs predictions on a network at risk of packet loss to obtain second-stage prediction data, the second-stage prediction data is then used to train the initial prediction model, and the trained initial prediction model is the neural network prediction model.
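The two-stage training above can be illustrated with a minimal sketch of a multi-layer feedforward network trained by error back propagation. The patent does not specify the network shape or the training data, so the single hidden layer, the sigmoid activation, the learning rate, the three-value input window and the sample values below are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer, sigmoid activations, trained by plain backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden          # hidden thresholds (biases)
        self.w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
        self.b2 = 0.0                       # output threshold (bias)

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        # Error signal at the output for the loss 0.5 * (y - target) ** 2.
        d_out = (y - target) * y * (1.0 - y)
        # Propagate the error back to the hidden layer before updating w2.
        d_hid = [d_out * w * hi * (1.0 - hi) for w, hi in zip(self.w2, h)]
        for j, hi in enumerate(h):
            self.w2[j] -= lr * d_out * hi
        self.b2 -= lr * d_out
        for j, row in enumerate(self.w1):
            for i, xi in enumerate(x):
                row[i] -= lr * d_hid[j] * xi
            self.b1[j] -= lr * d_hid[j]


def train(model, samples, epochs, lr=0.5):
    for _ in range(epochs):
        for x, target in samples:
            model.train_step(x, target, lr)


def mse(model, samples):
    return sum((model.predict(x) - t) ** 2 for x, t in samples) / len(samples)


# Hypothetical samples: three past loss rates -> next loss rate.
no_loss_samples = [([0.0, 0.0, 0.0], 0.0)]
lossy_samples = [([0.1, 0.1, 0.1], 0.1),
                 ([0.2, 0.2, 0.2], 0.2),
                 ([0.3, 0.3, 0.3], 0.3)]

model = TinyMLP(n_in=3, n_hidden=4)
error_before = mse(model, lossy_samples)
train(model, no_loss_samples, epochs=200)    # stage 1: offline, no packet loss
train(model, lossy_samples, epochs=2000)     # stage 2: online, lossy network
error_after = mse(model, lossy_samples)
```

In this sketch the same weights are carried from the first stage into the second, which mirrors the text's description of the first-stage result serving as the initial prediction model.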
As an alternative embodiment, after the initial neural network model is constructed, it may be optimized by a particle swarm algorithm. The particle swarm algorithm searches globally for an initial weight and an initial threshold of the initial neural network model, and training by the error back propagation algorithm then performs the local search. Specifically, the weights and thresholds of the initial neural network model are first configured randomly; the weights and thresholds of all neurons in the initial neural network model are then encoded as the dimensions of each particle; a suitable initial weight and initial threshold are then found for the initial neural network model through the particle swarm algorithm; and finally the initial weight and the initial threshold found are set as the weights and thresholds of the initial neural network model, which yields the optimized initial neural network model.
As a further optional implementation manner, after the optimized initial neural network model is obtained, the mean square error of its output may be calculated and checked against a preset requirement. If the requirement is not met, the mean square error is used as the fitness function of the particle swarm, and another suitable initial weight and initial threshold are found for the initial neural network model through the particle swarm algorithm, which yields another optimized initial neural network model. This is repeated until the mean square error of the network output meets the preset requirement, at which point the optimized initial weight and initial threshold, and hence the optimized initial neural network model, are obtained.
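A minimal sketch of the particle swarm search follows, with the mean square error used as the fitness function and every weight and threshold of a small network stored as one particle dimension. The network shape (2 inputs, 2 hidden units, 1 output), the inertia and acceleration coefficients, the position bound and the sample data are assumptions; the patent gives no values for them.

```python
import math
import random


def net_output(params, x):
    """2-input, 2-hidden, 1-output sigmoid network read from a flat vector."""
    def sig(z):
        return 1.0 / (1.0 + math.exp(-z))
    h0 = sig(params[0] * x[0] + params[1] * x[1] + params[4])
    h1 = sig(params[2] * x[0] + params[3] * x[1] + params[5])
    return sig(params[6] * h0 + params[7] * h1 + params[8])


# Hypothetical samples: two past loss rates -> next loss rate.
SAMPLES = [([0.0, 0.0], 0.0), ([0.1, 0.1], 0.1),
           ([0.2, 0.3], 0.3), ([0.3, 0.2], 0.2)]


def fitness(params):
    """Mean square error of the network output over the samples."""
    return sum((net_output(params, x) - t) ** 2 for x, t in SAMPLES) / len(SAMPLES)


def pso(fit, dim, n_particles=20, iters=50, seed=0,
        w=0.7, c1=1.5, c2=1.5, bound=5.0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fit(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp positions so the sigmoid never sees extreme inputs.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            f = fit(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history


# 9 dimensions: 4 hidden weights, 2 hidden thresholds, 2 output weights, 1 output threshold.
initial_params, initial_mse, mse_history = pso(fitness, dim=9)
```

The vector returned in `initial_params` would then seed the network before back propagation training refines it locally; repeating the search until `initial_mse` falls below a chosen requirement corresponds to the iteration described above.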
S102, receiving transmission request information including voice data to be transmitted.
The execution subject of the voice data transmission method provided in this embodiment may be a communication device that transmits voice output over a transmission band, or a communication device with signal processing capability, which is not limited in this embodiment.
S103, obtaining pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through a neural network prediction model according to the historical packet loss data.
In this embodiment, the pre-stored historical packet loss data includes, but is not limited to, historical prediction data obtained when the neural network prediction model is constructed, and a network packet loss rate obtained through the neural network prediction model before the pre-stored historical packet loss data is obtained, which is not limited in this embodiment.
In this embodiment, after the current packet loss rate of the current network is predicted by the neural network prediction model according to the historical packet loss data, the current packet loss rate is added to the historical packet loss data and stored.
In this embodiment, the current packet loss rate may be 8%, 12%, 24%, and the like, which is not limited in this embodiment.
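Step S103 can be sketched as a small wrapper that feeds the most recent history to a model and stores each new prediction back into the history. The class name, the window length and the averaging stand-in used in place of the trained neural network are hypothetical; only the predict-then-append behaviour comes from the text.

```python
class LossRatePredictor:
    """Predicts the current loss rate from stored history, then stores the result."""

    def __init__(self, model, window=3, initial_history=()):
        self.model = model              # any callable: list of floats -> float
        self.window = window
        self.history = list(initial_history)

    def predict_current(self):
        recent = self.history[-self.window:]
        # Pad with zeros when fewer than `window` values have been stored.
        padded = [0.0] * (self.window - len(recent)) + recent
        rate = min(1.0, max(0.0, self.model(padded)))
        self.history.append(rate)       # the prediction becomes part of the history
        return rate


def mean_model(values):
    """Placeholder for the trained neural network prediction model."""
    return sum(values) / len(values)


predictor = LossRatePredictor(mean_model, window=3,
                              initial_history=[0.08, 0.12, 0.24])
current_rate = predictor.predict_current()
```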
And S104, selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library, and coding the voice data to be transmitted according to the coding scheme to obtain coded data.
As an alternative embodiment, the encoding scheme includes a plurality of encoding parameters and an encoding technique corresponding to each encoding parameter. The encoding technique may be a forward error correction technique, an interleaving anti-packet-loss technique, or the like, and this embodiment is not limited thereto.
As an optional implementation manner, when the encoding technique is a forward error correction technique, the corresponding encoding parameter is a redundancy depth value. When the predicted current packet loss rate is low, a lower redundancy depth value may be selected, and when the predicted current packet loss rate is high, a higher redundancy depth value may be selected. A suitable redundancy can thus be selected according to the current communication network condition, which effectively improves bandwidth utilization, improves the adaptability of communication encoding, and further improves voice data transmission performance.
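The trade-off between redundancy depth and bandwidth can be illustrated with a small sketch. The patent does not define how redundancy depth is realised, so this example assumes one common interpretation: each packet carries the current frame plus copies of the previous `depth` frames. The function names and frame contents are hypothetical.

```python
def encode_with_redundancy(frames, depth):
    """Each packet holds (sequence number, payload) for the current frame
    and for up to `depth` preceding frames."""
    packets = []
    for i in range(len(frames)):
        start = max(0, i - depth)
        packets.append([(j, frames[j]) for j in range(start, i + 1)])
    return packets


def decode_with_redundancy(received, n_frames):
    """Rebuild the frame sequence; `None` in `received` marks a lost packet."""
    recovered = [None] * n_frames
    for packet in received:
        if packet is None:
            continue
        for seq, payload in packet:
            if recovered[seq] is None:
                recovered[seq] = payload
    return recovered


frames = [b"f0", b"f1", b"f2", b"f3"]

# Depth 1: losing packet 1 is repaired by the copy carried in packet 2.
packets_d1 = encode_with_redundancy(frames, depth=1)
received_d1 = [packets_d1[0], None, packets_d1[2], packets_d1[3]]
recovered_d1 = decode_with_redundancy(received_d1, len(frames))

# Depth 0: no redundancy, so the same loss leaves a gap.
packets_d0 = encode_with_redundancy(frames, depth=0)
received_d0 = [packets_d0[0], None, packets_d0[2], packets_d0[3]]
recovered_d0 = decode_with_redundancy(received_d0, len(frames))

# Number of frame payloads sent, a rough proxy for bandwidth.
payloads_d0 = sum(len(p) for p in packets_d0)
payloads_d1 = sum(len(p) for p in packets_d1)
```

Under this assumption, a higher depth repairs more losses but sends more payloads, which is why the depth is matched to the predicted loss rate rather than fixed.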
And S105, sending the coded data and the identification of the coding scheme to receiving equipment, so that the receiving equipment selects a corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and further completing the transmission of voice data.
In this embodiment, the receiving device is an electronic device with a communication function, and may specifically be a smart phone (such as an Android phone, an iOS phone, and the like), a tablet computer, a palm computer, a smart watch, a Mobile Internet Device (MID), a PC, and the like, which is not limited in the embodiment of the present invention.
In the voice data transmission method described in fig. 1, a neural network prediction model for predicting a network packet loss rate is first constructed; when receiving transmission request information including voice data to be transmitted, acquiring pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through a neural network prediction model according to the historical packet loss data; then selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library, and further coding the voice data to be transmitted according to the coding scheme to obtain coded data; and finally, the coded data and the identification of the coding scheme are sent to receiving equipment, so that the receiving equipment selects a corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and further the transmission of voice data is completed. It can be seen that by implementing the voice data transmission method described in fig. 1, the current packet loss rate of the current network can be accurately predicted through the neural network prediction model, and meanwhile, the coding scheme is adjusted according to the current packet loss rate of the current network, so that unnecessary redundant information can be reduced, transmission bandwidth is effectively saved on the premise of ensuring communication quality, voice data transmission efficiency is improved, and further voice data transmission performance is improved.
Example 2
Referring to fig. 2, fig. 2 is a flowchart illustrating a voice data transmission method according to an embodiment of the present invention. As shown in fig. 2, the voice data transmission method may include the following steps:
S201, constructing a neural network prediction model for predicting the network packet loss rate.
As an alternative embodiment, the neural network prediction model is a multi-layer feedforward neural network.
As a further optional implementation, constructing a neural network prediction model for predicting a network packet loss rate may include the following steps:
establishing an initial neural network model through an error back propagation algorithm;
and performing error back-propagation training on the initial neural network model through the selected packet loss-free network and the selected packet loss network to obtain a neural network prediction model for predicting the network packet loss rate.
S202, receiving transmission request information including voice data to be transmitted.
As an optional implementation manner, before receiving the transmission request information including the voice data to be transmitted, the method further includes:
after the sending device establishes a communication connection with the receiving device, a coding and decoding corresponding scheme is sent to the receiving device, wherein the coding and decoding corresponding scheme comprises a plurality of coding scheme identifications and a decoding scheme corresponding to each coding scheme identification.
In this embodiment, after receiving the encoded voice data and the identifier of the encoding scheme, the receiving device may select a decoding scheme corresponding to the identifier of the encoding scheme from the encoding and decoding corresponding schemes, and then perform decoding processing on the encoded voice data according to the decoding scheme to obtain decoded voice data, so as to complete communication of the voice data.
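The lookup of a decoding scheme by coding scheme identification can be sketched as follows. The one-byte header, the table layout and the toy repeat codecs are assumptions made for illustration; the patent does not describe a wire format or the codecs themselves.

```python
import struct


def make_repeat_codec(copies):
    """Toy stand-in for a real codec: the encoder repeats the payload."""
    def encode(data):
        return data * copies

    def decode(data):
        return data[: len(data) // copies]

    return encode, decode


# Hypothetical correspondence table shared with the receiver after the
# connection is established: scheme identification -> (encoder, decoder).
CODEC_TABLE = {
    0: make_repeat_codec(1),
    1: make_repeat_codec(2),
    2: make_repeat_codec(3),
}


def pack_message(scheme_id, voice_data):
    """Sender side: encode, then prefix the scheme identification as one byte."""
    encoder, _ = CODEC_TABLE[scheme_id]
    return struct.pack("!B", scheme_id) + encoder(voice_data)


def unpack_message(message):
    """Receiver side: read the identification, pick the decoder, decode."""
    (scheme_id,) = struct.unpack("!B", message[:1])
    if scheme_id not in CODEC_TABLE:
        raise ValueError(f"unknown coding scheme identification: {scheme_id}")
    _, decoder = CODEC_TABLE[scheme_id]
    return decoder(message[1:])


message = pack_message(2, b"hello")
decoded = unpack_message(message)
```

Because the identification travels with every message, the sender can change schemes between messages without a separate negotiation step.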
S203, obtaining pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through a neural network prediction model according to the historical packet loss data.
In this embodiment, the pre-stored historical packet loss data includes historical prediction data obtained by error back propagation training and all network packet loss rates obtained by the neural network prediction model before the pre-stored historical packet loss data is obtained.
S204, judging whether the current packet loss rate is greater than a preset first packet loss threshold value or not, and if so, executing step S205 and step S209 to step S210; if not, step S206 is executed.
S205, selecting, from a pre-stored coding scheme library, a forward error correction coding scheme whose redundancy depth is a preset first depth value as the coding scheme for coding the voice data to be transmitted, and executing steps S209 to S210.
S206, judging whether the current packet loss rate is greater than a preset second packet loss threshold; if so, executing steps S207, S209 and S210; if not, executing steps S208, S209 and S210.
In this embodiment, the preset first packet loss threshold is greater than the preset second packet loss threshold. The preset first packet loss threshold may be 20%, the preset second packet loss threshold may be 10%, and the like, which is not limited in this embodiment.
S207, selecting, from the pre-stored coding scheme library, a forward error correction coding scheme whose redundancy depth is a preset second depth value as the coding scheme, and executing steps S209 to S210.
S208, selecting, from the pre-stored coding scheme library, a forward error correction coding scheme whose redundancy depth is a preset third depth value as the coding scheme, and executing steps S209 to S210.
In this embodiment, the preset first depth value may be 2, the preset second depth value may be 1, the preset third depth value may be 0, and the like, which is not limited in this embodiment.
In this embodiment, by implementing the steps S204 to S208, a coding scheme matched with the current packet loss rate can be selected from a pre-stored coding scheme library.
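The selection logic of steps S204 to S208 can be sketched as a two-threshold comparison. The constants use the example values given in this embodiment (thresholds of 20% and 10%, depths of 2, 1 and 0); the function and constant names are hypothetical.

```python
FIRST_LOSS_THRESHOLD = 0.20   # example value from the embodiment
SECOND_LOSS_THRESHOLD = 0.10  # example value; must be below the first threshold
FIRST_DEPTH, SECOND_DEPTH, THIRD_DEPTH = 2, 1, 0


def select_redundancy_depth(loss_rate: float) -> int:
    """Map a predicted packet loss rate to a redundancy depth (S204 to S208)."""
    if loss_rate > FIRST_LOSS_THRESHOLD:      # S204 -> S205
        return FIRST_DEPTH
    if loss_rate > SECOND_LOSS_THRESHOLD:     # S206 -> S207
        return SECOND_DEPTH
    return THIRD_DEPTH                        # S206 -> S208
```

Both comparisons are strict, so a rate exactly equal to a threshold falls into the lower-redundancy branch, matching the "greater than" wording of the steps.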
S209, encoding the voice data to be transmitted according to the coding scheme to obtain encoded data.
S210, sending the encoded data and the identifier of the coding scheme to a receiving device, so that the receiving device selects the corresponding decoding scheme according to the identifier and decodes the encoded data to obtain decoded data, thereby completing transmission of the voice data.
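To illustrate steps S209 and S210, the sketch below assumes that a redundancy depth of d means each packet carries the current voice frame plus copies of the d preceding frames, and that the depth itself serves as the scheme identifier. Both are assumptions for illustration only; the embodiment does not define the packet layout, and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (scheme identifier, frames ordered oldest to newest); layout is an assumption.
Packet = Tuple[int, List[bytes]]


def encode_with_redundancy(frames: List[bytes], depth: int) -> List[Packet]:
    """Build one packet per frame, each carrying up to `depth` earlier frames."""
    packets: List[Packet] = []
    for seq in range(len(frames)):
        first = max(0, seq - depth)
        packets.append((depth, frames[first:seq + 1]))
    return packets


def recover_frames(received: Dict[int, Packet], count: int) -> List[Optional[bytes]]:
    """Rebuild the frame sequence from the packets that arrived (keyed by sequence number)."""
    frames: List[Optional[bytes]] = [None] * count
    for seq, (_scheme_id, payload) in received.items():
        first = seq - len(payload) + 1  # sequence number of the oldest carried frame
        for offset, frame in enumerate(payload):
            frames[first + offset] = frame
    return frames


frames = [b"f0", b"f1", b"f2", b"f3"]
packets = encode_with_redundancy(frames, depth=1)
# Simulate the loss of packet 1; packet 2 still carries a copy of frame 1.
received = {seq: pkt for seq, pkt in enumerate(packets) if seq != 1}
recovered = recover_frames(received, len(frames))
```

With depth 0 the same loss would leave frame 1 unrecoverable, which is the trade-off between bandwidth and robustness that the threshold selection is meant to balance.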
It can be seen that, by implementing the voice data transmission method described in fig. 2, the current packet loss rate of the current network can be accurately predicted through the neural network prediction model, and the coding scheme is adjusted according to that packet loss rate. Unnecessary redundant information is thereby reduced, transmission bandwidth is saved while communication quality is maintained, and the efficiency and performance of voice data transmission are improved.
Example 3
Referring to fig. 3, fig. 3 is a schematic structural diagram of a voice data transmission apparatus according to an embodiment of the present invention. As shown in fig. 3, the voice data transmission apparatus includes:
A building module 301, configured to build a neural network prediction model for predicting a network packet loss rate.
A receiving module 302, configured to receive transmission request information including voice data to be transmitted.
A prediction module 303, configured to obtain pre-stored historical packet loss data, and predict a current packet loss rate of the current network through the neural network prediction model according to the historical packet loss data.
A scheme selection module 304, configured to select a coding scheme matching the current packet loss rate from a pre-stored coding scheme library.
An encoding module 305, configured to encode the voice data to be transmitted according to the coding scheme to obtain encoded data.
A sending module 306, configured to send the encoded data and the identifier of the coding scheme to the receiving device, so that the receiving device selects the corresponding decoding scheme according to the identifier and decodes the encoded data to obtain decoded data, thereby completing transmission of the voice data.
It can be seen that, by implementing the voice data transmission apparatus described in fig. 3, the current packet loss rate of the current network can be accurately predicted through the neural network prediction model, and the coding scheme is adjusted according to that packet loss rate. Unnecessary redundant information is thereby reduced, transmission bandwidth is saved while communication quality is maintained, and the efficiency and performance of voice data transmission are improved.
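One way to picture how modules 303 to 306 cooperate is a small pipeline object that calls them in order. Everything below is a hypothetical sketch: the class name VoiceSender and the stand-in callables (a moving average instead of the neural network, an identity encoder, a list instead of a network socket) are placeholders chosen only so that the example runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class VoiceSender:
    predict: Callable[[Sequence[float]], float]   # role of prediction module 303
    select_scheme: Callable[[float], int]         # role of scheme selection module 304
    encode: Callable[[bytes, int], bytes]         # role of encoding module 305
    send: Callable[[int, bytes], None]            # role of sending module 306

    def handle_request(self, voice: bytes, history: Sequence[float]) -> int:
        """Predict the loss rate, pick a scheme, encode, and send with the scheme id."""
        loss_rate = self.predict(history)
        scheme_id = self.select_scheme(loss_rate)
        self.send(scheme_id, self.encode(voice, scheme_id))
        return scheme_id


sent: List[Tuple[int, bytes]] = []

sender = VoiceSender(
    predict=lambda history: sum(history) / len(history),        # stand-in, not a neural network
    select_scheme=lambda rate: 2 if rate > 0.20 else 1 if rate > 0.10 else 0,
    encode=lambda voice, scheme_id: voice,                      # stand-in: no real encoding
    send=lambda scheme_id, data: sent.append((scheme_id, data)),
)

chosen = sender.handle_request(b"hello", history=[0.12, 0.18])
```

Passing the modules in as callables keeps each one replaceable, which mirrors the statement that the modules may exist separately or be integrated.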
Example 4
Referring to fig. 4, fig. 4 is a schematic structural diagram of a voice data transmission apparatus according to a fourth embodiment of the present invention. The voice data transmission apparatus shown in fig. 4 is an optimized version of the voice data transmission apparatus shown in fig. 3. As shown in fig. 4, the building module 301 includes:
A first sub-module 3011, configured to establish an initial neural network model through an error back-propagation algorithm.
In this embodiment, the neural network prediction model is a multi-layer feedforward neural network.
A second sub-module 3012, configured to perform error back-propagation training on the initial neural network model using a selected network without packet loss and a selected network with packet loss, so as to obtain the neural network prediction model for predicting a network packet loss rate.
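A minimal sketch of error back-propagation training for a multi-layer feedforward network follows. The network size, learning rate, epoch count, and the synthetic samples standing in for measurements from a network without packet loss and a network with packet loss are all assumptions; none of them are taken from the embodiment.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (recent loss rates, next loss rate)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """One hidden layer, sigmoid activations, trained by plain back-propagation."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x: Sequence[float]) -> Tuple[List[float], float]:
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def train_step(self, x: Sequence[float], target: float, lr: float) -> None:
        hidden, out = self.forward(x)
        # Output error term for the squared-error loss 0.5 * (out - target) ** 2.
        delta_out = (out - target) * out * (1.0 - out)
        for j, h in enumerate(hidden):
            # Propagate the error back through w2[j] before updating it.
            delta_h = delta_out * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * delta_out * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
            self.b1[j] -= lr * delta_h
        self.b2 -= lr * delta_out


def mse(model: TinyMLP, samples: Sequence[Sample]) -> float:
    return sum((model.forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)


# Synthetic stand-ins for data from a lossless network and a lossy network.
LOSSLESS: List[Sample] = [([0.0, 0.0, 0.0], 0.0)]
LOSSY: List[Sample] = [([0.10, 0.20, 0.30], 0.30),
                       ([0.30, 0.30, 0.20], 0.25),
                       ([0.05, 0.10, 0.10], 0.10)]
samples = LOSSLESS + LOSSY

model = TinyMLP(n_in=3, n_hidden=4)
error_before = mse(model, samples)
for _ in range(2000):
    for x, target in samples:
        model.train_step(x, target, lr=0.5)
error_after = mse(model, samples)
```

Comparing error_before with error_after gives a quick check that training reduced the fitting error on these samples; it says nothing about accuracy on a real network.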
As an alternative embodiment, the scheme selection module 304 includes:
A third sub-module 3041, configured to determine whether the current packet loss rate is greater than a preset first packet loss threshold.
A fourth sub-module 3042, configured to select, when the third sub-module 3041 determines that the current packet loss rate is greater than the preset first packet loss threshold, a forward error correction coding scheme whose redundancy depth is a preset first depth value from a pre-stored coding scheme library as the coding scheme; and, when the third sub-module 3041 determines that the current packet loss rate is not greater than the preset first packet loss threshold, to determine whether the current packet loss rate is greater than a preset second packet loss threshold, where the preset first packet loss threshold is greater than the preset second packet loss threshold.
A fifth sub-module 3043, configured to select, when the fourth sub-module 3042 determines that the current packet loss rate is greater than the preset second packet loss threshold, a forward error correction coding scheme whose redundancy depth is a preset second depth value from the pre-stored coding scheme library as the coding scheme; and, when the fourth sub-module 3042 determines that the current packet loss rate is not greater than the preset second packet loss threshold, to select a forward error correction coding scheme whose redundancy depth is a preset third depth value from the pre-stored coding scheme library as the coding scheme.
It can be seen that, by implementing the voice data transmission apparatus described in fig. 4, the current packet loss rate of the current network can be accurately predicted through the neural network prediction model, and the coding scheme is adjusted according to that packet loss rate. Unnecessary redundant information is thereby reduced, transmission bandwidth is saved while communication quality is maintained, and the efficiency and performance of voice data transmission are improved.
In addition, the invention also provides computer equipment. The computer device comprises a memory and a processor, wherein the memory can be used for storing a computer program, and the processor can make the computer device execute the functions of the method or each module in the voice data transmission device by operating the computer program.
The memory may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, a phonebook, etc.) created according to the use of the mobile terminal, and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The embodiment also provides a computer storage medium for storing a computer program used in the computer device.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, each functional module or unit in each embodiment of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention or a part of the technical solution that contributes to the prior art in essence can be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a smart phone, a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A method for voice data transmission, comprising:
constructing a neural network prediction model for predicting the network packet loss rate;
receiving transmission request information including voice data to be transmitted;
obtaining pre-stored historical packet loss data, and predicting the current packet loss rate of the current network through the neural network prediction model according to the historical packet loss data; the pre-stored historical packet loss data comprises historical prediction data obtained by error back-propagation training and all network packet loss rates obtained by the neural network prediction model before the pre-stored historical packet loss data is obtained;
selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library, and coding the voice data to be transmitted according to the coding scheme to obtain coded data;
and sending the coded data and the identification of the coding scheme to receiving equipment, so that the receiving equipment selects a corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and further finishing the transmission of voice data.
2. The method according to claim 1, wherein the neural network prediction model is a multi-layer feedforward neural network, and the constructing the neural network prediction model for predicting the network packet loss rate comprises:
establishing an initial neural network model through an error back propagation algorithm;
and performing error back-propagation training on the initial neural network model through the selected non-lost packet network and the selected lost packet network to obtain a neural network prediction model for predicting the network packet loss rate.
3. The voice data transmission method according to claim 1, wherein before the receiving transmission request information including voice data to be transmitted, the method further comprises:
after a sending device establishes communication connection with a receiving device, sending a coding and decoding corresponding scheme to the receiving device, wherein the coding and decoding corresponding scheme comprises a plurality of coding scheme identifications and a decoding scheme corresponding to each coding scheme identification.
4. The method according to claim 1, wherein the selecting the coding scheme matching the current packet loss rate from a pre-stored coding scheme library comprises:
judging whether the current packet loss rate is greater than a preset first packet loss threshold value or not;
if the current packet loss rate is judged to be greater than the first packet loss threshold value, selecting a forward error correction coding scheme with the redundancy depth as a preset first depth value from a pre-stored coding scheme library as a coding scheme;
if the current packet loss rate is judged to be not greater than the first packet loss threshold, judging whether the current packet loss rate is greater than a preset second packet loss threshold, wherein the first packet loss threshold is greater than the second packet loss threshold;
if the current packet loss rate is judged to be greater than the second packet loss threshold value, selecting a forward error correction coding scheme with the redundancy depth as a preset second depth value from the pre-stored coding scheme library as a coding scheme;
and if the current packet loss rate is judged to be not greater than the second packet loss threshold value, selecting a forward error correction coding scheme with the redundancy depth as a preset third depth value from the pre-stored coding scheme library as the coding scheme.
5. A voice data transmission apparatus, comprising:
the building module is used for building a neural network prediction model for predicting the network packet loss rate;
the receiving module is used for receiving transmission request information comprising voice data to be transmitted;
the prediction module is used for acquiring pre-stored historical packet loss data and predicting the current packet loss rate of the current network through the neural network prediction model according to the historical packet loss data; the pre-stored historical packet loss data comprises historical prediction data obtained by error back-propagation training and all network packet loss rates obtained by the neural network prediction model before the pre-stored historical packet loss data is obtained;
the scheme selection module is used for selecting a coding scheme matched with the current packet loss rate from a pre-stored coding scheme library;
the coding module is used for coding the voice data to be transmitted according to the coding scheme to obtain coded data;
and the sending module is used for sending the coded data and the identification of the coding scheme to receiving equipment so that the receiving equipment selects a corresponding decoding scheme according to the identification of the coding scheme to decode the coded data to obtain decoded data, and further the transmission of voice data is completed.
6. The apparatus for transmitting speech data according to claim 5, wherein the neural network prediction model is a multi-layer feedforward neural network, and the building module comprises:
the first submodule is used for establishing an initial neural network model through an error back propagation algorithm;
and the second sub-module is used for carrying out error back-propagation training on the initial neural network model through the selected packet loss-free network and the selected packet loss network to obtain a neural network prediction model for predicting the network packet loss rate.
7. The voice data transmission apparatus according to claim 5, wherein the scheme selection module comprises:
a third sub-module, configured to determine whether the current packet loss rate is greater than a preset first packet loss threshold;
a fourth sub-module, configured to select, when the third sub-module determines that the current packet loss rate is greater than the preset first packet loss threshold, a forward error correction coding scheme with a redundancy depth being a preset first depth value from a pre-stored coding scheme library as a coding scheme; when the third sub-module judges that the current packet loss rate is not greater than the preset first packet loss threshold, judging whether the current packet loss rate is greater than a preset second packet loss threshold, wherein the preset first packet loss threshold is greater than the preset second packet loss threshold;
a fifth sub-module, configured to select, when the fourth sub-module determines that the current packet loss rate is greater than the preset second packet loss threshold, a forward error correction coding scheme with a redundancy depth being a preset second depth value from the pre-stored coding scheme library as a coding scheme; and when the fourth sub-module judges that the current packet loss rate is not greater than the preset second packet loss threshold, selecting a forward error correction coding scheme with the redundancy depth as a preset third depth value from the pre-stored coding scheme library as the coding scheme.
8. A computer device, characterized in that it comprises a memory for storing a computer program and a processor for executing the computer program to make the computer device execute the voice data transmission method according to any one of claims 1 to 4.
9. A computer-readable storage medium, characterized in that it stores the computer program used in the computer device of claim 8.
CN201810981340.5A 2018-08-27 2018-08-27 Voice data transmission method and device Active CN109218083B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810981340.5A CN109218083B (en) 2018-08-27 2018-08-27 Voice data transmission method and device

Publications (2)

Publication Number Publication Date
CN109218083A CN109218083A (en) 2019-01-15
CN109218083B CN109218083B (en) 2021-08-13

Family

ID=64989261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810981340.5A Active CN109218083B (en) 2018-08-27 2018-08-27 Voice data transmission method and device

Country Status (1)

Country Link
CN (1) CN109218083B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101562910A (en) * 2008-04-18 2009-10-21 中国移动通信集团公司 Voice data transmission method as well as system and media gateway
CN102413378A (en) * 2011-11-02 2012-04-11 杭州电子科技大学 Adaptive neural network-based lost packet recovery method in video transmission
CN102571292A (en) * 2012-02-29 2012-07-11 浙江中控研究院有限公司 Communication maintenance scheme matching method, link monitor and wireless sensor network
CN102984495A (en) * 2012-12-06 2013-03-20 北京小米科技有限责任公司 Video image processing method and device
CN103684695A (en) * 2013-12-24 2014-03-26 北京新讯世纪信息技术有限公司 Method and system for data transmission
CN106937134A (en) * 2015-12-31 2017-07-07 深圳市潮流网络技术有限公司 A kind of coding method of data transfer, coding dispensing device and system

Also Published As

Publication number Publication date
CN109218083A (en) 2019-01-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right
Effective date of registration: 20210329
Address after: 510000 Room 202, 59 Jianzhong Road, Tianhe District, Guangzhou City, Guangdong Province
Applicant after: Guangzhou lieyou Information Technology Co.,Ltd.
Address before: 510000 Room 202, west block, 59 Jianzhong Road, software park, Tianhe District, Guangzhou City, Guangdong Province
Applicant before: GUANGZHOU AIPAI NETWORK TECHNOLOGY Co.,Ltd.
GR01 Patent grant