CN113744744A - Audio coding method and device, electronic equipment and storage medium - Google Patents

Audio coding method and device, electronic equipment and storage medium

Info

Publication number
CN113744744A
Authority
CN
China
Prior art keywords
audio frame
frame
redundancy
code stream
target audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110890333.6A
Other languages
Chinese (zh)
Other versions
CN113744744B (en)
Inventor
崔承宗
阮良
陈功
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Netease Zhiqi Technology Co Ltd
Original Assignee
Hangzhou Netease Zhiqi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Netease Zhiqi Technology Co Ltd
Priority to CN202110890333.6A
Publication of CN113744744A
Application granted
Publication of CN113744744B
Legal status: Active (granted)

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Abstract

The present disclosure provides an audio encoding method and apparatus, an electronic device, and a storage medium. The method includes: obtaining a target audio frame and coding redundancy information that includes at least first redundancy indication information; performing first encoding processing on the target audio frame to obtain regular code stream data of the target audio frame; and, in response to the first redundancy indication information indicating that a first audio frame is a redundant frame of the target audio frame, performing source redundancy processing on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain and output encoded data of the target audio frame. The first audio frame has already been encoded and is separated from the target audio frame by at least one audio frame; its redundant code stream data is obtained by performing second encoding processing on the first audio frame, where the degree of audio compression of the second encoding processing is higher than that of the first encoding processing. The redundant frame of the target audio frame can therefore be selected flexibly, which helps improve the flexibility of audio encoding.

Description

Audio coding method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of audio processing technologies, and in particular, to an audio encoding method and apparatus, an electronic device, and a storage medium.
Background
WebRTC is an open-source framework for real-time audio and video that includes multiple audio codecs; Opus is one of the most commonly used among them.
Opus offers a wide dynamic bitrate range, variable coding frame lengths, adjustable complexity, good resistance to network packet loss, coverage of all frequency bands from narrowband to fullband, and suitability for both speech and music. Its encoding flexibility, however, still leaves room for improvement.
Disclosure of Invention
Embodiments of the present disclosure provide an audio encoding method and apparatus, an electronic device, and a storage medium, which improve the flexibility of audio encoding.
In a first aspect, an embodiment of the present disclosure provides an audio encoding method, including:
acquiring a target audio frame and coding redundancy information, wherein the coding redundancy information includes at least first redundancy indication information, the first redundancy indication information is used to indicate whether a first audio frame is a redundant frame of the target audio frame, and the first audio frame has been encoded and is separated from the target audio frame by at least one audio frame;
performing first encoding processing on the target audio frame to obtain regular code stream data of the target audio frame;
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, performing source redundancy processing on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain encoded data of the target audio frame, wherein the redundant code stream data of the first audio frame is obtained by performing second encoding processing on the first audio frame, and the degree of audio compression of the second encoding processing is higher than that of the first encoding processing;
and outputting the encoded data of the target audio frame.
In some possible embodiments, the first audio frame is separated from the target audio frame by one audio frame.
In some possible embodiments, the coding redundancy information further includes second redundancy indication information used to indicate whether a second audio frame is a redundant frame of the target audio frame, the second audio frame being located between the first audio frame and the target audio frame, and the method further includes:
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame and the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, performing source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame, wherein the redundant code stream data of the second audio frame is obtained by performing second encoding processing on the second audio frame.
In some possible embodiments, performing source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame includes:
performing source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame by using a range encoding algorithm to obtain the encoded data of the target audio frame.
In some possible embodiments, the method further comprises:
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, or the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, performing second encoding processing on the target audio frame to obtain redundant code stream data of the target audio frame;
and storing the redundant code stream data of the target audio frame.
In a second aspect, an embodiment of the present disclosure provides an audio encoding apparatus, including:
an obtaining module, configured to obtain a target audio frame and coding redundancy information, where the coding redundancy information includes at least first redundancy indication information used to indicate whether a first audio frame is a redundant frame of the target audio frame, the first audio frame having been encoded and being separated from the target audio frame by at least one audio frame;
an encoding module, configured to perform first encoding processing on the target audio frame to obtain regular code stream data of the target audio frame;
a redundancy processing module, configured to, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, perform source redundancy processing on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain encoded data of the target audio frame, where the redundant code stream data of the first audio frame is obtained by performing second encoding processing on the first audio frame, and the degree of audio compression of the second encoding processing is higher than that of the first encoding processing;
and an output module, configured to output the encoded data of the target audio frame.
In some possible embodiments, the first audio frame is separated from the target audio frame by one audio frame.
In some possible embodiments, the coding redundancy information further includes second redundancy indication information used to indicate whether a second audio frame is a redundant frame of the target audio frame, the second audio frame being located between the first audio frame and the target audio frame, and the redundancy processing module is further configured to:
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame and the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, perform source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame, where the redundant code stream data of the second audio frame is obtained by performing second encoding processing on the second audio frame.
In some possible embodiments, the redundancy processing module is specifically configured to:
perform source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame by using a range encoding algorithm to obtain the encoded data of the target audio frame.
In some possible embodiments, the apparatus further includes a storage module:
the encoding module is further configured to, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, or the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, perform second encoding processing on the target audio frame to obtain redundant code stream data of the target audio frame;
the storage module is used for storing the redundant code stream data of the target audio frame.
In a third aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor, and a memory communicatively coupled to the at least one processor, wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the audio encoding method described above.
In a fourth aspect, embodiments of the present disclosure provide a storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the audio encoding method described above.
In the embodiments of the present disclosure, a target audio frame and coding redundancy information including at least first redundancy indication information are obtained; first encoding processing is performed on the target audio frame to obtain its regular code stream data; and, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, source redundancy processing is performed on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain and output the encoded data of the target audio frame. The first audio frame has already been encoded and is separated from the target audio frame by at least one audio frame; its redundant code stream data is obtained by performing second encoding processing on the first audio frame, where the degree of audio compression of the second encoding processing is higher than that of the first encoding processing. Thus, when the target audio frame is encoded, an already-encoded first audio frame separated from it by at least one audio frame can serve as its redundant frame, so the redundant frame can be selected flexibly, which helps improve the flexibility of audio encoding. In addition, the redundant code stream data of the first audio frame that undergoes source redundancy processing is more compressed than the regular code stream data of the target audio frame, which also helps save bitrate.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, are included to provide a further understanding of the disclosure; they illustrate embodiments of the disclosure and, together with the description, serve to explain the disclosure rather than limit it. In the drawings:
Fig. 1 is a flowchart of an audio encoding method provided by an embodiment of the present disclosure;
Fig. 2 is a flowchart of another audio encoding method provided by an embodiment of the present disclosure;
Fig. 3 is a schematic diagram of the organization of the SILK layer of one frame of an Opus code stream according to an embodiment of the present disclosure;
Fig. 4 is a schematic diagram of an audio encoding process provided by an embodiment of the present disclosure;
Fig. 5 is a schematic diagram of the compression result of the code stream data related to each audio frame according to an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of an audio encoding apparatus according to an embodiment of the present disclosure;
Fig. 7 is a schematic diagram of the hardware structure of an electronic device for implementing an audio encoding method according to an embodiment of the present disclosure.
Detailed Description
In order to improve the flexibility of audio coding, the embodiments of the present disclosure provide an audio coding method and apparatus, an electronic device, and a storage medium.
Preferred embodiments of the present disclosure are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described herein are merely intended to illustrate and explain the disclosure, not to limit it, and that the embodiments and the features of the embodiments may be combined with each other where no conflict arises.
To facilitate understanding, the technical terms involved in the present disclosure are introduced first:
opus, a lossy vocoding format, was developed by xiph. org foundation and then standardized by the Internet Engineering Task Force (IETF), the standard format being defined in the RFC 6716 protocol.
SILK, an audio codec used within Opus.
Range encoding, an entropy coding algorithm that can effectively improve the data compression ratio.
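For illustration, the following minimal sketch shows the interval-narrowing idea behind range encoding: each symbol narrows the current interval in proportion to its probability, so likely symbols cost few bits. The three-symbol model and all names are assumptions made for this example; the production range coder inside Opus additionally renormalizes and emits bytes incrementally.

```c
#include <stdio.h>
#include <stdint.h>

/* Toy cumulative-frequency model for a 3-symbol alphabet (assumed values):
 * symbol s occupies [cum[s], cum[s+1]) out of a total of cum[3] = 8,
 * i.e. P(0) = 5/8, P(1) = 2/8, P(2) = 1/8. */
static const uint64_t cum[4] = {0, 5, 7, 8};

int main(void) {
    const int msg[] = {0, 0, 1, 0, 2};
    uint64_t low = 0, range = 1ULL << 32;  /* working interval [low, low + range) */

    for (size_t i = 0; i < sizeof msg / sizeof msg[0]; i++) {
        int s = msg[i];
        uint64_t unit = range / cum[3];        /* width of one frequency unit  */
        low += unit * cum[s];                  /* move to the symbol's slice   */
        range = unit * (cum[s + 1] - cum[s]);  /* narrow to the symbol's share */
        printf("symbol %d -> low=%llu range=%llu\n",
               s, (unsigned long long)low, (unsigned long long)range);
    }
    /* Any number inside the final interval identifies the whole message. */
    printf("final interval: [%llu, %llu)\n",
           (unsigned long long)low, (unsigned long long)(low + range));
    return 0;
}
```

Because frequent symbols shrink the interval less, they consume less of the code space, which is why range encoding improves the compression ratio over fixed-length coding.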
Fig. 1 is a flowchart of an audio encoding method provided in an embodiment of the present disclosure, where the method includes the following steps.
In step S101, a target audio frame and coding redundancy information including at least first redundancy indication information are acquired.
The first redundancy indication information is used to indicate whether the first audio frame is a redundant frame of the target audio frame; the first audio frame has been encoded and is separated from the target audio frame by at least one audio frame.
In implementation, the first redundancy indication information may be a flag bit A. Taking values of 0 and 1 as an example: when flag bit A is 0, it indicates that the first audio frame is not a redundant frame of the target audio frame; when flag bit A is 1, it indicates that the first audio frame is a redundant frame of the target audio frame. Likewise, taking values of FALSE and TRUE: when flag bit A is FALSE, the first audio frame is not a redundant frame of the target audio frame; when flag bit A is TRUE, it is.
Assuming that the target audio frame is the T-th audio frame in the audio frame sequence, the first audio frame may be the (T-2)-th, (T-3)-th, or (T-4)-th audio frame, and so on.
In addition, considering that the farther an audio frame is from the target audio frame, the less necessary its redundancy becomes, the first audio frame may be separated from the target audio frame by exactly one audio frame; that is, when the target audio frame is the T-th audio frame in the sequence, the first audio frame may be the (T-2)-th audio frame. In this way, the target audio frame can be encoded flexibly while the redundant frame remains as useful as possible.
In step S102, a first encoding process is performed on the target audio frame to obtain regular code stream data of the target audio frame.
In step S103, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, source redundancy processing is performed on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain the encoded data of the target audio frame.
The redundant code stream data of the first audio frame is obtained by performing second encoding processing on the first audio frame, and the degree of audio compression of the second encoding processing is higher than that of the first encoding processing.
Following the above example, the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame means that flag bit A takes the value TRUE or 1.
In addition, for the same audio frame, its regular code stream data is obtained by performing first encoding processing on it, while its redundant code stream data is obtained by performing second encoding processing on it, and the degree of audio compression of the second encoding processing is higher. In other words, the data amount of the regular code stream data is greater than that of the redundant code stream data, so the redundant code stream data amounts to a more compact, lower-quality version of the regular code stream data; a code sketch of these two encoding passes is given after the summary of this embodiment.
In step S104, the encoded data of the target audio frame is output.
In the embodiments of the present disclosure, when the target audio frame is encoded, an already-encoded first audio frame separated from it by at least one audio frame can serve as its redundant frame, so the redundant frame can be selected flexibly, which helps improve the flexibility of audio encoding. Moreover, the redundant code stream data of the first audio frame that undergoes source redundancy processing is more compressed than the regular code stream data of the target audio frame, which also helps save bitrate.
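As a concrete sketch of the two encoding passes, the snippet below uses the public Opus C API to encode the same 20 ms frame at two bitrates, a regular one and a more aggressively compressed one standing in for the redundant copy. The bitrates (32 kb/s and 12 kb/s) are assumed values chosen only so that the second pass is more compressed than the first; in the scheme described here, the second encoding processing happens inside the codec rather than through a second API call.

```c
#include <opus/opus.h>   /* link with -lopus */
#include <stdio.h>

int main(void) {
    int err;
    OpusEncoder *enc = opus_encoder_create(16000, 1, OPUS_APPLICATION_VOIP, &err);
    if (err != OPUS_OK) return 1;

    opus_int16 pcm[320] = {0};  /* one 20 ms mono frame at 16 kHz (silence here) */
    unsigned char regular[1275], redundant[1275];

    /* First encoding processing: regular code stream data. */
    opus_encoder_ctl(enc, OPUS_SET_BITRATE(32000));
    opus_int32 reg_len = opus_encode(enc, pcm, 320, regular, sizeof regular);

    /* Second encoding processing: a more compressed, redundant code stream. */
    opus_encoder_ctl(enc, OPUS_SET_BITRATE(12000));
    opus_int32 red_len = opus_encode(enc, pcm, 320, redundant, sizeof redundant);

    if (reg_len < 0 || red_len < 0) { opus_encoder_destroy(enc); return 1; }
    printf("regular: %d bytes, redundant: %d bytes\n", (int)reg_len, (int)red_len);
    opus_encoder_destroy(enc);
    return 0;
}
```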
Fig. 2 is a flowchart of another audio encoding method provided by the embodiment of the present disclosure, which includes the following steps.
In step S201, a target audio frame and coding redundancy information including first redundancy indication information and second redundancy indication information are acquired.
The first redundancy indication information is used to indicate whether the first audio frame is a redundant frame of the target audio frame; the first audio frame has been encoded and is separated from the target audio frame by at least one audio frame. The second redundancy indication information is used to indicate whether the second audio frame is a redundant frame of the target audio frame; the second audio frame is located between the first audio frame and the target audio frame.
The representation of the first redundancy indication information is the same as described for step S101 and is not repeated here. Similarly, the second redundancy indication information may be a flag bit B. Taking values of 0 and 1 as an example: when flag bit B is 0, the second audio frame is not a redundant frame of the target audio frame; when flag bit B is 1, it is. Taking values of FALSE and TRUE: when flag bit B is FALSE, the second audio frame is not a redundant frame of the target audio frame; when flag bit B is TRUE, it is.
Specifically, when the first audio frame is separated from the target audio frame by more than one audio frame, the second audio frame may be any audio frame located between the first audio frame and the target audio frame; when the first audio frame is separated from the target audio frame by exactly one audio frame, the second audio frame is the audio frame between them.
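A minimal sketch of how this coding redundancy information might be represented follows. The type and field names are assumptions for illustration; the embodiments only require that each flag indicate whether the corresponding earlier frame is a redundant frame of the target frame. Following the example above, the first audio frame is frame T-2 and the second audio frame is frame T-1.

```c
#include <stdbool.h>

/* Coding redundancy information for the target (T-th) audio frame.
 * Names are illustrative assumptions, not the actual data structures. */
typedef struct {
    bool first_redundancy_indication;  /* flag bit A: is the first audio frame
                                          (e.g. frame T-2) a redundant frame of frame T? */
    bool second_redundancy_indication; /* flag bit B: is the second audio frame
                                          (e.g. frame T-1) a redundant frame of frame T? */
} CodingRedundancyInfo;
```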
In step S202, a first encoding process is performed on the target audio frame to obtain regular code stream data of the target audio frame.
In step S203, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame and the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, source redundancy processing is performed on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame.
The redundant code stream data of the first audio frame is obtained by performing second encoding processing on the first audio frame, the redundant code stream data of the second audio frame is obtained by performing second encoding processing on the second audio frame, and the degree of audio compression of the second encoding processing is higher than that of the first encoding processing.
Following the above example, the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame means that flag bit A takes the value TRUE or 1, and the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame means that flag bit B takes the value TRUE or 1.
In a specific implementation, a range encoding algorithm can be used to perform source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame. Range encoding compresses data well, so it can improve the data compression ratio and thereby the audio coding efficiency.
In step S204, the encoded data of the target audio frame is output.
In the embodiments of the present disclosure, when the target audio frame is encoded, in-band redundancy can be carried both for an already-encoded first audio frame separated from the target audio frame by at least one audio frame and for a second audio frame located between them, and a range encoding algorithm can be used to perform source redundancy processing on the corresponding code stream data of the three audio frames; this provides both good encoding flexibility and good data compression performance. A byte-level sketch of this combination follows.
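For illustration only: stock Opus interleaves LBRR data through its internal range coder rather than with explicit flags and length prefixes, so the framing below (a one-byte flag field and two-byte big-endian lengths) is an assumption made purely to show the idea of carrying redundant copies of earlier frames inside the current packet.

```c
#include <stdint.h>
#include <string.h>

/* Combine the regular code stream of frame T with the redundant code streams
 * of frames T-1 and T-2 (either may be NULL) into one payload.
 * Returns the number of bytes written, or -1 if the output buffer is too small. */
static int combine_payloads(uint8_t *out, size_t out_cap,
                            const uint8_t *regular, uint16_t regular_len,
                            const uint8_t *lbrr_t1, uint16_t lbrr_t1_len,
                            const uint8_t *lbrr_t2, uint16_t lbrr_t2_len) {
    size_t need = 1 + regular_len
                + (lbrr_t1 ? 2u + lbrr_t1_len : 0u)
                + (lbrr_t2 ? 2u + lbrr_t2_len : 0u);
    if (need > out_cap) return -1;

    size_t pos = 0;
    out[pos++] = (uint8_t)((lbrr_t1 ? 1 : 0) | (lbrr_t2 ? 2 : 0)); /* LBRR flags */
    if (lbrr_t1) {
        out[pos++] = (uint8_t)(lbrr_t1_len >> 8);
        out[pos++] = (uint8_t)(lbrr_t1_len & 0xff);
        memcpy(out + pos, lbrr_t1, lbrr_t1_len); pos += lbrr_t1_len;
    }
    if (lbrr_t2) {
        out[pos++] = (uint8_t)(lbrr_t2_len >> 8);
        out[pos++] = (uint8_t)(lbrr_t2_len & 0xff);
        memcpy(out + pos, lbrr_t2, lbrr_t2_len); pos += lbrr_t2_len;
    }
    memcpy(out + pos, regular, regular_len); /* regular stream goes last */
    return (int)(pos + regular_len);
}
```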
In addition, in any of the above embodiments, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, or the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, second encoding processing may be performed on the target audio frame to obtain its redundant code stream data, and that redundant code stream data may be stored. In this way, when the target audio frame later serves as a redundant frame of another audio frame, its redundant code stream data can be fetched directly and combined, through source redundancy processing, with the regular code stream data of the corresponding audio frame, improving the encoding speed as much as possible.
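A sketch of that caching idea follows: whenever frame T may later serve as a redundant frame, its second-pass (redundant) code stream is stored so that it can be fetched directly when the later frame is encoded. The fixed depth, the ring-buffer layout, and all names are assumptions for illustration.

```c
#include <stdint.h>
#include <string.h>

#define LBRR_CACHE_DEPTH 4     /* keep the last few frames (assumed depth)   */
#define LBRR_MAX_BYTES   1275  /* assumed upper bound on one frame's payload */

typedef struct {
    int      frame_index[LBRR_CACHE_DEPTH]; /* which frame occupies each slot */
    uint16_t len[LBRR_CACHE_DEPTH];
    uint8_t  data[LBRR_CACHE_DEPTH][LBRR_MAX_BYTES];
} LbrrCache;

static void lbrr_cache_init(LbrrCache *c) {
    for (int i = 0; i < LBRR_CACHE_DEPTH; i++) c->frame_index[i] = -1;
}

/* Store the redundant code stream of frame t (caller keeps n <= LBRR_MAX_BYTES). */
static void lbrr_cache_put(LbrrCache *c, int t, const uint8_t *buf, uint16_t n) {
    int slot = t % LBRR_CACHE_DEPTH;
    c->frame_index[slot] = t;
    c->len[slot] = n;
    memcpy(c->data[slot], buf, n);
}

/* Fetch the redundant code stream of frame t, or NULL if it has been evicted. */
static const uint8_t *lbrr_cache_get(const LbrrCache *c, int t, uint16_t *n) {
    int slot = t % LBRR_CACHE_DEPTH;
    if (t < 0 || c->frame_index[slot] != t) return NULL;
    *n = c->len[slot];
    return c->data[slot];
}
```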
The following describes the embodiments of the present disclosure by taking Opus as an example.
Taking monaural coding as an example, Fig. 3 is a schematic diagram of the organization of the SILK layer of one frame of an Opus code stream according to an embodiment of the present disclosure. It includes: VAD flags, which indicate whether the target audio frame lies at a boundary of the audio frame sequence; an LBRR(T-1) flag, which indicates whether the (T-1)-th audio frame (i.e., the second audio frame) is a redundant frame of the T-th audio frame (i.e., the target audio frame); an LBRR(T-2) flag, which indicates whether the (T-2)-th audio frame (i.e., the first audio frame) is a redundant frame of the T-th audio frame; the LBRR(T-1) frame; the LBRR(T-2) frame; and the regular SILK frame. The coding frame lengths of the LBRR(T-1) frame, the LBRR(T-2) frame, and the regular SILK frame may each be 20 ms.
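Conceptually, the SILK-layer organization of Fig. 3 can be pictured as the structure below. Stock Opus writes these fields through its range coder rather than as plain bytes, so this struct is an illustration of the layout, not the wire format; all names and types are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Conceptual view of one packet's SILK layer under the scheme of Fig. 3. */
typedef struct {
    uint8_t        vad_flags;      /* VAD flags for the target audio frame       */
    uint8_t        lbrr_t1_flag;   /* LBRR(T-1): is frame T-1 a redundant frame? */
    uint8_t        lbrr_t2_flag;   /* LBRR(T-2): is frame T-2 a redundant frame? */
    const uint8_t *lbrr_t1_frame;  /* redundant code stream of frame T-1 (20 ms) */
    size_t         lbrr_t1_len;
    const uint8_t *lbrr_t2_frame;  /* redundant code stream of frame T-2 (20 ms) */
    size_t         lbrr_t2_len;
    const uint8_t *regular_frame;  /* regular SILK frame for frame T (20 ms)     */
    size_t         regular_len;
} SilkLayerView;
```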
Fig. 4 is a schematic diagram of an audio encoding process provided by an embodiment of the present disclosure. A sequence of audio frames to be sent to a client may be encoded according to the process shown in Fig. 4, which includes the following steps:
in step S401, a sequence of audio frames is acquired.
In step S402, it is determined whether the LBRR(T-2) flag is TRUE; if not, proceed to step S403; if so, proceed to step S409.
When the LBRR(T-2) flag is TRUE, the (T-2)-th audio frame is a redundant frame of the T-th audio frame; when the LBRR(T-2) flag is not TRUE, i.e., FALSE, it is not.
In step S403, it is determined whether the LBRR(T-1) flag is TRUE; if not, proceed to step S404; if so, proceed to step S406.
When the LBRR(T-1) flag is TRUE, the (T-1)-th audio frame is a redundant frame of the T-th audio frame; when the LBRR(T-1) flag is not TRUE, i.e., FALSE, it is not.
In step S404, the T-th audio frame is encoded to obtain its regular code stream data.
In step S405, the regular code stream data of the T-th audio frame is taken as the encoded data of the T-th audio frame.
In step S406, the T-th audio frame is encoded to obtain both its redundant code stream data and its regular code stream data.
In step S407, it is determined whether the T-th audio frame is the first frame of the audio frame sequence; if so, proceed to step S405; if not, proceed to step S408.
In step S408, source redundancy processing is performed on the regular code stream data of the T-th audio frame and the redundant code stream data of the (T-1)-th audio frame to obtain the encoded data of the T-th audio frame.
In step S409, it is determined whether the LBRR(T-1) flag is TRUE; if so, proceed to step S410; if not, proceed to step S415.
In step S410, the T-th audio frame is encoded to obtain both its redundant code stream data and its regular code stream data.
In step S411, it is determined whether the T-th audio frame is the first frame of the audio frame sequence; if so, proceed to step S405; if not, proceed to step S412.
In step S412, it is determined whether the T-th audio frame is the second frame of the audio frame sequence; if so, proceed to step S413; if not, proceed to step S414.
In step S413, source redundancy processing is performed on the regular code stream data of the T-th audio frame and the redundant code stream data of the (T-1)-th audio frame to obtain the encoded data of the T-th audio frame.
In step S414, source redundancy processing is performed on the regular code stream data of the T-th audio frame, the redundant code stream data of the (T-1)-th audio frame, and the redundant code stream data of the (T-2)-th audio frame to obtain the encoded data of the T-th audio frame.
In a specific implementation, a range encoding algorithm can be used to perform source redundancy processing on the regular code stream data of the T-th audio frame, the redundant code stream data of the (T-1)-th audio frame, and the redundant code stream data of the (T-2)-th audio frame. Fig. 5 is a schematic diagram of the compression result of the code stream data related to each audio frame provided by an embodiment of the present disclosure, where 100 represents the regular code stream data of the T-th audio frame, 99(T-1) represents the redundant code stream data of the (T-1)-th audio frame, and 98(T-2) represents the redundant code stream data of the (T-2)-th audio frame.
In step S415, the T-th audio frame is encoded to obtain both its redundant code stream data and its regular code stream data.
In step S416, it is determined whether the T-th audio frame is the first or the second frame of the audio frame sequence; if so, proceed to step S405; if not, proceed to step S417.
In step S417, source redundancy processing is performed on the regular code stream data of the T-th audio frame and the redundant code stream data of the (T-2)-th audio frame to obtain the encoded data of the T-th audio frame.
In step S418, the encoded data of the T-th audio frame is output.
In the embodiments of the present disclosure, when the T-th audio frame is encoded, both the (T-1)-th and the (T-2)-th audio frames can be protected flexibly through in-band forward error correction (FEC) redundancy, yielding a more flexible and more powerful Opus in-band FEC strategy. Moreover, while keeping the level of packet-loss resilience unchanged, the efficient range encoding inside Opus reduces the bitrate consumed by joint source-channel optimization, thereby saving bitrate. The decision flow of Fig. 4 is condensed in the sketch below.
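The following sketch condenses the branching of steps S402-S417 into one function. The helpers encode_regular, encode_redundant, and source_redundancy_combine are hypothetical stand-ins for the first encoding processing, the second encoding processing, and the range-coded combination; t is the 1-based frame index, and none of these names refer to real Opus APIs.

```c
typedef struct Frame  Frame;   /* opaque placeholders for this sketch */
typedef struct Stream Stream;
typedef struct Packet Packet;

Stream *encode_regular(const Frame *f);    /* hypothetical: first encoding  */
Stream *encode_redundant(const Frame *f);  /* hypothetical: second encoding */
Packet *source_redundancy_combine(const Stream *regular,
                                  const Stream *red_t1,  /* frame T-1 or NULL */
                                  const Stream *red_t2); /* frame T-2 or NULL */

Packet *encode_frame_t(const Frame *f, int t, int lbrr_t1, int lbrr_t2,
                       const Stream *cached_t1, const Stream *cached_t2) {
    Stream *regular = encode_regular(f);            /* S404 / S406 / S410 / S415 */
    if (lbrr_t1 || lbrr_t2)
        (void)encode_redundant(f);  /* also produce and cache frame T's own
                                       redundant stream for later frames */

    if (!lbrr_t1 && !lbrr_t2)                       /* S404 -> S405             */
        return source_redundancy_combine(regular, NULL, NULL);
    if (t == 1)                                     /* S407 / S411 / S416       */
        return source_redundancy_combine(regular, NULL, NULL);
    if (lbrr_t1 && lbrr_t2 && t >= 3)               /* S414: combine all three  */
        return source_redundancy_combine(regular, cached_t1, cached_t2);
    if (lbrr_t1)                                    /* S408 / S413              */
        return source_redundancy_combine(regular, cached_t1, NULL);
    if (t >= 3)                                     /* S417: T-2 only           */
        return source_redundancy_combine(regular, NULL, cached_t2);
    return source_redundancy_combine(regular, NULL, NULL); /* S416, t == 2      */
}
```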
Based on the same technical concept, the embodiments of the present disclosure further provide an audio encoding apparatus. Since the principle by which the apparatus solves the problem is similar to that of the audio encoding method, the implementation of the apparatus can refer to that of the method, and repeated details are omitted. Fig. 6 is a schematic structural diagram of an audio encoding apparatus provided by an embodiment of the present disclosure, including an obtaining module 601, an encoding module 602, a redundancy processing module 603, and an output module 604.
An obtaining module 601, configured to obtain a target audio frame and coding redundancy information, where the coding redundancy information includes at least first redundancy indication information used to indicate whether a first audio frame is a redundant frame of the target audio frame, the first audio frame having been encoded and being separated from the target audio frame by at least one audio frame;
an encoding module 602, configured to perform first encoding processing on the target audio frame to obtain regular code stream data of the target audio frame;
a redundancy processing module 603, configured to, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, perform source redundancy processing on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain encoded data of the target audio frame, where the redundant code stream data of the first audio frame is obtained by performing second encoding processing on the first audio frame, and the degree of audio compression of the second encoding processing is higher than that of the first encoding processing;
an output module 604, configured to output the encoded data of the target audio frame.
In some possible embodiments, the first audio frame is separated from the target audio frame by one audio frame.
In some possible embodiments, the coding redundancy information further includes second redundancy indication information used to indicate whether a second audio frame is a redundant frame of the target audio frame, the second audio frame being located between the first audio frame and the target audio frame, and the redundancy processing module 603 is further configured to:
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame and the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, perform source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame, where the redundant code stream data of the second audio frame is obtained by performing second encoding processing on the second audio frame.
In some possible embodiments, the redundancy processing module 603 is specifically configured to:
perform source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame by using a range encoding algorithm to obtain the encoded data of the target audio frame.
In some possible embodiments, the apparatus further includes a storage module 605:
the encoding module 602 is further configured to, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, or the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, perform second encoding processing on the target audio frame to obtain redundant code stream data of the target audio frame;
the storage module 605 is configured to store redundant code stream data of the target audio frame.
The division of the modules in the embodiments of the present disclosure is schematic and is merely a division by logical function; another division manner may be used in actual implementation. In addition, the functional modules in the embodiments of the present disclosure may be integrated in one processor, may exist alone physically, or two or more modules may be integrated in one module. The modules may be coupled to each other through interfaces, which are typically electrical communication interfaces, though mechanical or other forms of interfaces are not excluded. Thus, modules described as separate components may or may not be physically separate, and may be located in one place or distributed in different locations on the same or different devices. An integrated module may be implemented in hardware or as a software functional module.
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. The electronic device includes a transceiver 701 and a processor 702, where the processor 702 may be a Central Processing Unit (CPU), a microprocessor, an application-specific integrated circuit, a programmable logic circuit, a large-scale integrated circuit, or a digital processing unit. The transceiver 701 is used for data transmission and reception between the electronic device and other devices.
The electronic device may further include a memory 703 for storing the software instructions executed by the processor 702; it may of course also store other data required by the electronic device, such as identification information of the electronic device, encryption information of the electronic device, and user data. The memory 703 may be a volatile memory such as a Random-Access Memory (RAM), or a non-volatile memory such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD), or a Solid-State Drive (SSD). More generally, the memory 703 may be any medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, or a combination of the above memories.
The specific connection medium among the processor 702, the memory 703, and the transceiver 701 is not limited in the embodiments of the present disclosure. Fig. 7 takes as an example only the case where the memory 703, the processor 702, and the transceiver 701 are connected by a bus 704, shown as a thick line; the connection manner between other components is merely illustrative and not limiting. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in Fig. 7, but this does not mean that there is only one bus or one type of bus.
The processor 702 may be dedicated hardware or a processor that runs software. When the processor 702 runs software, it reads the software instructions stored in the memory 703 and, driven by those instructions, executes the audio encoding method described in the foregoing embodiments.
The disclosed embodiments also provide a storage medium, and when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is capable of executing the audio encoding method referred to in the foregoing embodiments.
In some possible embodiments, the various aspects of the audio encoding method provided by the present disclosure may also be implemented in the form of a program product, which includes program code therein, and when the program product is run on an electronic device, the program code is configured to cause the electronic device to execute the audio encoding method referred to in the foregoing embodiments.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable Disk, a hard Disk, a RAM, a ROM, an Erasable Programmable Read-Only Memory (EPROM), a flash Memory, an optical fiber, a Compact Disk Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The program product for audio encoding in the embodiments of the present disclosure may be a CD-ROM and include program code, and may be run on a computing device. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, Radio Frequency (RF), etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the latter case, the remote computing device may be connected to the user's computing device over any kind of network, such as a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., over the Internet through an Internet service provider).
It should be noted that although several units or sub-units of the apparatus are mentioned in the detailed description above, such division is merely exemplary and not mandatory. Indeed, according to embodiments of the present disclosure, the features and functions of two or more units described above may be embodied in one unit; conversely, the features and functions of one unit described above may be further divided and embodied by a plurality of units.
Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present disclosure have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the disclosure.
It will be apparent to those skilled in the art that various changes and modifications can be made in the present disclosure without departing from the spirit and scope of the disclosure. Thus, if such modifications and variations of the present disclosure fall within the scope of the claims of the present disclosure and their equivalents, the present disclosure is intended to include such modifications and variations as well.

Claims (10)

1. An audio encoding method, comprising:
acquiring a target audio frame and coding redundancy information, wherein the coding redundancy information includes at least first redundancy indication information, the first redundancy indication information is used to indicate whether a first audio frame is a redundant frame of the target audio frame, and the first audio frame has been encoded and is separated from the target audio frame by at least one audio frame;
performing first encoding processing on the target audio frame to obtain regular code stream data of the target audio frame;
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, performing source redundancy processing on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain encoded data of the target audio frame, wherein the redundant code stream data of the first audio frame is obtained by performing second encoding processing on the first audio frame, and the degree of audio compression of the second encoding processing is higher than that of the first encoding processing;
and outputting the encoded data of the target audio frame.
2. The method of claim 1, wherein the first audio frame is separated from the target audio frame by one audio frame.
3. The method of claim 1 or 2, wherein the coding redundancy information further comprises second redundancy indication information used to indicate whether a second audio frame is a redundant frame of the target audio frame, the second audio frame being located between the first audio frame and the target audio frame, and the method further comprises:
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame and the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, performing source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame, wherein the redundant code stream data of the second audio frame is obtained by performing second encoding processing on the second audio frame.
4. The method of claim 3, wherein performing source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame comprises:
performing source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame by using a range encoding algorithm to obtain the encoded data of the target audio frame.
5. The method of claim 3, further comprising:
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, or the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, performing second encoding processing on the target audio frame to obtain redundant code stream data of the target audio frame;
and storing the redundant code stream data of the target audio frame.
6. An audio encoding apparatus, comprising:
an obtaining module, configured to obtain a target audio frame and coding redundancy information, where the coding redundancy information includes at least first redundancy indication information used to indicate whether a first audio frame is a redundant frame of the target audio frame, the first audio frame having been encoded and being separated from the target audio frame by at least one audio frame;
an encoding module, configured to perform first encoding processing on the target audio frame to obtain regular code stream data of the target audio frame;
a redundancy processing module, configured to, in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame, perform source redundancy processing on the regular code stream data of the target audio frame and the redundant code stream data of the first audio frame to obtain encoded data of the target audio frame, where the redundant code stream data of the first audio frame is obtained by performing second encoding processing on the first audio frame, and the degree of audio compression of the second encoding processing is higher than that of the first encoding processing;
and an output module, configured to output the encoded data of the target audio frame.
7. The apparatus of claim 6, wherein the first audio frame is separated from the target audio frame by one audio frame.
8. The apparatus of claim 6 or 7, wherein the coding redundancy information further comprises second redundancy indication information used to indicate whether a second audio frame is a redundant frame of the target audio frame, the second audio frame being located between the first audio frame and the target audio frame, and the redundancy processing module is further configured to:
in response to the first redundancy indication information indicating that the first audio frame is a redundant frame of the target audio frame and the second redundancy indication information indicating that the second audio frame is a redundant frame of the target audio frame, perform source redundancy processing on the regular code stream data of the target audio frame, the redundant code stream data of the first audio frame, and the redundant code stream data of the second audio frame to obtain the encoded data of the target audio frame, where the redundant code stream data of the second audio frame is obtained by performing second encoding processing on the second audio frame.
9. An electronic device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
10. A storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the method of any of claims 1-5.
CN202110890333.6A 2021-08-04 2021-08-04 Audio coding method, device, electronic equipment and storage medium Active CN113744744B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110890333.6A 2021-08-04 2021-08-04 Audio coding method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110890333.6A 2021-08-04 2021-08-04 Audio coding method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113744744A 2021-12-03
CN113744744B 2023-10-03

Family

ID=78730152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110890333.6A (Active) Audio coding method, device, electronic equipment and storage medium

Country Status (1)

Country: CN (CN113744744B)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109524015A (en) * 2017-09-18 2019-03-26 杭州海康威视数字技术股份有限公司 Audio coding method, coding/decoding method, device and audio coding and decoding system

Also Published As

Publication number Publication date
CN113744744B 2023-10-03

Similar Documents

Publication Publication Date Title
US9311049B2 (en) System to improve numereical conversions and associated methods
JP7434405B2 (en) Code block division method, terminal, base station and computer readable storage medium
CN104618734A (en) Video code stream transcoding method and device under same protocol type
US11869523B2 (en) Method and apparatus for decoding a bitstream including encoded higher order ambisonics representations
US10601441B2 (en) Efficient software closing of hardware-generated encoding context
CN110933436A (en) Image encoding method, image encoding device, computer device, and storage medium
CN110970039A (en) Audio transmission method and device, electronic equipment and storage medium
CN114039919A (en) Traffic scheduling method, medium, device and computing equipment
US11051080B2 (en) Method for improving video resolution and video quality, encoder, and decoder
CN113744744B (en) Audio coding method, device, electronic equipment and storage medium
CN112463391A (en) Memory control method, memory control device, storage medium and electronic equipment
US10477245B2 (en) Methods and devices for coding and decoding depth information, and video processing and playing device
CN110944197A (en) Method and device for coding images and audios
CN110545107B (en) Data processing method and device, electronic equipment and computer readable storage medium
CN114333862A (en) Audio encoding method, decoding method, device, equipment, storage medium and product
EP3595400A1 (en) Message processing method and electronic device supporting the same
US10585626B2 (en) Management of non-universal and universal encoders
CN111342981A (en) Arbitration method between devices in local area network environment, electronic device and local area network system
CN111722785A (en) Cache updating method and device
US8823557B1 (en) Random extraction from compressed data
CN112650528B (en) Personalized algorithm generation method, device, electronic equipment and computer readable medium
CN111639055B (en) Differential packet calculation method, differential packet calculation device, differential packet calculation equipment and storage medium
CN116361254B (en) Image storage method, apparatus, electronic device, and computer-readable medium
US11494100B2 (en) Method, device and computer program product for storage management
WO2024067771A1 (en) Encoding method, decoding method, encoding apparatus, decoding apparatus, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant