WO2021179321A1 - Audio data processing method, electronic device and computer-readable storage medium - Google Patents

Audio data processing method, electronic device and computer-readable storage medium

Info

Publication number
WO2021179321A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
segments
video
data
multiple audio
Prior art date
Application number
PCT/CN2020/079342
Other languages
French (fr)
Chinese (zh)
Inventor
周事成
薛政
Original Assignee
深圳市大疆创新科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to CN202080004659.8A priority Critical patent/CN112771880A/en
Priority to PCT/CN2020/079342 priority patent/WO2021179321A1/en
Publication of WO2021179321A1 publication Critical patent/WO2021179321A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334Recording operations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/76Television signal recording

Definitions

  • the present disclosure relates to the field of computer technology, and more specifically, to an audio data processing method, an electronic device, and a computer-readable storage medium.
  • the recorded audio and video files are usually saved in segments. Segmented storage can ensure that during the audio and video recording process, even if the audio and video files are damaged, the damaged file is only one segment, not all the audio and video files.
  • when segmented audio and video files are used, for example during editing or playback, they need to be synthesized to obtain a complete audio and video file.
  • in the synthesis process, the audio files and the video files need to be synthesized separately.
  • during synthesis of the audio files, the audio decoding algorithm can introduce discontinuities into the synthesized audio file, which makes the synthesized audio stutter or sound abnormal.
  • an audio data processing method is provided, including: acquiring multiple audio coding segments, where the multiple audio coding segments are recorded by an audio and video device and saved in segments; splicing the multiple audio coding segments to obtain spliced data; and decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.
  • by splicing the multiple audio coding segments into spliced data and decoding the spliced data, the method avoids the loss of decoded data that occurs when the segments are decoded separately, because the decoding algorithm depends on the characteristics of the preceding and following frame data; this in turn avoids the audio distortion caused by discontinuities of the obtained audio synthesis data at the splice points.
  • an electronic device is provided, including: a processor; and a memory for storing executable instructions of the processor, where the processor is configured to execute the executable instructions to perform an audio data processing method.
  • the audio data processing method includes: acquiring multiple audio coding segments, where the multiple audio coding segments are recorded by an audio and video device and saved in segments; splicing the multiple audio coding segments to obtain spliced data; and decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.
  • a computer-readable storage medium is provided, on which a computer program is stored, where, when the computer program is executed by a processor, the audio data processing method described in the first aspect of the embodiments of the present disclosure is implemented.
  • Fig. 1 is an architecture diagram of an audio data processing system according to an exemplary embodiment of the present disclosure
  • Fig. 2 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure
  • Fig. 3 is a sub-flow chart of step S22 in an exemplary embodiment of the present disclosure.
  • Fig. 4 is a sub-flow chart of step S21 in an exemplary embodiment of the present disclosure.
  • Fig. 5 is a sub-flow chart of step S22 in an exemplary embodiment of the present disclosure.
  • Fig. 6 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure
  • Fig. 7 is a sub-flow chart of step S211 in an exemplary embodiment of the present disclosure.
  • Fig. 8 is a sub-flow chart of step S211 in an exemplary embodiment of the present disclosure.
  • Fig. 9 is a schematic diagram of an audio data processing method according to an exemplary embodiment of the present disclosure.
  • Fig. 10 is an effect comparison diagram of an audio data processing method according to another exemplary embodiment of the present disclosure.
  • FIG. 11 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
  • Fig. 1 is an architectural diagram of an audio data processing system according to an exemplary embodiment of the present disclosure.
  • the terminal device 101 and the terminal device 102 are connected to the server 105 through a network 104, and the audio and video device 103 is connected to the terminal device 101 and the terminal device 102 through the network 104.
  • the terminal devices 101 and 102 may be, for example, but not limited to, mobile phones, computers, tablet computers, handheld terminals, and the like.
  • the server 105 may be a server that provides various services, for example, a background management server that provides support for the audio data processing system operated by the user using the terminal devices 101 and 102 (just an example).
  • the back-end server can analyze and process the received multiple audio coding segments and other data, and feed back the processing result (for example, audio synthesis data—just an example) to the terminal device.
  • the terminal device 101 (or the terminal device 102) may, for example, obtain multiple audio coding segments, which are recorded by the audio and video device 103 and saved in segments; the terminal device 101 may, for example, splice the multiple audio coding segments to obtain spliced data; and the terminal device 101 may, for example, decode the spliced data to obtain audio synthesis data of the multiple audio coding segments.
  • the terminal device 101 (may also be the terminal device 102) can obtain multiple audio and video mixed stream data segments from the audio and video device 103, and perform audio extraction on the multiple audio and video mixed stream data segments respectively to obtain the multiple audio coded segments.
  • the terminal device 101 (or the terminal device 102) can receive an editing instruction sent by a target object, where the editing instruction includes editing object information, and acquire, from the audio and video device 103, multiple audio and video mixed stream data segments corresponding to the editing object information; the multiple audio and video mixed stream data segments are recorded by the audio and video device 103 and saved in segments.
  • the terminal device 101 (may also be the terminal device 102) can receive the download instruction sent by the target object; download multiple audio and video mixed stream data segments corresponding to the download instruction from the audio and video device 103 and save them locally.
  • the server 105 may be a single physical server or may be composed of multiple servers. A part of the server 105 may, for example, serve as the audio data processing task submission system of the present disclosure, for obtaining tasks that will execute audio data processing commands; another part of the server 105 may, for example, serve as the audio data processing system of the present disclosure, for obtaining multiple audio coding segments that are recorded by the audio and video device and saved in segments, splicing the multiple audio coding segments to obtain spliced data, and decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.
  • the terminal device 101/terminal device 102 and the audio and video device 103 can transmit data through wireless transmission, such as WiFi, Bluetooth, zigbee, and so on.
  • the terminal device 101/the terminal device 102 and the server 105 can communicate with each other through traditional 4G, 5G, WiFi, or the Internet.
  • Fig. 2 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure.
  • the audio data processing method provided by the embodiments of the present disclosure can be executed by any electronic device with computing and processing capabilities, such as the terminal devices 101 and 102 and/or the server 105.
  • the audio data processing method 20 provided by the embodiment of the present disclosure may include:
  • Step S21 Obtain multiple audio coding segments, the multiple audio coding segments are recorded by the audio and video equipment and stored in segments.
  • Step S22 splicing multiple audio coding segments to obtain spliced data.
  • Step S23 Decoding the spliced data to obtain audio synthesis data of multiple audio coding segments.
  • multiple audio coding segments can be obtained by downloading them after communicating with the audio and video equipment.
  • the process of communicating with the audio and video equipment and downloading can be triggered by a local designated command, by a clock cycle, or by a download instruction actively sent by the audio and video equipment, which is not specifically limited in the present disclosure.
  • the audio and video equipment can be, for example, but not limited to, video recording equipment, unmanned aerial vehicles (equipped with cameras), and the like.
  • the encoding format of the audio coding segments may be one of the Adaptive Multi-Rate (AMR) format, the Advanced Audio Coding (AAC) format, the OPUS format, and the like, which is not specifically limited in the present disclosure.
  • step S22 the splicing method for splicing multiple audio coding segments may be end-to-end splicing.
  • the decoding mode may be an audio decoding mode, wherein the audio decoding mode corresponding to the audio coding format can be selected according to the different audio coding format.
  • by splicing the multiple audio coding segments to obtain spliced data and decoding the spliced data, the audio data processing method avoids the loss of decoded data that occurs when the segments are decoded separately because the decoding algorithm depends on the characteristics of the preceding and following frame data, and thereby avoids the audio distortion caused by discontinuities of the obtained audio synthesis data at the splice points.
  • Fig. 3 is a sub-flow chart of step S22 in an exemplary embodiment of the present disclosure.
  • step S22 may include:
  • Step S221 Sort the multiple audio coding segments according to the arrangement information of the multiple audio coding segments.
  • in step S222, the multiple audio coding segments are spliced end to end according to the sorting result to obtain the spliced data.
  • the arrangement information of the multiple audio coding segments may be one of recording time information and number information of the multiple audio coding segments.
  • the recording time information of the audio coding segment refers to the recording time information of the start position, the end position, or the designated middle position of the audio coding segment during the recording process.
  • the number information of the audio code segment refers to the number information added by the recording device according to the recording order of the audio code segment during the recording process.
  • in an exemplary embodiment of the present disclosure, the arrangement information of the multiple audio coding segments is recording time information, and step S221 may include: sorting the multiple audio coding segments in chronological order according to the recording time information of the multiple audio coding segments.
  • in an exemplary embodiment of the present disclosure, the arrangement information of the multiple audio coding segments is number information, and step S221 may include: sorting the multiple audio coding segments in number order according to the number information of the multiple audio coding segments.
  • Fig. 4 is a sub-flow chart of step S21 in an exemplary embodiment of the present disclosure.
  • step S21 may include:
  • Step S211 Obtain multiple audio and video mixed stream data segments.
  • Step S212 Perform audio extraction on multiple audio and video mixed stream data segments respectively to obtain multiple audio coded segments.
  • multiple audio and video mixed stream data segments may be recorded by audio and video equipment and saved in segments.
  • for example, for audio and video data that occupies a large storage space, in order to reduce video file damage caused by factors such as recording equipment failure, the data can be segmented during recording to obtain multiple audio and video mixed stream data segments.
  • the multiple audio and video mixed stream data segments can be downloaded after communicating with the audio and video equipment.
  • the process of communicating with the audio and video equipment and downloading can be triggered by a local designated command, by a clock cycle, or by a download instruction actively sent by the audio and video equipment, which is not specifically limited in the present disclosure.
  • the audio and video equipment can be, for example, but not limited to, video recording equipment, unmanned aerial vehicles (equipped with cameras), and the like.
  • step S212 after audio extraction is performed on each audio-video mixed stream data segment, the audio coding segment of each audio-video mixed stream data segment can be obtained.
  • the multiple audio coding segments may be, for example, x1(t), x2(t), x3(t), etc., where 0 < t < T and T is the segment period.
  • Fig. 9 is a schematic diagram of an audio data processing method according to an exemplary embodiment of the present disclosure.
  • steps S22 and S23 based on this embodiment, multiple audio coding segments x1(t), x2(t), x3(t), etc. are spliced to obtain spliced data x(t),
  • the spliced data x(t) is decoded to obtain audio synthesis data y(t) of multiple audio coding segments x1(t), x2(t), x3(t), etc.
  • Fig. 5 is a sub-flow chart of step S22 in an exemplary embodiment of the present disclosure.
  • step S22 may include:
  • Step S51 Determine the arrangement information of the multiple audio coding segments according to the arrangement information of the multiple audio and video mixed stream data segments.
  • Step S52 Sort the multiple audio coding segments according to the arrangement information of the multiple audio coding segments.
  • in step S53, the multiple audio coding segments are spliced end to end according to the sorting result to obtain the spliced data.
  • the arrangement information of multiple audio and video mixed stream data segments is one of recording time information and number information.
  • the recording time information of the audio and video mixed stream data segment refers to the recording time information of the start position, the end position, or a designated position in the middle of the audio and video mixed stream data segment during the recording process.
  • the number information of the audio and video mixed stream data segment refers to the number information added by the recording device according to the recording order of the audio and video mixed stream data segment during the recording process.
  • the arrangement information of each audio and video mixed stream data segment may be used as the arrangement information of the audio coding segment obtained by audio extraction from that segment. For example, if audio extraction is performed on the audio and video mixed stream data segment A to obtain the audio coding segment a, the arrangement information of the audio and video mixed stream data segment A is used as the arrangement information of the audio coding segment a.
  • in step S52, when the arrangement information is recording time information, the multiple audio and video mixed stream data segments may be sorted in chronological order according to the recording time information of the multiple audio and video mixed stream data segments.
  • when the arrangement information is number information, the multiple audio and video mixed stream data segments can be sorted in number order according to the number information of the multiple audio and video mixed stream data segments.
  • Fig. 6 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure.
  • the audio data processing method based on the foregoing embodiment may further include:
  • Step S61 Perform video extraction on multiple audio and video mixed stream data segments respectively to obtain multiple video coded segments.
  • Step S62 Generate video synthesis data according to multiple video coding segments.
  • Step S63 Generate audio and video synthesized data according to the audio synthesized data and the video synthesized data.
  • in step S62, the multiple video coding segments may be spliced to generate the video synthesis data; a sketch of steps S61 to S63 is given below.
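As a rough illustration of steps S61 to S63, the following Python sketch demuxes the video track of each mixed-stream segment, splices the video segments, and muxes the video synthesis data with the audio synthesis data. It assumes the ffmpeg command-line tool is available and that the segments are stored in a container such as MP4; neither assumption comes from the disclosure, and the final audio codec (AAC) is illustrative only.

```python
import subprocess
from pathlib import Path
from typing import List

def extract_video_tracks(av_segments: List[Path], out_dir: Path) -> List[Path]:
    # Step S61: demux the video track of each segment (-an drops the audio track).
    out_dir.mkdir(parents=True, exist_ok=True)
    video_segments = []
    for segment in av_segments:
        out = out_dir / (segment.stem + "_video.mp4")
        subprocess.run(
            ["ffmpeg", "-y", "-i", str(segment), "-an", "-vcodec", "copy", str(out)],
            check=True,
        )
        video_segments.append(out)
    return video_segments

def splice_video(video_segments: List[Path], merged: Path) -> None:
    # Step S62: splice the video segments with ffmpeg's concat demuxer.
    listing = merged.with_suffix(".txt")
    listing.write_text("".join(f"file '{p.resolve()}'\n" for p in video_segments))
    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0", "-i", str(listing),
         "-c", "copy", str(merged)],
        check=True,
    )

def mux_audio_video(video: Path, audio: Path, out: Path) -> None:
    # Step S63: combine the video synthesis data and the audio synthesis data.
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(video), "-i", str(audio),
         "-c:v", "copy", "-c:a", "aac", str(out)],
        check=True,
    )
```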
  • Fig. 7 is a sub-flow chart of step S211 in an exemplary embodiment of the present disclosure.
  • step S211 may include:
  • Step S2111 Receive an editing instruction sent by the target object, where the editing instruction includes editing object information.
  • Step S2112 Obtain multiple audio and video mixed stream data segments corresponding to the editing object information from the audio and video equipment.
  • the multiple audio and video mixed stream data segments are recorded by the audio and video equipment and stored in segments.
  • the target object may be, for example, an operation object of the execution device of the audio data processing method of the embodiment of the present disclosure.
  • when the execution device is the terminal device 101 or 102, the target object may be the operating user of the terminal device 101 or 102, and the terminal device 101 or 102 may generate the editing instruction according to a preset operation of the user.
  • when the execution device is the server 105, the target object may still be the operating user of the terminal device 101 or 102, and the terminal device 101 or 102 can generate the editing instruction according to the preset operation and send it to the server 105 via the network 104.
  • the editing object information is used to determine the multiple audio and video mixed stream data segments, and may be identification information, storage address information, etc. of the multiple audio and video mixed stream data segments. For example, if the audio and video equipment records and saves multiple audio and video mixed stream data segments z1(t), z2(t), z3(t) ... whose identification information is z, the editing object information can include the identification information z. For another example, if the storage address of the multiple audio and video mixed stream data segments z1(t), z2(t), z3(t) ... is C:\download, the editing object information may include the storage address information C:\download. A sketch of resolving such editing object information to segment files is given below.
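A minimal sketch of resolving the editing object information to local files might look as follows; the naming convention (segments stored as "<identifier><index>.mp4" under the storage address) and the use of MP4 files are illustrative assumptions, not details given by the disclosure.

```python
from pathlib import Path
from typing import List

def resolve_segments(storage_dir: Path, identifier: str) -> List[Path]:
    # Collect the mixed-stream segments named "<identifier><index>.mp4"
    # (e.g. z1.mp4, z2.mp4, z3.mp4 ...) under the storage address carried by
    # the editing object information, ordered by their index.
    return sorted(
        storage_dir.glob(f"{identifier}*.mp4"),
        key=lambda p: int(p.stem[len(identifier):] or 0),
    )

# Example: resolve_segments(Path(r"C:\download"), "z")
```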
  • the audio and video equipment may be, for example, but not limited to, an unmanned aerial vehicle (equipped with a camera), a video recorder, and the like.
  • the execution device of the audio data processing method of the embodiment of the present disclosure may communicate with the audio and video device to obtain multiple audio and video mixed stream data segments corresponding to the editing object information through the communication interface.
  • for step S2112, when the audio and video equipment records and saves the audio and video in segments, the recorded audio and video can be segmented according to the segment period to obtain the multiple audio and video mixed stream data segments.
  • the audio data processing method may further include: performing an editing operation on the audio and video synthesized data in response to the editing instruction to obtain the edited audio and video synthesized data.
  • the editing instructions are used to perform audio and video editing operations on the audio and video synthesized data.
  • the editing operation indicated by the editing instruction may be the audio data processing method of the embodiment of the present disclosure, for example splicing.
  • when the editing instruction is a splicing instruction, the terminal device obtains, from the audio and video equipment, the multiple audio and video mixed stream data segments corresponding to the editing object information input by the user.
  • the audio data processing method of the embodiment of the present disclosure can then be executed on the multiple audio and video mixed stream data segments according to the editing instruction to obtain the spliced audio and video synthesis data.
  • the audio and video synthesis data can also be processed according to other editing operations input by the user, such as color correction, cutting, speed change, reverse playback, copying, and so on.
  • for example, the audio and video synthesis data can be color-corrected, cut, speed-changed, reversed, or copied according to the other editing instructions to obtain the edited audio and video synthesis data.
  • Fig. 8 is a sub-flow chart of step S211 in an exemplary embodiment of the present disclosure.
  • step S211 may include:
  • Step S81 receiving a download instruction sent by the target object.
  • Step S82 Download multiple audio and video mixed stream data segments corresponding to the download instruction from the audio and video equipment and save them locally.
  • the target object may be, for example, an operation object of the execution device of the audio data processing method of the embodiment of the present disclosure.
  • when the execution device of the audio data processing method of the embodiment of the present disclosure is the terminal device 101 or 102, the target object may be the operating user of the terminal device 101 or 102, and the terminal device 101 or 102 may generate the download instruction according to a preset operation of the user.
  • when the execution device is the server 105, the target object may still be the operating user of the terminal device 101 or 102, and the terminal device 101 or 102 may generate the download instruction according to the preset operation and send it to the server 105 via the network 104.
  • multiple audio and video mixed stream data segments may be obtained by downloading after communicating with audio and video equipment, which is not specifically limited in the present disclosure.
  • the audio and video equipment includes, but is not limited to, an unmanned aerial vehicle (equipped with a camera), a video recorder, and the like.
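For step S82, one possible sketch of downloading the segments and saving them locally is shown below; it assumes the device exposes the segment files over HTTP and that the requests library is available, which the disclosure does not specify.

```python
import requests
from pathlib import Path
from typing import List

def download_segments(segment_urls: List[str], local_dir: Path) -> List[Path]:
    # Step S82: fetch each mixed-stream segment advertised by the audio and
    # video device and save it locally, preserving the file names.
    local_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in segment_urls:
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        path = local_dir / url.rsplit("/", 1)[-1]
        path.write_bytes(response.content)
        saved.append(path)
    return saved
```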
  • the format of the multiple audio encoding segments is one of the AMR audio encoding format, the AAC audio encoding format, and the OPUS audio encoding format.
  • an editing instruction sent by the target object may also be received, and the editing instruction includes editing object information.
  • an editing operation is performed on the audio and video synthesized data corresponding to the editing object information to obtain the edited audio and video synthesized data.
  • Editing operations can be, for example, but not limited to, splicing, toning, cutting, shifting, rewinding, copying, etc.
  • Fig. 10 is an effect comparison diagram of an audio data processing method according to another exemplary embodiment of the present disclosure.
  • the original audio data is 16 kHz, 16-bit, single-channel data, the frame step is 20 ms, and the mode 8 configuration is adopted; that is, every 20 ms of original data is encoded into 61 bytes of binary data.
  • Fig. 10(a) is a time-frequency diagram of original audio data of an exemplary embodiment of the present disclosure.
  • the original audio data is single-frequency data with a duration of 1s, corresponding to 50 frames.
  • the audio synthesis data 101, obtained by independently decoding the audio coding segment amr1 and the audio coding segment amr2 into PCM (pulse-code modulation) data and then combining the decoded data, is shown in Fig. 10(b).
  • the audio coding segment amr1 and the audio coding segment amr2 are spliced according to the audio data processing method of the embodiment of the present disclosure and then decoded to obtain the audio synthesis data 102 shown in Fig. 10(c). Comparing Figs. 10(b) and 10(c), it can be seen that the audio synthesis data obtained by splicing and then decoding, shown in Fig. 10(c), is closer to the original audio data, while the audio synthesis data obtained by decoding separately and then splicing, shown in Fig. 10(b), exhibits severe distortion by comparison.
  • this is because the AMR codec uses historical information: the encoding and decoding of the current frame uses data of the previous frame or even earlier frames. Because the audio is stored in segments and the device decodes the audio of each segment separately, the data of the previous segment is not available when the first audio data of the next segment is decoded, so decoding errors occur at the beginning of each segment.
  • in the embodiments of the present disclosure, the audio segments are spliced first and decoded after splicing, which avoids the error problem of segment-by-segment decoding and makes decoding more reliable; a file-level sketch of such splicing for AMR-WB data is given below.
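The following sketch illustrates the splice-then-decode idea at the file level for AMR-WB segments such as amr1 and amr2. It assumes the segments are stored in the AMR file storage format of RFC 4867, whose single-channel wideband magic header is "#!AMR-WB\n"; the header is kept once and only frame data is appended, so a standard AMR-WB decoder can then decode the spliced file in one pass.

```python
from pathlib import Path
from typing import List

AMR_WB_MAGIC = b"#!AMR-WB\n"  # single-channel AMR-WB file header (RFC 4867)

def splice_amr_wb_files(segment_files: List[Path], out_file: Path) -> None:
    # Write the magic header once, then append only the frame data of every
    # segment, so the result is one continuous AMR-WB file that a decoder can
    # process in a single pass with its inter-frame history intact.
    with out_file.open("wb") as out:
        out.write(AMR_WB_MAGIC)
        for path in segment_files:
            data = path.read_bytes()
            if data.startswith(AMR_WB_MAGIC):
                data = data[len(AMR_WB_MAGIC):]
            out.write(data)

# Example: splice_amr_wb_files([Path("amr1.awb"), Path("amr2.awb")], Path("spliced.awb"))
```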
  • Electronic devices may include, but are not limited to, smart phones, tablet computers, portable computers, desktop computers, wearable devices, virtual reality devices, smart homes, etc., for example.
  • FIG. 11 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
  • the electronic device 1100 may include: a processor 1110; and a memory 1120 configured to store executable instructions of the processor 1110.
  • the processor 1110 is configured to execute an audio data processing method by executing the executable instructions, where the audio data processing method includes: acquiring multiple audio coding segments, which are recorded by an audio and video device and saved in segments; splicing the multiple audio coding segments to obtain spliced data; and decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.
  • splicing the multiple audio coding segments to obtain the spliced data includes: sorting the multiple audio coding segments according to arrangement information of the multiple audio coding segments; and splicing the multiple audio coding segments end to end according to the sorting result to obtain the spliced data.
  • when the arrangement information of the multiple audio coding segments is recording time information, sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments includes: sorting the multiple audio coding segments in chronological order according to the recording time information of the multiple audio coding segments.
  • when the arrangement information of the multiple audio coding segments is number information, sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments includes: sorting the multiple audio coding segments in number order according to the number information of the multiple audio coding segments.
  • obtaining multiple audio coding segments includes: obtaining multiple audio and video mixed stream data segments; and respectively performing audio extraction on the multiple audio and video mixed stream data segments to obtain the multiple audio coding segments.
  • splicing the multiple audio coding segments to obtain the spliced data includes: determining arrangement information of the multiple audio coding segments according to arrangement information of the multiple audio and video mixed stream data segments; sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments; and splicing the multiple audio coding segments end to end according to the sorting result to obtain the spliced data.
  • the arrangement information of the multiple audio and video mixed stream data segments is one of recording time information and number information.
  • the audio data processing method further includes: respectively performing video extraction on the multiple audio and video mixed stream data segments to obtain multiple video coding segments; generating video synthesis data according to the multiple video coding segments; and generating audio and video synthesis data according to the audio synthesis data and the video synthesis data.
  • acquiring the multiple audio and video mixed stream data segments includes: receiving, through a user interface, an editing instruction sent by a target object, where the editing instruction includes editing object information; and obtaining, through a communication interface, the multiple audio and video mixed stream data segments corresponding to the editing object information from the audio and video equipment, where the multiple audio and video mixed stream data segments are recorded by the audio and video equipment and saved in segments.
  • the user interface may be, for example, but not limited to, a touch screen, a physical button, a microphone, and so on.
  • the audio data processing method further includes: performing an editing operation on the audio and video synthesized data in response to the editing instruction to obtain the edited audio and video synthesized data.
  • acquiring the multiple audio and video mixed stream data segments includes: pre-downloading the multiple audio and video mixed stream data segments from the audio and video device and saving them locally.
  • the format of the plurality of audio encoding segments is one of an AMR audio encoding format, an AAC audio encoding format, and an OPUS audio encoding format.
  • the processor obtains the spliced data by splicing the multiple audio coding segments and decodes the spliced data, which avoids the loss of decoded data that occurs when the multiple audio coding segments are decoded separately because the decoding algorithm depends on the characteristics of the preceding and following frame data, and thereby avoids audio distortion at the splice points.
  • although modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory.
  • the features and functions of two or more modules or units described above may be embodied in one module or unit.
  • the features and functions of a module or unit described above can be further divided into multiple modules or units to be embodied.
  • the example embodiments described here can be implemented by software, or by combining software with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to make a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) execute the method according to the embodiments of the present disclosure.
  • a computer-readable storage medium on which is stored a program product capable of implementing the above-mentioned method of this specification.
  • various aspects of the present invention may also be implemented in the form of a program product, which includes program code; when the program product runs on a terminal device, the program code is used to enable the terminal device to execute the steps according to the various exemplary embodiments of the present invention described in the above "Exemplary Method" section of this specification.
  • the disclosed electronic device, computer-readable storage medium, and method may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may also be electrical, mechanical or other forms of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present disclosure.
  • the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer readable storage medium.
  • the technical solution of the present disclosure, in essence or in the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present disclosure.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

An audio data processing method, an electronic device and a computer-readable storage medium. The audio data processing method comprises: acquiring a plurality of audio coding segments, the plurality of audio coding segments being recorded and stored in segments by an audio and video device (S21); splicing the plurality of audio coding segments to obtain spliced data (S22); and decoding the spliced data to obtain audio synthesis data of the plurality of audio coding segments (S23). By splicing the plurality of audio coding segments to obtain the spliced data and decoding the spliced data, the method can avoid the loss of decoded data that occurs when the plurality of audio coding segments are decoded separately, because the decoding algorithm depends on the characteristics of the preceding and following frame data, and can thus avoid audio distortion caused by discontinuity of the obtained audio synthesis data at the splicing positions.

Description

Audio data processing method, electronic device and computer-readable storage medium

Technical Field

The present disclosure relates to the field of computer technology, and more specifically, to an audio data processing method, an electronic device, and a computer-readable storage medium.

Background

During audio and video recording, in order to reduce video file damage caused by factors such as recording equipment failure, the recorded audio and video files are usually saved in segments. Segmented storage ensures that, even if an audio and video file is damaged during recording, only one segment is damaged rather than all of the audio and video files.

When segmented audio and video files are used, for example during editing or playback, they need to be synthesized to obtain a complete audio and video file. In this synthesis process, the audio files and the video files need to be synthesized separately. During synthesis of the audio files, the audio decoding algorithm can introduce discontinuities into the synthesized audio file, which makes the synthesized audio stutter or sound abnormal.

Therefore, how to synthesize segmented audio files into a complete and lossless audio file is a problem that urgently needs to be solved.
Summary

According to a first aspect of the present disclosure, an audio data processing method is provided, including: acquiring multiple audio coding segments, where the multiple audio coding segments are recorded by an audio and video device and saved in segments; splicing the multiple audio coding segments to obtain spliced data; and decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.

In the embodiments of the present disclosure, by splicing the multiple audio coding segments into spliced data and decoding the spliced data, the loss of decoded data that occurs when the segments are decoded separately, because the decoding algorithm depends on the characteristics of the preceding and following frame data, is avoided, which in turn avoids the audio distortion caused by discontinuities of the obtained audio synthesis data at the splice points.

According to a second aspect of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing executable instructions of the processor, where the processor is configured to execute the executable instructions to perform an audio data processing method. The audio data processing method includes: acquiring multiple audio coding segments, where the multiple audio coding segments are recorded by an audio and video device and saved in segments; splicing the multiple audio coding segments to obtain spliced data; and decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.

According to a third aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored, where, when the computer program is executed by a processor, the audio data processing method described in the first aspect of the embodiments of the present disclosure is implemented.
Brief Description of the Drawings

In order to explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative work.

Fig. 1 is an architecture diagram of an audio data processing system according to an exemplary embodiment of the present disclosure;

Fig. 2 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure;

Fig. 3 is a sub-flowchart of step S22 in an exemplary embodiment of the present disclosure;

Fig. 4 is a sub-flowchart of step S21 in an exemplary embodiment of the present disclosure;

Fig. 5 is a sub-flowchart of step S22 in an exemplary embodiment of the present disclosure;

Fig. 6 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure;

Fig. 7 is a sub-flowchart of step S211 in an exemplary embodiment of the present disclosure;

Fig. 8 is a sub-flowchart of step S211 in an exemplary embodiment of the present disclosure;

Fig. 9 is a schematic diagram of an audio data processing method according to an exemplary embodiment of the present disclosure;

Fig. 10 is an effect comparison diagram of an audio data processing method according to another exemplary embodiment of the present disclosure;

Fig. 11 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description

Example embodiments will now be described more fully with reference to the accompanying drawings. However, the example embodiments can be implemented in various forms and should not be construed as being limited to the examples set forth herein; rather, these embodiments are provided so that the present disclosure will be more comprehensive and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures or characteristics can be combined in one or more embodiments in any suitable way. In the following description, many specific details are provided to give a sufficient understanding of the embodiments of the present disclosure. However, those skilled in the art will realize that the technical solutions of the present disclosure can be practiced without one or more of the specific details, or that other methods, components, devices, steps, etc. can be used. In other cases, well-known technical solutions are not shown or described in detail in order to avoid obscuring aspects of the present disclosure.

In addition, the drawings are only schematic illustrations of the present disclosure, and the same reference numerals in the drawings denote the same or similar parts, so their repeated description will be omitted. Some of the blocks shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.

The exemplary embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 is an architecture diagram of an audio data processing system according to an exemplary embodiment of the present disclosure.

As shown in Fig. 1, the terminal device 101 and the terminal device 102 are connected to the server 105 through a network 104, and the audio and video device 103 is connected to the terminal device 101 and the terminal device 102 through the network 104. The terminal devices 101 and 102 may be, for example but not limited to, mobile phones, computers, tablet computers, handheld terminals, and the like. The server 105 may be a server that provides various services, for example, a background management server that provides support for the audio data processing system operated by the user using the terminal devices 101 and 102 (just an example). The background server can analyze and otherwise process the received data, such as multiple audio coding segments, and feed back the processing result (for example, audio synthesis data, just an example) to the terminal device.

The terminal device 101 (or the terminal device 102) may, for example, obtain multiple audio coding segments, which are recorded by the audio and video device 103 and saved in segments; the terminal device 101 may, for example, splice the multiple audio coding segments to obtain spliced data; and the terminal device 101 may, for example, decode the spliced data to obtain audio synthesis data of the multiple audio coding segments.

The terminal device 101 (or the terminal device 102) can obtain multiple audio and video mixed stream data segments from the audio and video device 103, and perform audio extraction on the multiple audio and video mixed stream data segments respectively to obtain the multiple audio coding segments.

The terminal device 101 (or the terminal device 102) can receive an editing instruction sent by a target object, where the editing instruction includes editing object information, and acquire, from the audio and video device 103, multiple audio and video mixed stream data segments corresponding to the editing object information; the multiple audio and video mixed stream data segments are recorded by the audio and video device 103 and saved in segments.

The terminal device 101 (or the terminal device 102) can receive a download instruction sent by the target object, download the multiple audio and video mixed stream data segments corresponding to the download instruction from the audio and video device 103, and save them locally.

The server 105 may be a single physical server or may be composed of multiple servers. A part of the server 105 may, for example, serve as the audio data processing task submission system of the present disclosure, for obtaining tasks that will execute audio data processing commands; another part of the server 105 may, for example, serve as the audio data processing system of the present disclosure, for obtaining multiple audio coding segments that are recorded by the audio and video device and saved in segments, splicing the multiple audio coding segments to obtain spliced data, and decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.

The terminal device 101/terminal device 102 and the audio and video device 103 can transmit data through wireless transmission, such as WiFi, Bluetooth, ZigBee, and so on. The terminal device 101/terminal device 102 and the server 105 can communicate with each other through 4G, 5G, WiFi, or the Internet.
Fig. 2 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure. The audio data processing method provided by the embodiments of the present disclosure can be executed by any electronic device with computing and processing capabilities, such as the terminal devices 101 and 102 and/or the server 105. As shown in Fig. 2, the audio data processing method 20 provided by the embodiment of the present disclosure may include:

Step S21: obtain multiple audio coding segments, where the multiple audio coding segments are recorded by the audio and video device and saved in segments.

Step S22: splice the multiple audio coding segments to obtain spliced data.

Step S23: decode the spliced data to obtain audio synthesis data of the multiple audio coding segments.

In the embodiments of the present disclosure, the multiple audio coding segments can be obtained by downloading them after communicating with the audio and video device. The process of communicating with the audio and video device and downloading can be triggered by a local designated command, by a clock cycle, or by a download instruction actively sent by the audio and video device, which is not specifically limited in the present disclosure. The audio and video device may be, for example but not limited to, a video recording device, an unmanned aerial vehicle (equipped with a camera), and the like. The encoding format of the audio coding segments may be one of the Adaptive Multi-Rate (AMR) format, the Advanced Audio Coding (AAC) format, the OPUS format, and the like, which is not specifically limited in the present disclosure.

In step S22, the splicing method for splicing the multiple audio coding segments may be end-to-end splicing.

In step S23, the decoding mode may be an audio decoding mode, where the audio decoding mode corresponding to the audio coding format can be selected according to the audio coding format.

In the audio data processing method of the embodiment of the present disclosure, by splicing the multiple audio coding segments to obtain spliced data and decoding the spliced data, the loss of decoded data that occurs when the multiple audio coding segments are decoded separately, because the decoding algorithm depends on the characteristics of the preceding and following frame data, is avoided, which in turn avoids the audio distortion caused by discontinuities of the obtained audio synthesis data at the splice points.
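To make the difference between the two approaches concrete, the following Python sketch contrasts per-segment decoding with splice-then-decode; decode_audio stands in for whatever decoder matches the segment format (AMR, AAC or OPUS) and is a placeholder assumption, not an API defined by the present disclosure.

```python
from typing import Callable, List

def decode_separately(segments: List[bytes],
                      decode_audio: Callable[[bytes], bytes]) -> bytes:
    # Each encoded segment is decoded on its own, so the decoder has no
    # history from the previous segment; frames near each segment boundary
    # may be reconstructed incorrectly and the joined PCM can be
    # discontinuous at the former boundaries.
    return b"".join(decode_audio(segment) for segment in segments)

def splice_then_decode(segments: List[bytes],
                       decode_audio: Callable[[bytes], bytes]) -> bytes:
    # Steps S22/S23: splice the encoded segments end to end first, then
    # decode once, so the decoder keeps its inter-frame state across the
    # former segment boundaries.
    spliced = b"".join(segments)
    return decode_audio(spliced)
```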
Fig. 3 is a sub-flowchart of step S22 in an exemplary embodiment of the present disclosure.

As shown in Fig. 3, in one embodiment, step S22 may include:

Step S221: sort the multiple audio coding segments according to arrangement information of the multiple audio coding segments.

Step S222: splice the multiple audio coding segments end to end according to the sorting result to obtain the spliced data.

In the embodiments of the present disclosure, the arrangement information of the multiple audio coding segments may be one of recording time information and number information of the multiple audio coding segments. The recording time information of an audio coding segment refers to the recording time of the start position, the end position, or a designated middle position of the audio coding segment during recording. The number information of an audio coding segment refers to the number added to the audio coding segment by the recording device according to the recording order during recording.

In an exemplary embodiment of the present disclosure, the arrangement information of the multiple audio coding segments is recording time information, and step S221 may include: sorting the multiple audio coding segments in chronological order according to the recording time information of the multiple audio coding segments.

In an exemplary embodiment of the present disclosure, the arrangement information of the multiple audio coding segments is number information, and step S221 may include: sorting the multiple audio coding segments in number order according to the number information of the multiple audio coding segments.
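A minimal sketch of steps S221 and S222 might look as follows; the EncodedSegment record and its fields are illustrative assumptions for carrying the arrangement information, not structures defined by the disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class EncodedSegment:
    data: bytes                          # encoded audio payload
    number: Optional[int] = None         # number assigned in recording order
    record_time: Optional[float] = None  # recording time of the start position

def sort_segments(segments: List[EncodedSegment]) -> List[EncodedSegment]:
    # Step S221: prefer number information when every segment carries it,
    # otherwise fall back to recording time information.
    if all(seg.number is not None for seg in segments):
        return sorted(segments, key=lambda seg: seg.number)
    return sorted(segments, key=lambda seg: seg.record_time)

def splice_segments(segments: List[EncodedSegment]) -> bytes:
    # Step S222: splice the sorted segments end to end to obtain spliced data.
    return b"".join(seg.data for seg in sort_segments(segments))
```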
图4是本公开的一个示例性实施例中步骤S21的子流程图。Fig. 4 is a sub-flow chart of step S21 in an exemplary embodiment of the present disclosure.
如图4所示,在一个实施例中,步骤S21可以包括:As shown in Fig. 4, in one embodiment, step S21 may include:
步骤S211,获取多个音视频混流数据段。Step S211: Obtain multiple audio and video mixed stream data segments.
步骤S212,分别对多个音视频混流数据段进行音频提取,获得多个音频编码段。Step S212: Perform audio extraction on multiple audio and video mixed stream data segments respectively to obtain multiple audio coded segments.
In the embodiments of the present disclosure, the multiple audio and video mixed stream data segments may be recorded by an audio and video device and saved in segments. For example, for audio and video data that occupies a large amount of storage space, the data may be segmented during recording to obtain multiple audio and video mixed stream data segments, so as to reduce damage to the video file caused by factors such as recording device failure. When processing the audio and video mixed stream data segments, it is usually necessary to separate the video track and the audio track in each segment so that the video track and the audio track can be decoded and spliced separately. The multiple audio and video mixed stream data segments may be downloaded after communicating with the audio and video device; the communication and download process may be triggered by a locally issued command, by a clock cycle, or by a download instruction actively sent by the audio and video device, which is not specifically limited in the present disclosure. The audio and video device may be, for example but not limited to, a video recording device, an unmanned aerial vehicle (equipped with a camera), or the like.
In step S212, after audio extraction is performed on each audio and video mixed stream data segment, the audio coding segment of that segment is obtained. The multiple audio coding segments may be denoted, for example, x1(t), x2(t), x3(t), and so on, where 0&lt;t&lt;T and T is the segmentation period.
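As one possible sketch of step S212 (assuming ffmpeg is available and the mixed stream segments are MP4 files carrying an AAC audio track; the file names are hypothetical and the choice of tools is not limited by the present disclosure), the audio track may be extracted without re-encoding:

```python
import subprocess

def extract_audio(segment_path: str, out_path: str) -> str:
    # -vn drops the video track, -c:a copy keeps the audio track without re-encoding.
    subprocess.run(
        ["ffmpeg", "-y", "-i", segment_path, "-vn", "-c:a", "copy", out_path],
        check=True,
    )
    return out_path

# e.g. audio coding segments x1(t), x2(t), x3(t) extracted from three MP4 segments
audio_segments = [extract_audio(f"seg_{i}.mp4", f"seg_{i}.aac") for i in range(1, 4)]
```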
Fig. 9 is a schematic diagram of an audio data processing method according to an exemplary embodiment of the present disclosure. As shown in Fig. 9, in steps S22 and S23 of this embodiment, the multiple audio coding segments x1(t), x2(t), x3(t), etc. are spliced to obtain spliced data x(t), and the spliced data x(t) is decoded to obtain the audio synthesis data y(t) of the multiple audio coding segments.
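A minimal sketch of this splice-then-decode flow is shown below. It assumes the spliced file is a self-describing stream (for example ADTS AAC, or an AMR file once the magic header of every segment except the first has been stripped before concatenation) and that ffmpeg is available for decoding; these are illustration assumptions, not requirements of the present disclosure.

```python
import subprocess

def splice(segment_paths, spliced_path):
    # Step S22: head-to-tail concatenation of the encoded segments x1(t), x2(t), ...
    with open(spliced_path, "wb") as out:
        for path in segment_paths:
            with open(path, "rb") as f:
                out.write(f.read())
    return spliced_path

def decode_to_pcm(spliced_path, pcm_path):
    # Step S23: decode the spliced data x(t) once to obtain the synthesis data y(t)
    # as 16 kHz, 16-bit mono PCM.
    subprocess.run(
        ["ffmpeg", "-y", "-i", spliced_path, "-f", "s16le", "-ar", "16000", "-ac", "1", pcm_path],
        check=True,
    )
    return pcm_path
```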
Fig. 5 is a sub-flowchart of step S22 in an exemplary embodiment of the present disclosure.
As shown in Fig. 5, in one embodiment, step S22 may include:
Step S51: determining the arrangement information of the multiple audio coding segments according to the arrangement information of the multiple audio and video mixed stream data segments.
Step S52: sorting the multiple audio coding segments according to their arrangement information.
Step S53: splicing the multiple audio coding segments head to tail according to the sorting result to obtain the spliced data.
In some embodiments of the present disclosure, the arrangement information of the multiple audio and video mixed stream data segments is one of recording time information and number information.
The recording time information of an audio and video mixed stream data segment refers to the recording time, during the recording process, of the start position, the end position, or a designated middle position of that segment. The number information of an audio and video mixed stream data segment refers to the number assigned to that segment by the recording device in recording order during the recording process.
In step S51, the arrangement information of each audio and video mixed stream data segment may be used as the arrangement information of the audio coding segment obtained by audio extraction from that segment. For example, if audio extraction is performed on an audio and video mixed stream data segment A to obtain an audio coding segment a, the arrangement information of segment A is used as the arrangement information of the audio coding segment a.
In step S52, when the arrangement information is recording time information, the multiple audio and video mixed stream data segments may be sorted in chronological order according to their recording time information; when the arrangement information is number information, the multiple audio and video mixed stream data segments may be sorted in number order according to their number information.
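The following is an illustrative sketch of steps S51 and S52 only (the dictionary keys are assumptions introduced for illustration): each extracted audio coding segment inherits the arrangement information of the mixed stream segment it came from, and the list is then sorted on that inherited field.

```python
def inherit_and_sort(mixed_segments, audio_paths, by="record_time"):
    # Step S51: each extracted audio coding segment inherits the arrangement
    # information of the mixed-stream segment it came from.
    paired = [
        {"audio": audio, "record_time": mixed["record_time"], "index": mixed["index"]}
        for mixed, audio in zip(mixed_segments, audio_paths)
    ]
    # Step S52: sort on the inherited recording time or number information.
    return sorted(paired, key=lambda p: p[by])
```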
Fig. 6 is a flowchart of an audio data processing method according to an exemplary embodiment of the present disclosure.
As shown in Fig. 6, the audio data processing method based on the foregoing embodiments may further include:
Step S61: performing video extraction on the multiple audio and video mixed stream data segments respectively to obtain multiple video coding segments.
Step S62: generating video synthesis data according to the multiple video coding segments.
Step S63: generating audio and video synthesis data according to the audio synthesis data and the video synthesis data.
In the embodiments of the present disclosure, in step S62, the multiple video coding segments may be spliced to generate the video synthesis data.
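A sketch of the video path and the final muxing step is given below, again assuming ffmpeg and MP4 segments for illustration. The sketch extracts each video track without re-encoding and combines a concatenated video file with the audio synthesis data; the concatenation of the video segments (step S62) is assumed to have been done beforehand and is not shown.

```python
import subprocess

def extract_video(segment_path, out_path):
    # Step S61: -an drops the audio track, -c:v copy keeps the video track unchanged.
    subprocess.run(["ffmpeg", "-y", "-i", segment_path, "-an", "-c:v", "copy", out_path], check=True)
    return out_path

def mux(video_path, audio_path, out_path):
    # Step S63: combine the video synthesis data with the audio synthesis data.
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-i", audio_path, "-c:v", "copy", "-c:a", "aac", out_path],
        check=True,
    )
    return out_path
```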
Fig. 7 is a sub-flowchart of step S211 in an exemplary embodiment of the present disclosure.
As shown in Fig. 7, in one embodiment, step S211 may include:
Step S2111: receiving an editing instruction sent by a target object, the editing instruction including editing object information.
Step S2112: obtaining, from the audio and video device, the multiple audio and video mixed stream data segments corresponding to the editing object information, the multiple audio and video mixed stream data segments being recorded by the audio and video device and saved in segments.
In the embodiments of the present disclosure, in step S2111, the target object may be, for example, an operator of the device that executes the audio data processing method of the embodiments of the present disclosure. For example, when the executing device is the terminal device 101 or 102, the target object may be the operating user of the terminal device 101 or 102; when that user performs a preset operation on the terminal device 101 or 102, the terminal device 101 or 102 may generate an editing instruction according to the preset operation. For another example, when the executing device is the server 105, and the server 105 is a background management server that provides support for the terminal device 101 or 102, the target object may be the operating user of the terminal device 101 or 102; when that user performs a preset operation on the terminal device 101 or 102, the terminal device 101 or 102 may generate an editing instruction according to the preset operation and send it to the server 105 via the network 104.
The editing object information is used to determine the multiple audio and video mixed stream data segments, and may be, for example, identification information or storage address information of the multiple audio and video mixed stream data segments. For example, if the identification information of the multiple audio and video mixed stream data segments z1(t), z2(t), z3(t), ... recorded and saved in segments by the audio and video device is z, the editing object information may include the identification information z. For another example, if the storage address information of the multiple audio and video mixed stream data segments z1(t), z2(t), z3(t), ... is C:\download, the editing object information may include the storage address information C:\download.
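As a sketch only, resolving the editing object information to concrete segment files might look as follows; the dictionary keys, file extension, and glob pattern are assumptions for illustration and do not limit how the information is actually encoded.

```python
import glob
import os

def resolve_segments(edit_object_info):
    # Editing object information may carry a storage address (e.g. C:\download)
    # or an identifier (e.g. "z" for z1, z2, z3, ...).
    if "storage_path" in edit_object_info:
        pattern = os.path.join(edit_object_info["storage_path"], "*.mp4")
    else:
        pattern = f"{edit_object_info['identifier']}*.mp4"
    return sorted(glob.glob(pattern))
```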
In step S2112, the audio and video device may be, for example but not limited to, an unmanned aerial vehicle (equipped with a camera), a video recorder, or the like. The device executing the audio data processing method of the embodiments of the present disclosure may communicate with the audio and video device and obtain, through a communication interface, the multiple audio and video mixed stream data segments corresponding to the editing object information.
In step S2112, when the audio and video device records and saves the data in segments to obtain the multiple audio and video mixed stream data segments, the recorded audio and video may be segmented according to a segmentation period.
In an exemplary embodiment of the present disclosure, the audio data processing method may further include: performing an editing operation on the audio and video synthesis data in response to the editing instruction to obtain edited audio and video synthesis data. The editing instruction is used to perform audio and video editing operations on the audio and video synthesis data, and the editing operation included in the editing instruction may be the audio data processing method of the embodiments of the present disclosure. For example, if the editing instruction is a splicing instruction, the terminal device obtains, from the audio and video device according to the splicing instruction input by the user, the audio and video mixed stream data segments corresponding to the editing object information input by the user; after obtaining the segments corresponding to the editing object information, it may execute the audio data processing method of the embodiments of the present disclosure on the multiple segments according to the editing instruction to obtain the spliced audio and video synthesis data. After the splicing processing of the present application is completed, the audio and video synthesis data may also be processed according to other editing operations input by the user, such as color grading, cutting, speed change, reverse playback, and copying. For example, after the multiple audio and video mixed stream data segments are spliced, operations such as color grading, cutting, speed change, reverse playback, and copying may be performed on the audio and video synthesis data according to other editing instructions to obtain the edited audio and video synthesis data, as illustrated in the sketch below.
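The sketch below only shows how such editing operations could be dispatched after splicing; the handlers are placeholders introduced for illustration and are not an actual editing implementation of the present disclosure.

```python
def apply_edits(av_data, operations):
    # Placeholder handlers for the editing operations mentioned above; a real
    # implementation would transform the audio and video synthesis data.
    handlers = {
        "grade":   lambda data, args: data,  # color grading
        "cut":     lambda data, args: data,  # cutting
        "speed":   lambda data, args: data,  # speed change
        "reverse": lambda data, args: data,  # reverse playback
        "copy":    lambda data, args: data,  # duplication
    }
    for name, args in operations:
        av_data = handlers[name](av_data, args)
    return av_data
```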
Fig. 8 is a sub-flowchart of step S211 in an exemplary embodiment of the present disclosure.
As shown in Fig. 8, in one embodiment, step S211 may include:
Step S81: receiving a download instruction sent by a target object.
Step S82: downloading, from the audio and video device, the multiple audio and video mixed stream data segments corresponding to the download instruction and saving them locally.
In the embodiments of the present disclosure, in step S81, the target object may be, for example, an operator of the device that executes the audio data processing method of the embodiments of the present disclosure. For example, when the executing device is the terminal device 101 or 102, the target object may be the operating user of the terminal device 101 or 102; when that user performs a preset operation on the terminal device 101 or 102, the terminal device 101 or 102 may generate a download instruction according to the preset operation. For another example, when the executing device is the server 105, and the server 105 is a background management server that provides support for the terminal device 101 or 102, the target object may be the operating user of the terminal device 101 or 102; when that user performs a preset operation on the terminal device 101 or 102, the terminal device 101 or 102 may generate a download instruction according to the preset operation and send it to the server 105 via the network 104. The multiple audio and video mixed stream data segments may be downloaded after communicating with the audio and video device, which is not specifically limited in the present disclosure. In the embodiments of the present disclosure, in step S81, the audio and video device includes, but is not limited to, an unmanned aerial vehicle (equipped with a camera), a video recorder, or the like.
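A sketch of steps S81–S82 is given below; the `device_client` object, its `list_segments` and `download` methods, and the local directory are hypothetical placeholders, since the actual transport depends on the audio and video device.

```python
import os

def handle_download(device_client, download_instruction, local_dir="segments"):
    # Steps S81–S82: fetch every mixed-stream segment named by the download
    # instruction from the audio and video device and save it locally.
    os.makedirs(local_dir, exist_ok=True)
    local_paths = []
    for name in device_client.list_segments(download_instruction):
        dst = os.path.join(local_dir, name)
        device_client.download(name, dst)
        local_paths.append(dst)
    return local_paths
```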
In an exemplary embodiment of the present disclosure, the format of the multiple audio coding segments is one of the AMR audio coding format, the AAC audio coding format, and the OPUS audio coding format.
In an exemplary embodiment of the present disclosure, an editing instruction sent by the target object may also be received, the editing instruction including editing object information. In response to the editing instruction, an editing operation is performed on the audio and video synthesis data corresponding to the editing object information to obtain edited audio and video synthesis data. The editing operation may be, for example but not limited to, splicing, color grading, cutting, speed change, reverse playback, copying, and the like.
Fig. 10 is an effect comparison diagram of an audio data processing method according to another exemplary embodiment of the present disclosure.
As shown in Fig. 10, in one embodiment, taking audio coding segments in the AMR audio coding format as an example, the original audio data is 16 kHz, 16-bit single-channel data, encoded with a step size of 20 ms in mode 8; that is, every 20 ms of original data is encoded once into a frame of 61 bytes. Fig. 10(a) is a time-frequency diagram of the original audio data in an exemplary embodiment of the present disclosure; the original audio data is single-frequency data with a duration of 1 s, corresponding to 50 frames. After the original audio data is encoded into the AMR audio coding format, it is recorded as amr0; the encoded data of the first 0.5 s and the encoded data of the last 0.5 s are split apart and recorded as audio coding segment amr1 and audio coding segment amr2, respectively. The audio synthesis data 101, obtained by decoding amr1 and amr2 independently and merging the decoded pulse-code modulation (PCM) data, is shown in Fig. 10(b). The audio synthesis data 102, obtained by splicing amr1 and amr2 according to the audio data processing method of the embodiments of the present disclosure and then decoding, is shown in Fig. 10(c). Comparing Fig. 10(b) and Fig. 10(c), the audio synthesis data obtained by splicing first and then decoding, shown in Fig. 10(c), is closer to the original audio data, whereas the audio synthesis data obtained by decoding first and then splicing, shown in Fig. 10(b), is severely distorted. This is because the AMR codec relies on historical information: encoding or decoding the current frame uses data from the previous frame, or even earlier. Since the audio is stored in segments and the device decodes each segment separately, the last frame of the preceding segment is not available when the first audio data of a later segment is decoded, so a decoding error occurs at the start of each later segment. With the embodiments of the present disclosure, the audio segments are spliced first and only then decoded, so the error introduced by segment-wise decoding does not occur and decoding is more reliable.
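The frame figures quoted above are consistent with AMR-WB mode 8 at 23.85 kbit/s in the common octet-aligned storage format; the following back-of-envelope check is added only as an illustration under that assumption.

```python
# 1 s of 16 kHz speech at a 20 ms step size, AMR-WB mode 8 (23.85 kbit/s),
# octet-aligned storage format (1 table-of-contents byte per frame).
frame_ms = 20
frames = int(1000 / frame_ms)                    # 50 frames per second
payload_bits = 23.85e3 * frame_ms / 1000         # 477 bits of speech payload per frame
frame_bytes = 1 + (int(payload_bits) + 7) // 8   # 1 + 60 = 61 bytes per stored frame
print(frames, frame_bytes)                       # 50 61
```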
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided. The electronic device may include, for example but not limited to, a smartphone, a tablet computer, a portable computer, a desktop computer, a wearable device, a virtual reality device, a smart home device, and the like.
Those skilled in the art can understand that various aspects of the present invention may be implemented as a system, a method, or a program product. Therefore, various aspects of the present invention may be embodied in the following forms: a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or an implementation combining hardware and software, which may be collectively referred to herein as a "circuit", "module", or "system".
Fig. 11 is a block diagram of an electronic device according to an exemplary embodiment of the present disclosure.
As shown in Fig. 11, the electronic device 1100 may include:
a processor 1110; and
a memory 1120 for storing executable instructions of the processor 1110;
wherein the processor 1110 is configured to execute an audio data processing method by executing the executable instructions, the audio data processing method including: acquiring multiple audio coding segments, the multiple audio coding segments being recorded by an audio and video device and saved in segments; splicing the multiple audio coding segments to obtain spliced data; and decoding the spliced data to obtain audio synthesis data of the audio coding segments.
In an exemplary embodiment of the present disclosure, splicing the multiple audio coding segments to obtain spliced data includes: sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments; and splicing the multiple audio coding segments head to tail according to the sorting result to obtain the spliced data.
In an exemplary embodiment of the present disclosure, the arrangement information of the multiple audio coding segments is recording time information, and sorting the multiple audio coding segments according to their arrangement information includes: sorting the multiple audio coding segments in chronological order according to their recording time information.
In an exemplary embodiment of the present disclosure, the arrangement information of the multiple audio coding segments is number information, and sorting the multiple audio coding segments according to their arrangement information includes: sorting the multiple audio coding segments in number order according to their number information.
In an exemplary embodiment of the present disclosure, acquiring multiple audio coding segments includes: obtaining multiple audio and video mixed stream data segments; and performing audio extraction on the multiple audio and video mixed stream data segments respectively to obtain the multiple audio coding segments.
In an exemplary embodiment of the present disclosure, splicing the multiple audio coding segments to obtain spliced data includes: determining the arrangement information of the multiple audio coding segments according to the arrangement information of the multiple audio and video mixed stream data segments; sorting the multiple audio coding segments according to their arrangement information; and splicing the multiple audio coding segments head to tail according to the sorting result to obtain the spliced data.
In an exemplary embodiment of the present disclosure, the arrangement information of the multiple audio and video mixed stream data segments is one of recording time information and number information.
In an exemplary embodiment of the present disclosure, the audio data processing method further includes: performing video extraction on the multiple audio and video mixed stream data segments respectively to obtain multiple video coding segments; generating video synthesis data according to the multiple video coding segments; and generating audio and video synthesis data according to the audio synthesis data and the video synthesis data.
In an exemplary embodiment of the present disclosure, obtaining multiple audio and video mixed stream data segments includes: receiving, through a user interface, an editing instruction sent by a target object, the editing instruction including editing object information; and communicating with the audio and video device to obtain, through a communication interface, the multiple audio and video mixed stream data segments corresponding to the editing object information, the multiple audio and video mixed stream data segments being recorded by the audio and video device and saved in segments.
The user interface may be, for example but not limited to, a touch screen, a physical button, a microphone, or the like.
In an exemplary embodiment of the present disclosure, the audio data processing method further includes: performing an editing operation on the audio and video synthesis data in response to the editing instruction to obtain edited audio and video synthesis data.
In an exemplary embodiment of the present disclosure, obtaining multiple audio and video mixed stream data segments includes: downloading the multiple audio and video mixed stream data segments from the audio and video device in advance and saving them locally.
In an exemplary embodiment of the present disclosure, the format of the multiple audio coding segments is one of the AMR audio coding format, the AAC audio coding format, and the OPUS audio coding format.
In the electronic device of the embodiments of the present disclosure, the processor splices the multiple audio coding segments to obtain spliced data and decodes the spliced data. This avoids the loss of decoded data that occurs when the multiple audio coding segments are decoded separately, which arises because the decoding algorithm depends on data from preceding and following frames, and thereby avoids the audio distortion caused by discontinuities at the splice points of the resulting audio synthesis data.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, this division is not mandatory. In fact, according to the embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit; conversely, the features and functions of one module or unit described above may be further divided into multiple modules or units.
Through the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented by software, or by software combined with the necessary hardware. Therefore, the technical solutions according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to cause a computing device (which may be a personal computer, a server, a terminal device, a network device, etc.) to execute the methods according to the embodiments of the present disclosure.
In the exemplary embodiments of the present disclosure, a computer-readable storage medium is also provided, on which a program product capable of implementing the above methods of this specification is stored. In some possible implementations, various aspects of the present invention may also be implemented in the form of a program product including program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps according to the various exemplary embodiments of the present invention described in the "Exemplary Method" section of this specification.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present disclosure.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, devices, and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed electronic device, computer-readable storage medium, and method may be implemented in other ways. For example, the device embodiments described above are only illustrative; the division of the units is only a logical function division, and there may be other divisions in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may also be electrical, mechanical, or other forms of connection.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present disclosure.
In addition, the functional units in the various embodiments of the present disclosure may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited thereto. Any person skilled in the art can easily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present disclosure, and these modifications or replacements shall be covered within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (25)

  1. An audio data processing method, characterized in that it comprises:
    acquiring multiple audio coding segments, the multiple audio coding segments being recorded by audio and video equipment and saved in segments;
    splicing the multiple audio coding segments to obtain spliced data;
    decoding the spliced data to obtain audio synthesis data of the multiple audio coding segments.
  2. The method according to claim 1, characterized in that splicing the multiple audio coding segments to obtain spliced data comprises:
    sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments;
    splicing the multiple audio coding segments end to end according to the sorting result to obtain spliced data.
  3. The method according to claim 2, characterized in that the arrangement information of the multiple audio coding segments is recording time information, and sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments comprises:
    sorting the multiple audio coding segments in a time sequence according to the recording time information of the multiple audio coding segments.
  4. The method according to claim 2, characterized in that the arrangement information of the multiple audio coding segments is number information, and sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments comprises:
    sorting the multiple audio coding segments in a number order according to the number information of the multiple audio coding segments.
  5. The method according to claim 1, characterized in that acquiring multiple audio coding segments comprises:
    obtaining multiple audio and video mixed stream data segments;
    performing audio extraction on the multiple audio and video mixed stream data segments respectively to obtain the multiple audio coding segments.
  6. The method according to claim 5, characterized in that splicing the multiple audio coding segments to obtain spliced data comprises:
    determining the arrangement information of the multiple audio coding segments according to the arrangement information of the multiple audio and video mixed stream data segments;
    sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments;
    splicing the multiple audio coding segments end to end according to the sorting result to obtain spliced data.
  7. The method according to claim 6, characterized in that the arrangement information of the multiple audio and video mixed stream data segments is one of recording time information and number information.
  8. The method according to claim 5, characterized in that it further comprises:
    performing video extraction on the multiple audio and video mixed stream data segments respectively to obtain multiple video coding segments;
    generating video synthesis data according to the multiple video coding segments;
    generating audio and video synthesis data according to the audio synthesis data and the video synthesis data.
  9. The method according to claim 8, characterized in that obtaining multiple audio and video mixed stream data segments comprises:
    receiving an editing instruction sent by a target object, the editing instruction including editing object information;
    obtaining, from the audio and video equipment, the multiple audio and video mixed stream data segments corresponding to the editing object information, the multiple audio and video mixed stream data segments being recorded by the audio and video equipment and stored in segments.
  10. The method according to claim 9, characterized in that it further comprises:
    performing an editing operation on the audio and video synthesis data in response to the editing instruction to obtain the edited audio and video synthesis data.
  11. The method according to claim 5, characterized in that obtaining multiple audio and video mixed stream data segments comprises:
    receiving a download instruction sent by a target object;
    downloading the multiple audio and video mixed stream data segments corresponding to the download instruction from the audio and video equipment and saving them locally.
  12. The method according to any one of claims 1-11, characterized in that the format of the multiple audio coding segments is one of an AMR audio coding format, an AAC audio coding format, and an OPUS audio coding format.
  13. An electronic device, characterized in that it comprises:
    a processor; and
    a memory for storing executable instructions of the processor;
    wherein the processor is configured to execute an audio data processing method by executing the executable instructions, the audio data processing method comprising:
    acquiring multiple audio coding segments, the multiple audio coding segments being recorded by audio and video equipment and saved in segments;
    splicing the multiple audio coding segments to obtain spliced data;
    decoding the spliced data to obtain audio synthesis data of the audio coding segments.
  14. The electronic device according to claim 13, characterized in that splicing the multiple audio coding segments to obtain spliced data comprises:
    sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments;
    splicing the multiple audio coding segments end to end according to the sorting result to obtain spliced data.
  15. The electronic device according to claim 14, characterized in that the arrangement information of the multiple audio coding segments is recording time information, and sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments comprises:
    sorting the multiple audio coding segments in a time sequence according to the recording time information of the multiple audio coding segments.
  16. The electronic device according to claim 14, characterized in that the arrangement information of the multiple audio coding segments is number information, and sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments comprises:
    sorting the multiple audio coding segments in a number order according to the number information of the multiple audio coding segments.
  17. The electronic device according to claim 13, characterized in that acquiring multiple audio coding segments comprises:
    obtaining multiple audio and video mixed stream data segments;
    performing audio extraction on the multiple audio and video mixed stream data segments respectively to obtain the multiple audio coding segments.
  18. The electronic device according to claim 17, characterized in that splicing the multiple audio coding segments to obtain spliced data comprises:
    determining the arrangement information of the multiple audio coding segments according to the arrangement information of the multiple audio and video mixed stream data segments;
    sorting the multiple audio coding segments according to the arrangement information of the multiple audio coding segments;
    splicing the multiple audio coding segments end to end according to the sorting result to obtain spliced data.
  19. The electronic device according to claim 18, characterized in that the arrangement information of the multiple audio and video mixed stream data segments is one of recording time information and number information.
  20. The electronic device according to claim 17, characterized in that the audio data processing method further comprises:
    performing video extraction on the multiple audio and video mixed stream data segments respectively to obtain multiple video coding segments;
    generating video synthesis data according to the multiple video coding segments;
    generating audio and video synthesis data according to the audio synthesis data and the video synthesis data.
  21. The electronic device according to claim 20, characterized in that obtaining multiple audio and video mixed stream data segments comprises:
    receiving, through a user interface, an editing instruction sent by a target object, the editing instruction including editing object information;
    communicating with the audio and video equipment, and obtaining, from the audio and video equipment through a communication interface, the multiple audio and video mixed stream data segments corresponding to the editing object information, the multiple audio and video mixed stream data segments being recorded by the audio and video equipment and saved in segments.
  22. The electronic device according to claim 21, characterized in that the audio data processing method further comprises:
    performing an editing operation on the audio and video synthesis data in response to the editing instruction to obtain the edited audio and video synthesis data.
  23. The electronic device according to claim 17, characterized in that obtaining multiple audio and video mixed stream data segments comprises:
    receiving, through a user interface, a download instruction sent by a target object;
    communicating with the audio and video equipment, and downloading, from the audio and video equipment through a communication interface, the multiple audio and video mixed stream data segments corresponding to the download instruction and saving them locally.
  24. The electronic device according to any one of claims 13-23, characterized in that the format of the multiple audio coding segments is one of an AMR audio coding format, an AAC audio coding format, and an OPUS audio coding format.
  25. A computer-readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, the audio data processing method according to any one of claims 1-12 is implemented.
PCT/CN2020/079342 2020-03-13 2020-03-13 Audio data processing method, electronic device and computer-readable storage medium WO2021179321A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202080004659.8A CN112771880A (en) 2020-03-13 2020-03-13 Audio data processing method, electronic device and computer readable storage medium
PCT/CN2020/079342 WO2021179321A1 (en) 2020-03-13 2020-03-13 Audio data processing method, electronic device and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/079342 WO2021179321A1 (en) 2020-03-13 2020-03-13 Audio data processing method, electronic device and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2021179321A1 true WO2021179321A1 (en) 2021-09-16

Family

ID=75699506

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/079342 WO2021179321A1 (en) 2020-03-13 2020-03-13 Audio data processing method, electronic device and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN112771880A (en)
WO (1) WO2021179321A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1992944A (en) * 2005-12-27 2007-07-04 三星电子株式会社 Device and method for transmitting and receiving audio data in wireless terminal
CN101167127A (en) * 2005-04-28 2008-04-23 杜比实验室特许公司 Method and system for operating audio encoders in parallel
US20120232908A1 (en) * 2011-03-07 2012-09-13 Terriberry Timothy B Methods and systems for avoiding partial collapse in multi-block audio coding
CN102768844A (en) * 2012-03-31 2012-11-07 新奥特(北京)视频技术有限公司 Method for splicing audio code stream
CN105187896A (en) * 2015-09-09 2015-12-23 北京暴风科技股份有限公司 Multi-segment media file playing method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101861941B1 (en) * 2014-02-10 2018-07-02 돌비 인터네셔널 에이비 Embedding encoded audio into transport stream for perfect splicing
CN106534971B (en) * 2016-12-05 2019-04-02 腾讯科技(深圳)有限公司 A kind of audio-video clipping method and device
CN107800988A (en) * 2017-11-08 2018-03-13 青岛海信移动通信技术股份有限公司 A kind of method and device of video record, electronic equipment
CN108718395B (en) * 2018-06-08 2021-05-11 深圳市云智易联科技有限公司 Segmented video recording method and automobile data recorder


Also Published As

Publication number Publication date
CN112771880A (en) 2021-05-07


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20923925

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20923925

Country of ref document: EP

Kind code of ref document: A1