CN114025196A - Encoding method, decoding method, encoding/decoding device, and medium - Google Patents

Encoding method, decoding method, encoding/decoding device, and medium

Info

Publication number
CN114025196A
CN114025196A (application number CN202111179290.7A)
Authority
CN
China
Prior art keywords
data
frame
video
encoding
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111179290.7A
Other languages
Chinese (zh)
Inventor
Li Fei (李斐)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and MIGU Culture Technology Co Ltd
Priority to CN202111179290.7A
Publication of CN114025196A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424 Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H04N21/2343 Processing of video elementary streams involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H04N21/234309 Processing of video elementary streams involving reformatting operations of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4 or from Quicktime to Realvideo
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016 Processing of video elementary streams involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H04N21/4402 Processing of video elementary streams involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218 Processing of video elementary streams involving reformatting operations of video signals by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses an encoding method, a decoding method, an encoding and decoding device, and a medium, wherein the encoding method comprises the following steps: acquiring a video frame and audio pulse code modulation data corresponding to the video frame; merging the video frame and the audio pulse code modulation data to obtain a data frame; and encoding the data frame. The invention aims to solve the problem that existing audio and video encoding schemes make subsequent video file transmission and decoding complex.

Description

Encoding method, decoding method, encoding/decoding device, and medium
Technical Field
The present invention relates to the field of encoding and decoding technologies, and in particular, to an encoding method, a decoding method, an encoding and decoding apparatus, and a medium.
Background
To meet storage requirements, different video file formats (also called video packaging formats) have been defined so that video data and audio data can be packaged into one file according to the video file format, which facilitates playing the video and audio simultaneously. Such a file is called a video file.
A video file format is the specification for packing video data and audio data into one file. Common video file formats are usually distinguished by different suffixes, such as avi, rmvb, mp4, flv, mkv, and the like.
The audio data and the video data encapsulated in a video file are both compression-encoded. Currently, audio data and video data are usually compression-encoded with different technologies: for example, the video data is compressed with an H.265 or AVC (Advanced Video Coding) compression encoding technology, the audio data is compressed with AAC (Advanced Audio Coding), and the compression-encoded audio data and video data are then packaged according to a certain video file format to generate a video file for storage.
This encoding approach complicates both subsequent video file transmission and video file decoding, as follows:
Referring to fig. 1, for the transmission of a video file, since the audio data and the video data are encoded and decoded separately and belong to different data streams, the audio data and video data merged in the video file must be split into separate streams during transmission and then merged and re-encapsulated once transmission is complete. This makes the service complexity of video file transmission high.
For the decoding of a video file, referring to fig. 2, since the audio data and the video data are different streams, the encoded audio data and video data need to be decoded separately during video decoding; meanwhile, time synchronization processing needs to be carried out on the audio data and the video data when the playback and seek functions are executed, to keep the audio data and the video data unified in the time domain. This makes the service complexity of video file decoding high.
Disclosure of Invention
The invention mainly aims to provide an encoding method, a decoding method, an encoding and decoding device, and a medium, so as to solve the problem that existing audio and video encoding schemes make subsequent video file transmission and decoding complex.
To achieve the above object, the present invention provides an encoding method, including:
acquiring a video frame and audio pulse code modulation data corresponding to the video frame;
merging the video frame and the audio pulse code modulation data to obtain a data frame;
and encoding the data frame.
In an embodiment, the step of merging the video frame and the audio pulse code modulation data to obtain a data frame includes:
acquiring video parameters reflecting the size and format of the video frame;
acquiring audio parameters reflecting the size and format of the audio pulse code modulation data;
generating an information header according to the video parameters and the audio parameters, wherein the information header is used for distinguishing video frames in the data frames and audio pulse code modulation data corresponding to the video frames;
and splicing the video frame, the audio pulse code modulation data and the information head according to a preset format to obtain the data frame.
In an embodiment, after the step of encoding the data frame, the method further includes:
acquiring play time stamps of a plurality of coded data frames;
splicing the plurality of coded data frames according to the playing time stamp to obtain a video file;
setting a preset number of data frames in the video file as key frames, wherein the number of the key frames is less than that of the data frames.
In one embodiment, the step of encoding the data frame comprises:
acquiring a preset coding algorithm, wherein the preset coding algorithm at least comprises a video coding algorithm and an audio coding algorithm;
and encoding the data frame by adopting the preset encoding algorithm.
In order to achieve the above object, the present invention further provides a decoding method, including:
acquiring a coded data frame;
decoding the encoded data frame;
and determining a video frame contained in the data frame and audio pulse code modulation data corresponding to the video frame according to the decoded data frame.
In an embodiment, the step of determining, according to the decoded data frame, a video frame included in the data frame and audio pulse code modulation data corresponding to the video frame includes:
acquiring an information header of the decoded data frame;
and determining the video frame contained in the data frame and the audio pulse code modulation data corresponding to the video frame according to the information header.
In one embodiment, the step of acquiring the encoded data frame includes:
acquiring a video file;
acquiring a playing time stamp of each data frame in the video file;
and sequentially using the data frames as the coded data frames according to the playing time stamps.
In addition, to achieve the above object, the present invention further provides a coding and decoding apparatus, which includes a first obtaining module, a merging module, and a coding module, wherein:
the first acquisition module is used for acquiring a video frame and audio pulse code modulation data corresponding to the video frame;
the merging module is used for merging the video frame and the audio pulse code modulation data to obtain a data frame;
the encoding module is used for encoding the data frame; alternatively,
the encoding and decoding device comprises a second acquisition module, a decoding module and a determination module, wherein:
the second obtaining module is used for obtaining the coded data frame;
the decoding module is used for decoding the encoded data frame;
the determining module is used for determining a video frame contained in the data frame and audio pulse coding modulation data corresponding to the video frame according to the decoded data frame.
In addition, to achieve the above object, the present invention further provides a codec device, which includes a memory, a processor, and a codec program stored in the memory and operable on the processor, and when executed by the processor, the codec program implements the steps of the encoding method or the decoding method according to any one of the above items.
In addition, to achieve the above object, the present invention further provides a computer readable storage medium having a codec program stored thereon, where the codec program implements the encoding method or the decoding method according to any one of the above items when being executed by a processor.
The invention provides an encoding method, a decoding method, an encoding and decoding device, and a medium, in which a video frame and audio pulse code modulation data corresponding to the video frame are acquired, the video frame and the audio pulse code modulation data are merged to obtain a data frame, and the data frame is encoded. With this scheme, the audio data and the video frame are combined into one data frame before encoding, so they no longer need to be encoded separately; the subsequent video file is transmitted in a single data stream without being split; the audio data and the video frame of the combined data frame are already synchronized in time, so no time synchronization processing is needed after the video file is decoded. This solves the problem that existing audio and video encoding schemes make subsequent video file transmission and decoding complex.
Drawings
FIG. 1 is a first diagram of an encoding process to which an embodiment of the invention relates;
FIG. 2 is a first diagram of a decoding process according to an embodiment of the present invention;
fig. 3 is a schematic hardware architecture of a coding/decoding device according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a first embodiment of the encoding method of the present invention;
FIG. 5 is a flowchart illustrating a second embodiment of the encoding method of the present invention;
FIG. 6 is a flowchart illustrating a first embodiment of the decoding method of the present invention;
FIG. 7 is a second diagram of an encoding process according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a data frame according to an embodiment of the present invention;
FIG. 9 is a second diagram of a decoding process according to an embodiment of the present invention;
fig. 10 is a schematic block diagram of a codec device according to an embodiment of the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As an implementation solution, please refer to fig. 3, which is a schematic diagram of a hardware architecture of a codec device according to an embodiment of the present invention. As shown in fig. 3, the codec device may include a processor 101 (for example, a CPU), a memory 102, and a communication bus 103, where the communication bus 103 is used to enable connection and communication between these components.
The memory 102 may be a high-speed RAM memory or a non-volatile memory (e.g., a disk memory). As shown in fig. 3, the memory 102, as a computer-readable storage medium, may store a codec program, and the processor 101 may be configured to call the codec program stored in the memory 102 and perform the following operations:
acquiring a video frame and audio pulse code modulation data corresponding to the video frame;
merging the video frame and the audio pulse code modulation data to obtain a data frame;
and encoding the data frame.
In one embodiment, the processor 101 may be configured to call the codec program stored in the memory 102 and perform the following operations:
acquiring video parameters reflecting the size and format of the video frame;
acquiring audio parameters reflecting the size and format of the audio pulse code modulation data;
generating an information header according to the video parameters and the audio parameters, wherein the information header is used for distinguishing video frames in the data frames and audio pulse code modulation data corresponding to the video frames;
and splicing the video frame, the audio pulse code modulation data and the information head according to a preset format to obtain the data frame.
In one embodiment, the processor 101 may be configured to call the codec program stored in the memory 102 and perform the following operations:
acquiring play time stamps of a plurality of coded data frames;
splicing the plurality of coded data frames according to the playing time stamp to obtain a video file;
setting a preset number of data frames in the video file as key frames, wherein the number of the key frames is less than that of the data frames.
In one embodiment, the processor 101 may be configured to call the codec program stored in the memory 102 and perform the following operations:
acquiring a preset coding algorithm, wherein the preset coding algorithm at least comprises a video coding algorithm and an audio coding algorithm;
and encoding the data frame by adopting the preset encoding algorithm.
In one embodiment, the processor 101 may be configured to call the codec program stored in the memory 102 and perform the following operations:
acquiring a coded data frame;
decoding the encoded data frame;
and determining a video frame contained in the data frame and audio pulse code modulation data corresponding to the video frame according to the decoded data frame.
In one embodiment, the processor 101 may be configured to call the codec program stored in the memory 102 and perform the following operations:
acquiring an information header of the decoded data frame;
and determining the video frame contained in the data frame and the audio pulse code modulation data corresponding to the video frame according to the information header.
In one embodiment, the processor 101 may be configured to call the codec program stored in the memory 102 and perform the following operations:
acquiring a video file;
acquiring a playing time stamp of each data frame in the video file;
and sequentially using the data frames as the coded data frames according to the playing time stamps.
Based on the hardware architecture of the encoding and decoding device, embodiments of the encoding method and the decoding method of the present invention are provided.
Referring to fig. 4, fig. 4 is a flowchart illustrating a first embodiment of the encoding method of the present invention. The encoding method includes:
step S10, acquiring video frames and audio pulse code modulation data corresponding to the video frames;
In this embodiment, the execution subject of the encoding method is an encoding and decoding device, where the encoding and decoding device refers to a device capable of encoding and decoding audio and video data. Optionally, the encoding and decoding device may be a server or a terminal device; of course, in other embodiments, the encoding and decoding device may also be another device capable of encoding and decoding audio and video data, which is not limited in this embodiment.
The encoding and decoding device acquires a video frame and the audio pulse code modulation data corresponding to the video frame. The video frame refers to a single frame of video data acquired by a data acquisition end; optionally, the video data acquired by the data acquisition end may be RGB data or YUV data, so the video frame may be, for example, RGB or YUV data with a playing duration of 25 milliseconds. It should be noted that the playing duration of the video frame may be determined according to the sampling rate of the video acquired by the data acquisition end, which is not limited in this embodiment. The audio pulse code modulation data refers to audio data acquired by the data acquisition end; optionally, the audio data may be PCM data, for example PCM data with a playing duration of 25 milliseconds. The playing duration of the audio pulse code modulation data is the same as that of the video frame, that is, the video frame and the audio pulse code modulation data are synchronized in time and have a one-to-one correspondence.
Optionally, the data acquisition end acquires video data and the audio data corresponding to the video data, and determines, according to the sampling information of the video data and the audio data, the audio pulse code modulation data corresponding to each video frame in the video data. After the data acquisition end acquires the audio data and the video data, it groups and combines them according to their sampling information, so that each video frame and its corresponding audio pulse code modulation data cover the same time range. Optionally, the sampling information may include the sampling rate, the number of channels, the sampling format, and the like of the video data and the audio data. For example, if the data acquisition end samples high-definition audio data at 48000 Hz with two channels and a 16-bit sampling format, and samples high-definition video data at 40 FPS, then according to the audio sampling information the size of the audio data corresponding to one second of video is sampling rate × number of channels × sampling format / 8 = 192 KB, and according to the video sampling rate the playing duration of one video frame is 25 milliseconds, so the audio corresponding to one video frame is 4.8 KB. The video frame and the audio pulse code modulation data corresponding to the video frame can be determined in this way.
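For illustration only, the following Python sketch reproduces the byte-count calculation in the example above; the variable names are hypothetical, and KB is taken as 1000 bytes to match the 192 KB and 4.8 KB figures.

    # Hypothetical parameters taken from the example above (not fixed by the method itself).
    sample_rate = 48000        # audio sampling rate in Hz
    channels = 2               # dual channel
    bits_per_sample = 16       # sampling format
    video_fps = 40             # video sampling rate in frames per second

    audio_bytes_per_second = sample_rate * channels * bits_per_sample // 8   # 192000 bytes = 192 KB
    video_frame_duration_ms = 1000 / video_fps                               # 25 milliseconds per video frame
    audio_bytes_per_video_frame = audio_bytes_per_second // video_fps        # 4800 bytes = 4.8 KB

    print(audio_bytes_per_second, video_frame_duration_ms, audio_bytes_per_video_frame)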
After the data acquisition end acquires the video frame and the audio pulse code modulation data corresponding to the video frame, the data acquisition end sends the video frame and the audio pulse code modulation data corresponding to the video frame to the coding and decoding device, and the coding and decoding device receives the video frame sent by the data acquisition end and the audio pulse code modulation data corresponding to the video frame.
Step S20, merging the video frame and the audio pulse code modulation data to obtain a data frame;
step S30, encoding the data frame.
In this embodiment, refer to fig. 7, which is a second schematic diagram of an encoding process according to an embodiment of the present invention. As shown in fig. 7, after acquiring a video frame and the audio pulse code modulation data corresponding to the video frame, the encoding and decoding apparatus merges them to obtain a data frame, and then encodes the merged data frame with an encoder to obtain an encoded data frame. Merging refers to combining the video frame and the audio pulse code modulation data corresponding to the video frame into a single whole.
Optionally, after the encoding and decoding device combines the obtained video frame and the corresponding audio pulse code modulation data into a data frame, it obtains a preset encoding algorithm, where the preset encoding algorithm at least comprises a video encoding algorithm and an audio encoding algorithm, and encodes the data frame with the preset encoding algorithm. Specifically, after acquiring the data frame, the encoding and decoding device encodes it; the device may encode the data frame with the preset encoding algorithm through an encoder to obtain the encoded data frame. It should be noted that the encoder may use different encoding algorithms for the audio pulse code modulation data and the video frame in the data frame respectively, for example an MP4 or AVC (Advanced Video Coding) encoding algorithm for the video frame and an AAC (Advanced Audio Coding) encoding algorithm for the audio pulse code modulation data, or it may use the same encoding algorithm for both, to obtain the encoded data frame.
Optionally, after obtaining the encoded data frames, the encoding and decoding device may splice the plurality of encoded data frames in sequence to obtain the video file.
Optionally, the encoding and decoding device may obtain the playing time stamps of the multiple encoded data frames, and sequentially splice the multiple encoded data frames according to the playing time stamps to obtain the video file. The playing time stamp refers to the playing time of the data frame, and the encoding and decoding device can sequentially splice the obtained multiple encoded data frames according to the playing time stamp to obtain the video file.
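As a minimal sketch only, assuming each encoded data frame is represented as a (play_timestamp, payload) pair and that the video file is a simple concatenation of frames in timestamp order (the actual container layout is not specified here), the splicing step could look as follows in Python:

    def splice_video_file(encoded_frames):
        # encoded_frames: iterable of (play_timestamp_ms, payload_bytes) tuples
        ordered = sorted(encoded_frames, key=lambda item: item[0])
        return b"".join(payload for _, payload in ordered)

    # Usage example with three hypothetical encoded data frames
    video_file = splice_video_file([(50, b"frame2"), (0, b"frame0"), (25, b"frame1")])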
Optionally, after the encoding and decoding device acquires the video file, a preset number of data frames in the video file may be set as key frames, where the number of key frames in the video file is less than the number of data frames in the video file. After the encoding and decoding device sets the key frame, the video file can be stored in a storage module of the encoding and decoding device, and the video file can also be sent to other equipment for storage.
Optionally, the encoding and decoding device may directly decode and play the encoded data frame after acquiring the encoded data frame, or may send the encoded data frame to the playing device for the playing device to decode and play.
In the technical solution provided in this embodiment, a video frame and the audio pulse code modulation data corresponding to the video frame are acquired, and the video frame and the audio pulse code modulation data are merged to obtain a data frame, which is then encoded. With this scheme, the audio data and the video frame are combined into one data frame before encoding, so they no longer need to be encoded separately; the subsequent video file is transmitted in a single data stream without being split; the audio data and the video frame of the combined data frame are already synchronized in time, so no time synchronization processing is needed after the video file is decoded. This solves the problem that existing audio and video encoding schemes make subsequent video file transmission and decoding complex.
Referring to fig. 5, fig. 5 is a flowchart illustrating a second embodiment of the encoding method of the present invention. Based on the first embodiment, step S20 includes:
step S21, obtaining video parameters reflecting the size and format of the video frame;
step S22, obtaining audio parameters reflecting the size and format of the audio pulse code modulation data;
in this embodiment, after acquiring a video frame and audio pulse code modulation data corresponding to the video frame, the codec device acquires a video parameter reflecting the size and format of the video frame and an audio parameter reflecting the size and format of the audio pulse code modulation data, optionally, the audio parameter may include, but is not limited to, information such as the number of bytes of the audio pulse code modulation data, and the video parameter may include, but is not limited to, information such as the number of bytes of the video frame. The one-to-one correspondence of audio pulse code modulation data and video frames can be realized through audio parameters and video parameters.
Step S23, generating an information header according to the video parameter and the audio parameter, where the information header is used to distinguish a video frame in the data frame and audio pulse code modulation data corresponding to the video frame;
and step S24, splicing the video frame, the audio pulse code modulation data and the information header according to a preset format to obtain the data frame.
In this embodiment, after the encoding and decoding device obtains the audio parameter and the video parameter, an information header is generated according to the audio parameter and the video parameter, and the video frame, the audio frame, and the information header are spliced according to a preset format to obtain a data frame, where the information header is used to distinguish a video frame in the data frame and audio pulse code modulation data corresponding to the video frame.
Specifically, referring to fig. 8, fig. 8 is a schematic diagram of a data frame according to an embodiment of the present invention. As shown in fig. 8, the encoding and decoding device may splice the acquired information header, video frame, and audio pulse code modulation data according to the format "header-video frame-audio pulse code modulation data" to obtain the data frame. In other embodiments, the preset format may be set according to actual needs; for example, the preset format may also be "video frame-audio pulse code modulation data-header", which is not limited in this embodiment.
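A minimal Python sketch of the "header-video frame-audio pulse code modulation data" layout follows. The concrete header fields are not fixed by this description, so the header is assumed here, purely for illustration, to carry only the video and audio byte counts as two 32-bit values, which is enough for a decoder to separate the two parts again.

    import struct

    HEADER_FORMAT = ">II"   # assumed header: video byte count, audio byte count (big-endian)

    def build_data_frame(video_frame: bytes, audio_pcm: bytes) -> bytes:
        # Splice header, video frame and audio PCM data in the preset order.
        header = struct.pack(HEADER_FORMAT, len(video_frame), len(audio_pcm))
        return header + video_frame + audio_pcm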
In the technical solution provided in this embodiment, the video parameter of the video frame and the audio parameter of the audio pulse code modulation data are obtained, an information header is generated according to the video parameter and the audio parameter, and the video frame, the audio pulse code modulation data, and the information header are spliced according to a preset format to obtain the data frame. Because the video frame and the audio pulse code modulation data are combined into a data frame before encoding, their time synchronization is guaranteed, the data frame can be encoded directly without encoding the video frame and the audio pulse code modulation data separately, and the service complexity of audio and video data processing is reduced.
Referring to fig. 6, fig. 6 is a flowchart illustrating a first embodiment of the decoding method of the present invention. The decoding method includes:
step S40, acquiring the encoded data frame;
in this embodiment, the codec device acquires the encoded data frame to decode the encoded data frame.
Optionally, the encoding and decoding apparatus may obtain the video file, obtain the playing time stamps of the data frames in the video file, and sequentially use the data frames as the encoded data frames according to the playing time stamps.
Optionally, the encoding and decoding device may obtain the video file by streaming or by reading it, then obtain the playing time stamp of each data frame in the video file, and sequentially use the data frames as the encoded data frames in the order of their playing time stamps.
Alternatively, the codec device may acquire a target key frame in the video file. Specifically, when the encoding and decoding device receives a seek instruction, it determines the playing time stamp of the data frame to be sought according to the seek instruction, obtains the playing time stamps of the key frames in the video file, and determines the key frame whose playing time stamp is closest to that of the data frame to be sought as the target key frame. After acquiring the target key frame, the encoding and decoding device obtains the playing time stamps of all data frames in the video file, determines the data frames whose playing time stamps follow that of the target key frame as the data frames to be decoded, and sequentially uses the data frames to be decoded as the encoded data frames in the order of their playing time stamps. In this embodiment, the target key frame in the video file is acquired, the data frames to be decoded are determined according to the playing time stamp of the target key frame, and the data frames to be decoded are used in turn as the encoded data frames according to their playing time stamps. With this scheme, once the target key frame is found during a seek, it already contains both the audio pulse code modulation data and the video frame, so audio and video remain synchronized; decoding and playback can start directly from the key frame that was found, without separate audio and video retrieval and synchronization logic, which reduces the service complexity of the video seek function.
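A minimal sketch of this seek logic follows, assuming each data frame is described by a hypothetical dictionary with a "pts" field (play timestamp in milliseconds) and an "is_key" flag, and assuming the target key frame itself is included among the frames to decode.

    def frames_to_decode(frames, seek_pts):
        # Pick the key frame whose play timestamp is closest to the requested position.
        key_frames = [f for f in frames if f["is_key"]]
        target_key = min(key_frames, key=lambda f: abs(f["pts"] - seek_pts))
        # Decode every data frame from the target key frame onwards, in play-timestamp order.
        to_decode = [f for f in frames if f["pts"] >= target_key["pts"]]
        return sorted(to_decode, key=lambda f: f["pts"])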
Step S50, decoding the encoded data frame;
step S60, determining, according to the decoded data frame, a video frame included in the data frame and audio pulse code modulation data corresponding to the video frame.
In this embodiment, refer to fig. 9, which is a second schematic diagram of a decoding process according to an embodiment of the present invention. As shown in fig. 9, after acquiring the encoded data frame, the codec device decodes it with a decoder to obtain the video frame and the audio pulse code modulation data.
Optionally, the codec device may decode the data frame with a decoding algorithm through a decoder to obtain the decoded data frame. It should be noted that the decoder may use different decoding algorithms for the audio portion and the video portion of the data frame respectively, for example a decoding algorithm corresponding to the MP4 or AVC encoding algorithm for the video portion and a decoding algorithm corresponding to the AAC encoding algorithm for the audio portion, or it may decode the entire data frame with the same decoding algorithm, to obtain the decoded data frame. After acquiring the decoded data frame, the encoding and decoding device determines the video frame contained in the data frame and the audio pulse code modulation data corresponding to the video frame according to the decoded data frame.
Optionally, the encoding and decoding apparatus may obtain a header of the decoded data frame, and determine, according to the header, a video frame included in the data frame and audio pulse code modulation data corresponding to the video frame.
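Continuing the assumed two-field header from the encoding sketch earlier (illustrative only, not a fixed format of the method), separating the decoded data frame could look like this:

    import struct

    HEADER_FORMAT = ">II"   # assumed header: video byte count, audio byte count (big-endian)
    HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

    def split_data_frame(data_frame: bytes):
        # Read the byte counts from the header, then slice out the video and audio parts.
        video_size, audio_size = struct.unpack_from(HEADER_FORMAT, data_frame, 0)
        video_frame = data_frame[HEADER_SIZE:HEADER_SIZE + video_size]
        audio_pcm = data_frame[HEADER_SIZE + video_size:HEADER_SIZE + video_size + audio_size]
        return video_frame, audio_pcm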
Optionally, after the codec device acquires the audio pulse code modulation data and the video frame of the data frame, the codec device simultaneously plays the video frame and the audio pulse code modulation data.
In the technical solution provided in this embodiment, the encoded data frame is acquired and decoded, and the video frame contained in the data frame and the audio pulse code modulation data corresponding to the video frame are determined according to the decoded data frame. With this scheme, decoding the data frame yields the video frame and the audio pulse code modulation data already synchronized in time, so no special design is needed to handle audio and video synchronization during playback: as long as the decoded audio pulse code modulation data and video frame are played together, they remain synchronized in time, which reduces the service complexity of audio and video data processing.
Referring to fig. 10, the present invention further provides a coding and decoding apparatus, which includes a first obtaining module 100, a combining module 200, and a coding module 300, wherein:
the first obtaining module 100 is configured to obtain a video frame and audio pulse code modulation data corresponding to the video frame;
the merging module 200 is configured to merge the video frame and the audio pulse code modulation data to obtain a data frame;
the encoding module 300 is configured to encode the data frame; alternatively,
the encoding and decoding apparatus includes a second obtaining module 400, a decoding module 500, and a determining module 600, wherein:
the second obtaining module 400 is configured to obtain a coded data frame;
the decoding module 500 is configured to decode the encoded data frame;
the determining module 600 is configured to determine, according to the decoded data frame, a video frame included in the data frame and audio pulse code modulation data corresponding to the video frame.
Based on the foregoing embodiments, the present invention further provides an encoding and decoding apparatus, where the encoding and decoding apparatus may include a memory, a processor, and an encoding and decoding program stored in the memory and executable on the processor, and when the processor executes the encoding and decoding program, the encoding method or the decoding method according to any of the foregoing embodiments is implemented.
Based on the foregoing embodiments, the present invention further provides a computer-readable storage medium on which a codec program is stored, where the codec program, when executed by a processor, implements the steps of the encoding method or the decoding method according to any of the foregoing embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a smart tv, a mobile phone, a computer, etc.) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An encoding method, characterized in that the encoding method comprises:
acquiring a video frame and audio pulse code modulation data corresponding to the video frame;
merging the video frame and the audio pulse code modulation data to obtain a data frame;
and encoding the data frame.
2. The encoding method of claim 1, wherein said step of combining said video frame and said audio pulse code modulation data to obtain a data frame comprises:
acquiring video parameters reflecting the size and format of the video frame;
acquiring audio parameters reflecting the size and format of the audio pulse code modulation data;
generating an information header according to the video parameters and the audio parameters, wherein the information header is used for distinguishing video frames in the data frames and audio pulse code modulation data corresponding to the video frames;
and splicing the video frame, the audio pulse code modulation data and the information head according to a preset format to obtain the data frame.
3. The encoding method of claim 1, wherein, after the step of encoding the data frame, the method further comprises:
acquiring play time stamps of a plurality of coded data frames;
splicing the plurality of coded data frames according to the playing time stamp to obtain a video file;
setting a preset number of data frames in the video file as key frames, wherein the number of the key frames is less than that of the data frames.
4. The encoding method of claim 1, wherein the step of encoding the data frame comprises:
acquiring a preset coding algorithm, wherein the preset coding algorithm at least comprises a video coding algorithm and an audio coding algorithm;
and encoding the data frame by adopting the preset encoding algorithm.
5. A decoding method, characterized in that the decoding method comprises:
acquiring a coded data frame;
decoding the encoded data frame;
and determining a video frame contained in the data frame and audio pulse code modulation data corresponding to the video frame according to the decoded data frame.
6. The decoding method according to claim 5, wherein the step of determining, according to the decoded data frame, a video frame included in the data frame and audio pulse code modulation data corresponding to the video frame comprises:
acquiring an information header of the decoded data frame;
and determining the video frame contained in the data frame and the audio pulse code modulation data corresponding to the video frame according to the information header.
7. The decoding method of claim 5, wherein the step of obtaining the encoded data frame comprises:
acquiring a video file;
acquiring a playing time stamp of each data frame in the video file;
and sequentially using the data frames as the coded data frames according to the playing time stamps.
8. An encoding and decoding apparatus, comprising a first obtaining module, a combining module and an encoding module, wherein:
the first acquisition module is used for acquiring a video frame and audio pulse code modulation data corresponding to the video frame;
the merging module is used for merging the video frame and the audio pulse code modulation data to obtain a data frame;
the encoding module is used for encoding the data frame; alternatively,
the encoding and decoding device comprises a second acquisition module, a decoding module and a determination module, wherein:
the second obtaining module is used for obtaining the coded data frame;
the decoding module is used for decoding the encoded data frame;
the determining module is used for determining a video frame contained in the data frame and audio pulse coding modulation data corresponding to the video frame according to the decoded data frame.
9. A codec device comprising a memory, a processor and a codec program stored on the memory and operable on the processor, the codec program when executed by the processor implementing the steps of the encoding method according to any one of claims 1 to 4 or the decoding method according to any one of claims 5 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a codec program, which when executed by a processor implements the steps of the encoding method of any one of claims 1 to 4 or the decoding method of any one of claims 5 to 7.
Application CN202111179290.7A (priority date 2021-10-09, filing date 2021-10-09): Encoding method, decoding method, encoding/decoding device, and medium; status: Pending; published as CN114025196A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111179290.7A CN114025196A (en) 2021-10-09 2021-10-09 Encoding method, decoding method, encoding/decoding device, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111179290.7A CN114025196A (en) 2021-10-09 2021-10-09 Encoding method, decoding method, encoding/decoding device, and medium

Publications (1)

Publication Number Publication Date
CN114025196A (en) 2022-02-08

Family

ID=80055764

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111179290.7A Pending CN114025196A (en) 2021-10-09 2021-10-09 Encoding method, decoding method, encoding/decoding device, and medium

Country Status (1)

Country Link
CN (1) CN114025196A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101420577A (en) * 2008-11-07 2009-04-29 武汉烽火网络有限责任公司 Storage method for multimedia data and method for accurately positioning playback position
CN102685469A (en) * 2012-05-04 2012-09-19 北京航空航天大学 Audio-video transmission code stream framing method based on moving picture experts group-2 (MPEG-2) advanced audio coding (AAC) and H.264
CN107770597A (en) * 2017-09-28 2018-03-06 北京小鸟科技股份有限公司 Audio and video synchronization method and device
CN109168031A (en) * 2018-11-06 2019-01-08 杭州云英网络科技有限公司 Streaming Media method for pushing and device, steaming media platform
CN110418189A (en) * 2019-08-02 2019-11-05 钟国波 A kind of low latency can be used for transmitting game, high frame per second audio/video transmission method
CN113377993A (en) * 2021-08-13 2021-09-10 深圳市有为信息技术发展有限公司 Audio and video data management method, system and equipment of vehicle-mounted equipment and commercial vehicle

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination