CN116233453A - Video coding method and device - Google Patents


Info

Publication number
CN116233453A
Authority
CN
China
Prior art keywords
slices
slice
video
downsampled
buffer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310498132.0A
Other languages
Chinese (zh)
Other versions
CN116233453B (en)
Inventor
刘斯宁
赵昌华
姜晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Aixin Yuanzhi Technology Co ltd
Beijing Aixin Technology Co ltd
Original Assignee
Hangzhou Aixin Yuanzhi Technology Co ltd
Beijing Aixin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Aixin Yuanzhi Technology Co ltd, Beijing Aixin Technology Co ltd
Priority to CN202310498132.0A
Publication of CN116233453A
Application granted
Publication of CN116233453B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking

Abstract

Embodiments of the present application provide a video encoding method and device, relating to the technical field of image processing. The method comprises: acquiring an original slice corresponding to a video frame in a video to be encoded stored in a memory, wherein one video frame comprises at least one original slice; after at least one original slice is acquired, obtaining N downsampled slices corresponding to the original slice and storing the N downsampled slices in respective first buffers; reading the downsampled slices from the first buffers and encoding them based on the encoding parameters corresponding to each target code stream, to obtain an encoded slice for each target code stream; and writing the encoded slices into the memory. When multi-path video output encoding is performed on a source video, the original video frame needs to be read from the memory only once and is converted into a plurality of downsampled video frames to complete the multi-path video encoding, which reduces memory bandwidth consumption and improves memory access efficiency.

Description

Video coding method and device
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a video encoding method and apparatus.
Background
Video coding is a technique that converts a file in an original video format into a file in another video format through compression, in order to accommodate different network bandwidths, terminal processing capabilities, and user requirements. For example, a video website may need to offer the same source video at several resolutions. The source video must therefore be encoded multiple times (multi-path video encoding) to obtain videos at multiple resolutions, so that a user can select a suitable resolution according to their own bandwidth and needs.
For example, in some monitoring scenarios, a camera for monitoring may capture a monitoring video, which may be viewed by both a manager and an employee. However, the terminal device of the manager can only play the monitoring video with lower resolution, while the terminal device of the first-line staff can play the monitoring video with high resolution, i.e. the requirements of the terminal devices of different staff on the resolution of the monitoring video are different. Therefore, in order to be suitable for terminal devices of different employees, the monitoring video shot by the camera needs to be subjected to multi-path video coding so as to output videos with different resolutions.
However, when multi-path video encoding is performed on the source video, each encoding path needs to read the source video from the memory that stores it. The same source video frame is therefore read from the memory multiple times, which increases memory bandwidth consumption and reduces memory access efficiency.
Disclosure of Invention
Some embodiments of the present application provide a video encoding method and apparatus, which can complete multi-path video encoding by only reading an original video frame from a memory once when multi-path video output encoding is performed on a source video, so as to reduce memory bandwidth consumption and improve memory access efficiency of the memory.
In a first aspect, some embodiments of the present application provide a video encoding method, including:
acquiring an original slice corresponding to a video frame in a video to be encoded stored in a memory, wherein one video frame comprises at least one original slice;
after at least one original slice is obtained, N downsampling slices corresponding to the original slice are obtained, the N downsampling slices are respectively stored in a corresponding first buffer, and N is a positive integer greater than 1;
reading the downsampled slices in the first cache, and encoding the downsampled slices based on encoding parameters corresponding to each target code stream to obtain encoding slices corresponding to each target code stream, wherein each target code stream is a code stream corresponding to an encoded video;
and writing the encoded slice into the memory.
In some embodiments, the acquiring N downsampled slices corresponding to the original slice includes:
and synchronously transmitting the original slice to N downsamplers, so that the N downsamplers perform downsampling processing on the original slice to obtain N downsampled slices.
In some embodiments, the reading the downsampled slices in the first buffer and encoding the downsampled slices based on the encoding parameters corresponding to each target code stream includes:
determining a target downsampled slice corresponding to each target code stream based on the resolution of each downsampled slice;
reading the target downsampled slice from the first buffer;
and encoding the read target downsampled slice based on the encoding parameters respectively corresponding to the target code streams.
In some embodiments, if the resolution requirements of the encoded video are the same for different target code streams, the reading the target downsampled slice from the first buffer includes:
and reading the same downsampling slice from the first buffer in a time division multiplexing mode so as to finish the encoding of different target code streams aiming at the same downsampling slice.
In some embodiments, the obtaining the original slice corresponding to the video frame in the video to be encoded stored in the memory includes:
reading pixel data in the video frame from the memory;
when the number of pixels read reaches a preset threshold, determining the pixel data read so far to be the original slice, wherein the total number of pixels in the video frame is an integer multiple of the preset threshold.
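By way of a non-limiting sketch, the slicing rule above can be illustrated as follows. The function name `iter_original_slices` and the flat list representation of a frame are assumptions made for illustration only; the application does not prescribe a data layout.

```python
from typing import Iterator, List


def iter_original_slices(frame_pixels: List[int],
                         preset_threshold: int) -> Iterator[List[int]]:
    """Yield consecutive runs of `preset_threshold` pixels as original slices.

    Hypothetical helper: the frame is modelled as a flat list of pixel values.
    """
    if preset_threshold <= 0:
        raise ValueError("preset_threshold must be positive")
    # The total pixel count must be an integer multiple of the threshold.
    if len(frame_pixels) % preset_threshold != 0:
        raise ValueError("frame size is not a multiple of the preset threshold")
    for start in range(0, len(frame_pixels), preset_threshold):
        # Each full run of `preset_threshold` pixels forms one original slice.
        yield frame_pixels[start:start + preset_threshold]


frame = list(range(12))                       # a toy 12-pixel "frame"
slices = list(iter_original_slices(frame, 4))  # three slices of 4 pixels each
```

Because slices are yielded as soon as enough pixels have been read, downstream processing can start before the whole frame is available.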
In some embodiments, before the acquiring the original slice corresponding to the video frame in the video to be encoded stored in the memory, the method further includes:
determining a conversion coefficient between the resolution corresponding to each target code stream and the resolution of the original slice;
calculating a limit threshold according to the conversion coefficient and the available capacity of the first buffer, wherein the limit threshold is the maximum pixel number of the original slice supported by the first buffer;
and determining the preset threshold according to the limit threshold and the total number of pixels in the video frame.
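The application does not give formulas for these steps, so the following is only one possible reading, under stated assumptions: the conversion coefficient of a target resolution is taken to be its pixel count divided by the original pixel count; the limit threshold is the buffer capacity divided by the sum of the coefficients; and the preset threshold is the largest divisor of the frame's total pixel count that does not exceed the limit threshold. All function names are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def limit_threshold(original: Resolution,
                    targets: Iterable[Resolution],
                    buffer_capacity_pixels: int) -> int:
    """Largest original-slice size (in pixels) whose downsampled copies fit."""
    ow, oh = original
    # Assumed coefficient: target pixel count / original pixel count.
    coeff_sum = sum(Fraction(w * h, ow * oh) for w, h in targets)
    # P original pixels produce P * coeff_sum downsampled pixels in total.
    return int(buffer_capacity_pixels / coeff_sum)


def preset_threshold(limit: int, total_pixels: int) -> int:
    """Largest divisor of `total_pixels` that does not exceed `limit`."""
    for candidate in range(min(limit, total_pixels), 0, -1):
        if total_pixels % candidate == 0:
            return candidate
    raise ValueError("limit must be at least 1")


original = (3840, 2160)
targets = [(1920, 1080), (1280, 720)]   # distinct (deduplicated) resolutions
limit = limit_threshold(original, targets, buffer_capacity_pixels=1_000_000)
preset = preset_threshold(limit, 3840 * 2160)
```

Choosing a divisor of the total pixel count keeps the requirement that the frame size is an integer multiple of the preset threshold.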
In some embodiments, the writing the encoded slice into the memory comprises:
storing the coded slices obtained each time to a corresponding second cache;
For at least one of the original slices, after the encoding by the encoding parameters corresponding to each target code stream is completed, reading the encoding slices corresponding to each target code stream from the second buffer;
and writing the coding slices corresponding to the target code streams into the memory.
In some embodiments, if the first buffer is located in a target buffer, the total number of pixels of the N downsampled slices is not greater than the number of pixels that the target buffer can accommodate;
if the first buffer is located in a first logical partition of the target buffer, the total number of pixels of the N downsampled slices is not greater than the number of pixels that can be accommodated by the first logical partition.
In a second aspect, some embodiments of the present application provide a video encoding apparatus, including:
the acquisition module is used for acquiring an original slice corresponding to a video frame in the video to be encoded stored in the memory, wherein one video frame comprises at least one original slice;
the storage module is used for acquiring N downsampling slices corresponding to the original slices after at least one original slice is acquired, and storing the N downsampling slices into corresponding first caches respectively, wherein N is a positive integer greater than 1;
The coding module is used for reading the downsampling slices in the first buffer memory, coding the downsampling slices based on coding parameters corresponding to each target code stream, and obtaining coding slices corresponding to each target code stream respectively, wherein each target code stream is a code stream corresponding to a coded video;
and the writing module is used for writing the coded slice into the memory.
In a third aspect, some embodiments of the present application provide an electronic device, including: the device comprises a processor, a memory, a transmitter, a downsampler, a target buffer and an encoder;
the processor is used for processing man-machine interaction to obtain a target code stream and configuring coding parameters corresponding to the target code stream;
the memory is used for storing video to be coded and coded slices;
the transmitter is configured to acquire an original slice corresponding to a video frame in the video to be encoded stored in the memory, and transmit the original slice to the downsampler, where one video frame includes at least one original slice;
the downsampler is configured to obtain N downsampled slices corresponding to the original slice after at least one original slice is obtained, and store the N downsampled slices in the target buffer respectively, where N is a positive integer greater than 1;
The target buffer is used for buffering the downsampled slices;
the encoder is used for reading the downsampled slices in the target buffer, encoding the downsampled slices in the target buffer based on encoding parameters corresponding to each target code stream, and obtaining encoding slices corresponding to each target code stream respectively, wherein each target code stream is a code stream corresponding to an encoded video;
the target buffer is further used for buffering the coded slices;
the transmitter is further configured to write the encoded slice in the target buffer into the memory.
In some embodiments, when the downsampled slice and the encoded slice are buffered by the same target buffer, the target buffer includes a first logical partition for buffering the downsampled slice and a second logical partition for buffering the encoded slice;
when the number of target buffers includes at least two, and the downsampled slices and the encoded slices are buffered by different ones of the target buffers, the target buffers include a first physical buffer for buffering the downsampled slices and a second physical buffer for buffering the encoded slices.
In a fourth aspect, some embodiments of the present application provide an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the video encoding method when executing the program.
In a fifth aspect, some embodiments of the present application provide a computer readable storage medium having a computer program stored thereon, where the program is executed by a processor to implement the video encoding method.
Some embodiments of the present application provide a video encoding method and apparatus, the method comprising: acquiring an original slice corresponding to a video frame in a video to be encoded stored in a memory, wherein one video frame comprises at least one original slice; after at least one original slice is obtained, N downsampling slices corresponding to the original slice are obtained, the N downsampling slices are respectively stored in a corresponding first buffer, and N is a positive integer greater than 1; reading the downsampled slices in the first cache, and encoding the downsampled slices based on encoding parameters corresponding to each target code stream to obtain encoding slices corresponding to each target code stream, wherein each target code stream is a code stream corresponding to an encoded video; the encoded slice is written into the memory.
When the source video is subjected to multi-channel video output coding, after the original slice corresponding to the original video frame is read from the memory, the original slice is converted into a plurality of downsampled slices and cached, and in the subsequent multi-channel video coding process, only the downsampled slices are read from the cache and coded, so that multi-channel video coding can be finished only by reading the original video frame once from the memory, the frequency of accessing the memory can be reduced, the bandwidth consumption is reduced, and the memory access efficiency of the memory is improved.
Further, in the prior art, encoding of a video frame often cannot start until the entire video frame has been read from the memory. In the embodiments of the present application, once at least one original slice is obtained, the original slice can be converted into a plurality of downsampled slices and multi-path video encoding can begin, which reduces the time spent waiting for the whole video frame to be read, shortens the encoding delay, and improves encoding efficiency.
Drawings
Fig. 1 shows a flow chart of a video encoding method;
Fig. 2 shows a schematic diagram of the encoding process of the H.265/HEVC algorithm;
Fig. 3 shows a schematic diagram of a video encoding apparatus;
Fig. 4 shows a schematic diagram of an electronic device.
Detailed Description
To make the purposes and implementations of the present application clearer, exemplary implementations of the present application are described below clearly and completely with reference to the accompanying drawings in which they are illustrated. The exemplary implementations described are only some, not all, of the embodiments of the present application.
It should be noted that the brief description of the terms in the present application is only for convenience in understanding the embodiments described below, and is not intended to limit the embodiments of the present application. Unless otherwise indicated, these terms should be construed in their ordinary and customary meaning.
The terms "first", "second", and the like in the description, in the claims, and in the above-described figures are used to distinguish between similar objects or entities and do not necessarily imply a particular order or sequence, unless otherwise indicated. It is to be understood that the terms so used are interchangeable under appropriate circumstances.
The terms "comprises," "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to all elements explicitly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
As shown in fig. 1, an embodiment of the present application provides a video encoding method, including:
step S101: acquiring an original slice corresponding to a video frame in a video to be encoded stored in a memory;
wherein one of said video frames comprises at least one of said original slices;
the video to be encoded may be high-resolution video data loaded from a disk file into the memory of a computer, or high-resolution video data input into the memory of a computer through a high-speed interface such as MIPI (Mobile Industry Processor Interface), PCIe (Peripheral Component Interconnect Express), a network, or USB (Universal Serial Bus). For example, the video to be encoded is a set of video data at 4K resolution in YUV format (a color encoding format). The video data includes both luminance and chrominance attributes.
The memory includes the main memory of the computer, which is DDR SDRAM (Double Data Rate Synchronous Dynamic Random Access Memory), a synchronous dynamic random access memory whose data rate is twice the system clock frequency. In one possible design, the memory storing the video to be encoded may be the main memory.
Of course, the memory may be other forms of memory, which are not limited in this application.
Step S102: after at least one original slice is obtained, N downsampling slices corresponding to the original slice are obtained, the N downsampling slices are respectively stored in a corresponding first buffer, and N is a positive integer greater than 1;
the downsampled slice may be a slice having the same resolution as the original slice, or a slice having the same resolution as the target stream.
In some embodiments, the acquiring N downsampled slices corresponding to the original slice includes:
and synchronously transmitting the original slice to N downsamplers, so that the N downsamplers perform downsampling processing on the original slice to obtain N downsampled slices.
In multi-path video encoding, encoded data corresponding to a plurality of target code streams must be obtained. One target code stream corresponds to only one image resolution, whereas one image resolution may correspond to a plurality of target code streams; that is, different target code streams may have the same image resolution. In one example, a target code stream with a code rate of 2 Mbps and a target code stream with a code rate of 0.5 Mbps both have an image resolution of 1920×1080, which is permitted in practical application scenarios.
It is easy to understand that the number of downsampled slices has a correspondence with the target code stream, and in general, the number of downsampled slices is not more than the number of target code streams. In one possible design, the number of downsampled slices may be the same as the number of target code streams, or may be less than the number of target code streams if there is coincidence in the image resolution of the target code streams.
In one possible design, the resolution of the downsampled slices corresponds one-to-one to the image resolution of the target code streams. In one example, the number of target code streams is 4 and the number of downsampled slices is also 4; the resolutions and code rates of the 4 target code streams are 1920×1080@2Mbps, 1280×720@1Mbps, 704×576@0.5Mbps, and 352×288@0.1Mbps, respectively, and the resolutions of the 4 downsampled slices are 1920×1080, 1280×720, 704×576, and 352×288, respectively.
In another possible design, the resolution of the downsampled slices corresponds one-to-many to the image resolution of the target code streams. In one example, the number of target code streams is 4 and the number of downsampled slices is 2; the resolutions and code rates of the 4 target code streams are 1920×1080@2Mbps, 1920×1080@1Mbps, 704×576@0.5Mbps, and 704×576@0.2Mbps, respectively, and the resolutions of the 2 downsampled slices are 1920×1080 and 704×576, respectively.
In some embodiments, a set of downsampled slices of different resolutions may be obtained by downsampling the original slices such that the resolution of each downsampled slice corresponds to the target code stream. It is easy to understand that the resolution of the downsampled slice obtained through the downsampling process is smaller than that of the original slice, so that the data size after downsampling is reduced, compared with the original slice, the buffer space which is required to be occupied is reduced, and the data size which is required to be read and processed by an encoder in the subsequent video encoding process is reduced, so that the video encoding efficiency can be improved.
In one possible design, when the original slice is subjected to downsampling, if at least one target code stream has the same image resolution as the original slice, the corresponding downsampled slice can be obtained by directly copying the original slice, and any mathematical operation is not required to be performed, so that the efficiency of obtaining the downsampled slice is further improved, and the beneficial effects of reducing power consumption and encoding delay are achieved.
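The fan-out of one original slice to several downsamplers, including the direct-copy case, can be sketched as follows. Nearest-neighbour sampling is used here purely as a stand-in; the application does not specify the downsampling filter, and the names `downsample` and `fan_out` are hypothetical.

```python
from typing import Dict, List, Tuple

Rows = List[List[int]]      # a slice modelled as rows of pixel values
Size = Tuple[int, int]      # (width, height) in pixels


def downsample(slice_rows: Rows, out_w: int, out_h: int) -> Rows:
    """Produce one downsampled slice from the original slice."""
    in_h, in_w = len(slice_rows), len(slice_rows[0])
    if (out_w, out_h) == (in_w, in_h):
        # Same resolution as the original: copy directly, no arithmetic.
        return [row[:] for row in slice_rows]
    # Assumed filter: nearest-neighbour index mapping.
    return [[slice_rows[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]


def fan_out(slice_rows: Rows, target_sizes: List[Size]) -> Dict[Size, Rows]:
    """Feed the same original slice to one downsampler per target size."""
    return {size: downsample(slice_rows, *size) for size in target_sizes}


original = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 toy slice
outputs = fan_out(original, [(4, 4), (2, 2)])
```

In hardware the N downsamplers would run in parallel on the same input; the dictionary comprehension only models the one-input, N-output relationship.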
In the process of downsampling, an original resolution set can be firstly obtained, wherein the original resolution set comprises image resolutions corresponding to target code streams; then, de-duplicating the original resolution set to obtain a de-duplicated resolution set; and determining the number of downsampled slices as the number of resolutions in the de-duplication resolution set, wherein the resolutions of the downsampled slices are respectively the resolutions in the de-duplication resolution set.
Wherein the step of de-duplicating the original resolution set includes:
judging whether the original resolution set comprises at least two identical first resolutions or not;
if at least two identical first resolutions are included in the set of resolutions, only one first resolution is reserved such that the number of occurrences of each resolution in the set of deduplication resolutions is 1.
In one example, the number of target code streams is 4; the resolutions and code rates of the 4 target code streams are 1920×1080@2Mbps, 1920×1080@1Mbps, 704×576@0.5Mbps, and 704×576@0.2Mbps, respectively; the de-duplicated resolution set is {1920×1080, 704×576}; and the number of downsampled slices is 2, with resolutions of 1920×1080 and 704×576, respectively.
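The de-duplication described above can be sketched in a few lines. The function name is hypothetical, and resolutions are modelled as (width, height) tuples; the input reproduces the four-stream example.

```python
from typing import List, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


def deduplicate_resolutions(stream_resolutions: List[Resolution]) -> List[Resolution]:
    """Keep the first occurrence of each resolution, preserving order."""
    # dict.fromkeys keeps insertion order and drops repeated keys.
    return list(dict.fromkeys(stream_resolutions))


# Resolutions of the four target code streams in the example above.
original_set = [(1920, 1080), (1920, 1080), (704, 576), (704, 576)]
dedup_set = deduplicate_resolutions(original_set)
num_downsampled_slices = len(dedup_set)
```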
In the application, the N downsampled slices obtained after the downsampling process are respectively stored in the first buffer. The first cache is characterized by providing a write port and a read port, wherein the write port can acquire data from the outside when being connected with the outside, and the read port can output data to the outside when being connected with the outside. If the first buffer is located in the target buffer, the total number of pixels of the N downsampled slices obtained after downsampling is not greater than the number of pixels which can be accommodated in the target buffer;
If the first buffer is located in the first logical partition of the target buffer, the total number of pixels of the N downsampled slices obtained after the downsampling process is not greater than the number of pixels that can be accommodated by the first logical partition.
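The capacity constraint can be checked with simple arithmetic, as in the sketch below. The slice dimensions and capacities are hypothetical values chosen for illustration; only the inequality itself comes from the description.

```python
from typing import Iterable, Tuple

Size = Tuple[int, int]  # (width, height) of a downsampled slice, in pixels


def total_pixels(slice_sizes: Iterable[Size]) -> int:
    """Total number of pixels across all downsampled slices."""
    return sum(w * h for w, h in slice_sizes)


def fits(slice_sizes: Iterable[Size], capacity_pixels: int) -> bool:
    """True if the N downsampled slices fit in the given capacity."""
    return total_pixels(slice_sizes) <= capacity_pixels


# Hypothetical example: two downsampled slices and two possible capacities.
slices = [(1920, 16), (704, 8)]          # 30720 + 5632 = 36352 pixels
whole_buffer_capacity = 65536            # first buffer occupies the whole target buffer
first_partition_capacity = 32768         # first buffer is one logical partition

fits_whole = fits(slices, whole_buffer_capacity)
fits_partition = fits(slices, first_partition_capacity)
```

With these numbers the slices fit the whole target buffer but not the smaller logical partition, so a smaller original slice would be needed in the partitioned case.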
In one possible design, the target buffer includes a cache memory. The cache memory is a level of storage located between the main memory of the computer and the encoder. It is composed of static memory elements, has a relatively small data capacity, reads and writes data at a fixed cadence, and has a data read/write speed much higher than that of the main memory and the encoder. These characteristics help reduce the delay caused by the encoder waiting for data, thereby improving video encoding efficiency.
Step S103: reading the downsampled slices in the first buffer, and coding the downsampled slices based on coding parameters corresponding to each target code stream to obtain coding slices corresponding to each target code stream respectively;
the target code stream is a code stream corresponding to the coded video.
In the encoding by this step, since the downsampled slice for encoding is read from the first buffer, only the first buffer is accessed, and the memory is not accessed.
Step S104: the encoded slice is written into the memory.
When the scheme provided by the embodiment of the application is used for carrying out multi-channel video output coding, after the original slice corresponding to the original video frame is read from the memory, different downsampling treatments are carried out on the original slice to obtain a plurality of downsampled slices and buffer the downsampled slices, and in the subsequent multi-channel video coding process, the downsampled slices are read from the buffer and coded, so that the multi-channel video coding can be finished only by reading the original video frame once from the memory, the frequency of accessing the memory can be reduced, the memory bandwidth consumption is reduced, the memory access efficiency is improved, and the coding delay caused by waiting for data by the encoder is reduced.
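The bandwidth saving on the source-read side can be estimated with a short calculation. The sketch assumes 8-bit YUV 4:2:0 sampling (1.5 bytes per pixel), which the application does not specify, and counts only reads of the source frame; writes of encoded slices and any reference-frame traffic are ignored.

```python
def frame_bytes_yuv420(width: int, height: int) -> int:
    """Bytes in one 8-bit YUV 4:2:0 frame (assumed format): 1.5 bytes/pixel."""
    return width * height * 3 // 2


def source_read_bytes_per_frame(width: int, height: int,
                                num_streams: int, read_once: bool) -> int:
    """Bytes read from memory to feed all encoding paths for one frame."""
    # Conventional multi-path encoding reads the frame once per stream;
    # the slice-and-buffer scheme reads it a single time.
    reads = 1 if read_once else num_streams
    return reads * frame_bytes_yuv420(width, height)


per_path = source_read_bytes_per_frame(3840, 2160, num_streams=4, read_once=False)
shared = source_read_bytes_per_frame(3840, 2160, num_streams=4, read_once=True)
saving_ratio = per_path // shared
```

Under these assumptions, four target code streams from a 4K source cost about 49.8 MB of source reads per frame when each path reads independently, against about 12.4 MB when the frame is read once.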
Furthermore, in the prior art, encoding of a video frame usually starts only after the whole video frame has been read from the memory. In the embodiments of the present application, once at least one original slice is obtained, the original slice can be converted into a plurality of downsampled slices and multi-path video encoding can start, which avoids the waiting time needed to read the whole video frame from the memory and thereby reduces the encoding delay and improves the encoding efficiency.
In some embodiments, the reading the downsampled slices in the first buffer and encoding the downsampled slices based on the encoding parameters corresponding to each target code stream includes:
determining a target downsampled slice corresponding to each target code stream based on the resolution of each downsampled slice;
reading the target downsampled slice from the first buffer;
and encoding the read target downsampled slices based on the encoding parameters corresponding to each target code stream.
Through the above steps, the target downsampled slice corresponding to each target code stream can be read from the first buffer and encoded. If the resolutions of the downsampled slices in the first buffer are respectively the same as the image resolutions of the target code streams, the resolution of each target downsampled slice matches its target code stream.
Since the resolution of the target downsampled slice corresponds to the target code stream, it is helpful to improve the efficiency of encoding when encoding the target downsampled slice.
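Selecting the target downsampled slice for each target code stream by resolution can be sketched as a lookup. The `TargetStream` type, the stream names, and the dictionary model of the first buffer are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Resolution = Tuple[int, int]  # (width, height) in pixels


@dataclass(frozen=True)
class TargetStream:
    name: str
    resolution: Resolution
    bitrate_kbps: int


def select_target_slices(streams: List[TargetStream],
                         first_buffer: Dict[Resolution, bytes]) -> Dict[str, Resolution]:
    """Map each stream name to the buffer key of its target downsampled slice."""
    selection = {}
    for stream in streams:
        if stream.resolution not in first_buffer:
            raise KeyError(f"no downsampled slice for {stream.resolution}")
        selection[stream.name] = stream.resolution
    return selection


# First buffer modelled as a dict keyed by slice resolution (toy payloads).
first_buffer = {(1920, 1080): b"slice-1080p", (704, 576): b"slice-576p"}
streams = [
    TargetStream("s0", (1920, 1080), 2000),
    TargetStream("s1", (1920, 1080), 1000),
    TargetStream("s2", (704, 576), 500),
    TargetStream("s3", (704, 576), 200),
]
selection = select_target_slices(streams, first_buffer)
```

Streams `s0` and `s1` resolve to the same buffered slice, which is the one-to-many case.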
If the resolution requirements of different target code streams on the encoded video are the same, the reading the target downsampled slice from the first buffer includes:
And reading the same downsampling slice from the first buffer in a time division multiplexing mode so as to finish the encoding of different target code streams aiming at the same downsampling slice.
If the resolution requirements of different target code streams on the coded video are the same, the target downsampling slices corresponding to the different target code streams are the same downsampling slices, and in this case, the target downsampling slices are read from the first buffer in a time-sharing multiplexing mode so as to ensure that video coding can be realized for each target code stream.
It should be noted that each time-sharing process involves only one of the multiple target code streams, not all of the code streams, and only part of the content, not all of the content, of one video frame to be encoded.
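A minimal sketch of the time-division multiplexed read follows. Each loop iteration stands for one time slot in which a single target code stream reads the shared slice from the first buffer and encodes it. The `CountingBuffer` class and the `fake_encode` stand-in are hypothetical and do not represent a real encoder.

```python
from typing import Callable, Dict, List, Tuple

Resolution = Tuple[int, int]


class CountingBuffer(dict):
    """Dict-based model of the first buffer that counts read accesses."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.reads = 0

    def __getitem__(self, key):
        self.reads += 1
        return super().__getitem__(key)


def encode_time_multiplexed(first_buffer: CountingBuffer,
                            streams: List[Tuple[str, Resolution, dict]],
                            encode: Callable[[bytes, dict], tuple]) -> Dict[str, tuple]:
    outputs = {}
    for name, resolution, params in streams:   # one stream per time slot
        slice_data = first_buffer[resolution]   # read from buffer, not memory
        outputs[name] = encode(slice_data, params)
    return outputs


def fake_encode(slice_data: bytes, params: dict) -> tuple:
    # Placeholder "encoding": tag the slice length with the bitrate.
    return (params["bitrate_kbps"], len(slice_data))


buf = CountingBuffer({(1920, 1080): b"\x00" * 64})
streams = [("2Mbps", (1920, 1080), {"bitrate_kbps": 2000}),
           ("1Mbps", (1920, 1080), {"bitrate_kbps": 1000})]
outputs = encode_time_multiplexed(buf, streams, fake_encode)
```

The same downsampled slice is read twice from the buffer, once per time slot, while the source frame in memory is not touched again.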
In addition, to clarify the process of encoding the target downsampled slice of the present application, one embodiment is disclosed below:
After a target code stream is selected, a first buffer address and an encoding parameter set address corresponding to the target code stream are determined, wherein the first buffer address is the buffer address of the target downsampled slice, and the encoding parameter set address stores the encoding parameter set corresponding to the target code stream. The encoding parameter set includes, but is not limited to, the GOP (Group of Pictures) structure parameters of the video frame, target bitrate, quantization parameters, intra prediction mode options, inter prediction mode options, motion estimation mode options, and so forth.
And loading the coding parameter set stored in the coding parameter set address by the encoder core, starting the encoder core, and coding the target downsampled slice in the address of the first buffer memory to obtain the coding slice of the target code stream.
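The address-based flow of this embodiment can be sketched as follows. The `EncodingParams` fields are a small subset of the parameter set listed above, the addresses are arbitrary, and the `encode` body is only a placeholder (integer division by the quantization parameter) that stands in for a real encoder core.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class EncodingParams:
    gop_length: int
    target_bitrate_kbps: int
    qp: int


class EncoderCore:
    """Hypothetical model of an encoder core driven by two addresses."""

    def __init__(self) -> None:
        self.params: Optional[EncodingParams] = None

    def load(self, params_memory: Dict[int, EncodingParams], params_addr: int) -> None:
        # Load the parameter set stored at the encoding parameter set address.
        self.params = params_memory[params_addr]

    def encode(self, first_buffer: Dict[int, List[int]], slice_addr: int) -> List[int]:
        if self.params is None:
            raise RuntimeError("parameter set not loaded")
        # Placeholder for real encoding: coarse quantization by qp.
        step = max(1, self.params.qp)
        return [pixel // step for pixel in first_buffer[slice_addr]]


first_buffer = {0x100: [16, 32, 48, 64]}             # target downsampled slice
params_memory = {0x10: EncodingParams(30, 2000, 8),  # stream A parameter set
                 0x20: EncodingParams(30, 1000, 16)} # stream B parameter set

core = EncoderCore()
core.load(params_memory, 0x10)
coded_a = core.encode(first_buffer, 0x100)
core.load(params_memory, 0x20)
coded_b = core.encode(first_buffer, 0x100)
```

Reloading the parameter set between runs lets one core produce encoded slices for different target code streams from the same buffered slice.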
Video coding algorithms are complex in logic and require significant computational resources. In order to improve the quality (resolution, definition, real-time, etc.) of video coding, the implementation of video coding currently mainly depends on the computing platforms such as high-performance CPU (Central Processing Unit ), GPU (Graphics Processing Unit, graphics processor), DSP (Digital Signal Process, digital signal processor) and ASIC (Application Specific Integrated Circuit ) to provide the required computing power.
As shown in fig. 2, the encoding process of the H.265/HEVC (High Efficiency Video Coding) algorithm is taken as an example. The encoder of the H.265/HEVC algorithm mainly comprises modules such as transformation, quantization, entropy coding, intra-frame prediction, inter-frame prediction, and loop filtering (deblocking filtering). Inter prediction is a key technology for improving the compression ratio of video coding. Its basic principle is to search a historical frame for the historical pixel block that best matches the current pixel block; this technique is called motion estimation. It requires a large amount of data to be read and compared, so its computational complexity is high.
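To make the cost of motion estimation concrete, the following is a minimal sketch of an exhaustive block-matching search using the sum of absolute differences (SAD). It is a textbook technique chosen for illustration, not the search used by any particular H.265/HEVC encoder, and the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def full_search(current, reference, top, left, size, search_range):
    """Return (cost, dy, dx) of the best match for the current block.

    Every displacement within +/- search_range is tried, which is why the
    amount of data read and compared grows quickly with the search range.
    """
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would fall outside the reference frame
            cost = sad(target, get_block(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


# Reference (historical) frame with a unique value in every pixel.
reference = [[y * 8 + x for x in range(8)] for y in range(8)]
# Current frame: same content displaced by one row and two columns.
current = [[reference[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
motion = full_search(current, reference, top=2, left=2, size=2, search_range=2)
```

Here `motion` holds the matching cost and the displacement of the best-matching historical block relative to the current block.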
The intra-frame encoding steps are as follows: 1) The image is first divided into block areas; 2) Carrying out intra-frame estimation on the data and carrying out intra-frame coding; 3) Intra prediction (de-encoding); 4) Subtracting the intra-frame prediction result from the original image frame to obtain intra-frame residual data (residual signal); 5) The residual signal is subjected to linear transformation, and the transformed coefficient is scaled and quantized; 6) The processed residual signal coefficient is subjected to inverse processing to obtain residual reconstruction data; 7) Adding the residual reconstructed data to the intra-frame prediction signal to obtain a block predicted image frame (reconstructed data); 8) Finally removing blocking effect through loop filtering and self-adaptive compensation to obtain an image frame (reference frame) without blocking; 9) The intra-frame encoded signal and the residual signal coefficient are entropy encoded together and output.
The inter-frame coding steps are as follows: 1) The block image frame data is input into a motion estimation module, and inter-frame coding is carried out by combining the upper/lower image frame data obtained by the previous intra-frame prediction; 2) Performing motion compensation on inter-frame coded data; 3) Subtracting the inter-frame prediction result from the block original image frame to obtain an inter-frame residual signal; 4) The residual signal is subjected to linear transformation, and the transformed coefficient is scaled and quantized; 5) The processed residual signal coefficient is subjected to inverse processing to obtain a residual signal; 6) Adding the residual signal with the inter-frame prediction signal to obtain a block prediction image frame; 7) Finally removing blocking effect through loop filtering and self-adaptive compensation to obtain an image frame without blocking; 8) The inter-frame coded signal and the residual signal coefficient are output after entropy coding.
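The residual, quantization and reconstruction steps shared by the intra-frame and inter-frame paths above can be illustrated with a deliberately simplified sketch. The linear transformation, loop filtering and entropy coding are omitted, and the uniform quantization step `qstep` and the function name are assumptions made for the example rather than details from the application.

```python
def encode_block(original, prediction, qstep):
    """Toy residual coding loop for one block of samples.

    Returns the quantized levels (which would be entropy coded) and the
    reconstructed samples (which would feed later predictions).
    """
    # Residual: original minus prediction.
    residual = [o - p for o, p in zip(original, prediction)]
    # Scaling and quantization (the linear transform is skipped in this sketch).
    levels = [round(r / qstep) for r in residual]
    # Inverse processing: dequantize to obtain the residual reconstruction data.
    recon_residual = [level * qstep for level in levels]
    # Reconstruction: prediction plus reconstructed residual.
    reconstructed = [p + r for p, r in zip(prediction, recon_residual)]
    return levels, reconstructed


levels, reconstructed = encode_block(
    original=[101, 105, 97, 90],
    prediction=[98, 98, 98, 98],
    qstep=4,
)
```

The reconstructed samples differ slightly from the original ones, which reflects the loss introduced by quantization.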
The embodiment of fig. 2 describes an example of the encoding process of the H.265/HEVC algorithm, and in the actual encoding process, other video encoding methods may be used, which is not limited in this application.
In some embodiments, the obtaining the original slice corresponding to the video frame in the video to be encoded stored in the memory includes:
reading pixel data in the video frame from the memory;
when the number of pixels of the pixel data read reaches a preset threshold, determining that the pixel data read up to that point is the original slice, wherein the total number of pixels in the video frame is an integer multiple of the preset threshold. In the method provided in the above embodiment, pixel data in a video frame is read from the memory in real time when an original slice is acquired. When the number of pixels read reaches the preset threshold, the pixel data read so far is determined to be the original slice, and reading may then continue with further pixel data from the memory so as to acquire the next original slice. After an original slice is determined, the pixel count is cleared and counting of the pixels read starts again.
By the method provided by the embodiment, the original slice can be acquired without waiting for all data of the whole video frame to be accumulated in the memory, so that the efficiency of acquiring the original slice can be improved, and the efficiency of video coding can be further improved.
Moreover, the original slices are obtained by the method provided by the embodiment, and because the total number of pixels in the video frame is an integer multiple of the preset threshold value, the number of pixels in each original slice is the same, and any pixel in the video frame appears and only appears in one original slice, so that the omission of pixels in the video or the repetition of the same pixels in the video coding process can be avoided.
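The counting scheme described above can be sketched as a generator that emits an original slice each time the pixel count reaches the preset threshold. This is a minimal sketch; the function name and the use of a Python iterable as the pixel source are assumptions for illustration.

```python
def iter_original_slices(pixel_stream, total_pixels, threshold):
    """Yield original slices of exactly `threshold` pixels each.

    The frame's total pixel count must be an integer multiple of the
    threshold, so every pixel lands in exactly one slice.
    """
    if total_pixels % threshold != 0:
        raise ValueError("total_pixels must be an integer multiple of threshold")
    current = []
    for pixel in pixel_stream:
        current.append(pixel)
        if len(current) == threshold:
            yield current  # a slice is available before the whole frame is read
            current = []   # clear the count and start the next slice


slices = list(iter_original_slices(range(12), total_pixels=12, threshold=4))
```

Because slices are yielded as soon as they fill, a consumer can start downsampling and encoding the first slice while later pixels of the same frame are still arriving.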
In some embodiments, the preset threshold may be calculated according to the target code stream and the available capacity of the first buffer. In this case, the step of determining the preset threshold value includes:
first, a conversion coefficient of a resolution corresponding to each target code stream and a resolution of an original slice is determined.
The conversion coefficient can be used to calculate the number of pixels of each downsampled slice. In one possible design, the conversion coefficient is the ratio of the resolution corresponding to each target code stream to the original slice resolution. In one example, the original slice resolution is 1920×1080 and the resolution corresponding to the target code stream is 1280×720, so the conversion coefficient is (1280×720)/(1920×1080) ≈ 0.44.
Alternatively, in another possible design, the conversion coefficient may be a ratio of a total number of pixels corresponding to each target code stream in the same video frame to a total number of pixels of the original slice.
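The ratio in the example above can be checked with a short calculation. The helper name `conversion_coefficient` is hypothetical; exact fractions are used so that the 0.44 figure can be seen to be a rounding of 4/9.

```python
from fractions import Fraction


def conversion_coefficient(target_resolution, source_resolution):
    """Ratio of target pixel count to source pixel count, as an exact fraction."""
    target_width, target_height = target_resolution
    source_width, source_height = source_resolution
    return Fraction(target_width * target_height, source_width * source_height)


coefficient = conversion_coefficient((1280, 720), (1920, 1080))
approximate = round(float(coefficient), 2)
```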
Then, a limit threshold is calculated according to the conversion coefficients between the resolution corresponding to each target code stream and the resolution of the original slice, and the available capacity of the first buffer, wherein the limit threshold is the maximum number of pixels of an original slice that the first buffer can support:
PPI = S / (a1 + a2 + … + an);
where PPI is the limit threshold, S is the maximum number of pixels that can be accommodated at the location of the first buffer, n is the total number of target code streams, and a1, a2, …, an are the conversion coefficients between the resolution corresponding to each target code stream and the resolution of the original slice; that is, ai is the conversion coefficient for the i-th target code stream (i = 1, …, n).
And finally, determining a preset threshold according to a limit threshold and the total number of pixels in the video frame, wherein the preset threshold is smaller than or equal to the limit threshold, and the total number of pixels in the video frame is an integer multiple of the preset threshold.
In one possible design, the preset threshold may take the maximum value that satisfies the above condition, so as to reduce the number of original slices, thereby reducing the number of encoding times and improving the encoding efficiency.
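A minimal sketch of the threshold selection follows, assuming a buffer capacity of 65536 pixels and two target code streams at 1280×720 and 640×360 derived from a 1920×1080 source. These figures and the function names are illustrative assumptions, not values from the application. The preset threshold is taken as the largest divisor of the frame's pixel count that does not exceed the limit threshold.

```python
from fractions import Fraction


def limit_threshold(buffer_capacity_pixels, coefficients):
    """PPI = S / (a1 + a2 + ... + an)."""
    return buffer_capacity_pixels / sum(coefficients)


def preset_threshold(limit, total_pixels):
    """Largest value <= limit that divides total_pixels exactly."""
    for candidate in range(min(int(limit), total_pixels), 0, -1):
        if total_pixels % candidate == 0:
            return candidate
    raise ValueError("limit threshold is below one pixel")


total_pixels = 1920 * 1080
coefficients = [
    Fraction(1280 * 720, total_pixels),  # first target code stream
    Fraction(640 * 360, total_pixels),   # second target code stream
]
ppi = limit_threshold(65536, coefficients)
threshold = preset_threshold(ppi, total_pixels)
slice_count = total_pixels // threshold
```

Choosing the largest admissible threshold keeps the number of original slices per frame, and therefore the number of encoder invocations, as small as the buffer allows.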
It should be noted that the above scheme is applicable to the case that the resolution of each downsampled slice is different.
The video frame may be divided into a plurality of original slices according to a preset threshold, and the original slices are named with numbers in order to distinguish the respective original slices. The number X of original slices in the video frame may be determined according to the description header of the video frame data, and the numbers of the original slices are 1~X, respectively. The correlation operation for the original slices may be sequentially performed in the numbered order. When the original slice with the number of X is acquired, the original slice corresponding to the next video frame in the video to be encoded stored in the memory can be continuously acquired. Numbering the original slices is beneficial to tracking the encoding progress.
In some embodiments, the writing the encoded slice into the memory comprises:
storing the coded slices obtained each time to a corresponding second cache;
the second cache is characterized by providing a write port and a read port, wherein the write port can acquire data from the outside when being connected with the outside, and the read port can output data to the outside when being connected with the outside;
For at least one of the original slices, after the encoding by the encoding parameters corresponding to each target code stream is completed, reading the encoding slice corresponding to each target code stream from the second buffer;
and writing the coding slices corresponding to the target code streams into the memory.
After a target code stream is selected, the address of a second buffer memory and the address of a memory corresponding to the target code stream are also determined. The address of the second buffer memory is used for storing the coded slices corresponding to the target code stream, and the address of the memory is used for storing the coded slices stored in the address of the second buffer memory.
Through the steps, after the coded slices corresponding to the target code streams of at least one original slice are stored in the second buffer, the data in the second buffer, namely the coded slices corresponding to the at least one original slice, are written into the memory. That is, after encoding of each target code stream of at least one original slice is completed, the encoded slice corresponding to each target code stream of the at least one original slice is written into the memory, so that the number of times of writing data into the memory is reduced, and the access efficiency of the memory is further improved.
In one possible design, the number of target streams to be encoded may also be obtained, the target streams being named with numbers. For example: the number of the target code streams is M, and the numbers of the target code streams are 1~M respectively. The encoding of the target downsampled slices based on the different numbered target code streams may be performed sequentially in the numbering order. After the target code stream with the number M is executed, M or integer multiple of M coded slices stored in the second buffer memory are written into the memory, and meanwhile, the coding operation based on the target code stream with the number 1 can be executed on the next original slice. Numbering the target code stream is advantageous for tracking the encoding progress.
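The numbering and batched write-back described above might be sketched as follows. For brevity the sketch encodes the original slice directly rather than its downsampled versions, and `encode`, `memory` and the other names are hypothetical stand-ins.

```python
def encode_with_batched_writes(original_slices, stream_ids, encode, memory):
    """Encode every slice for streams 1..M and write to memory once per slice.

    Returns the number of memory write operations that were issued.
    """
    second_cache = []
    write_count = 0
    for slice_no, original in enumerate(original_slices, start=1):
        for stream_id in stream_ids:  # streams are handled in numbering order
            coded = encode(original, stream_id)
            second_cache.append((slice_no, stream_id, coded))
        # Stream M has finished for this slice: flush the M coded slices at once.
        memory.extend(second_cache)
        second_cache.clear()
        write_count += 1
    return write_count


memory = []
writes = encode_with_batched_writes(
    original_slices=["s1", "s2"],
    stream_ids=[1, 2, 3],
    encode=lambda data, stream_id: f"{data}@{stream_id}",
    memory=memory,
)
```

With two slices and three streams, six coded slices reach memory in two write operations instead of six.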
In some embodiments, the writing the encoded slice into the memory comprises:
storing the coded slices obtained each time to a corresponding second cache;
after storing the coding slice corresponding to at least one target code stream, reading the coding slice corresponding to the target code stream from the second buffer, and writing the coding slice corresponding to the target code stream into the memory.
And after the original slices corresponding to the video frames are all encoded, executing the step of acquiring the original slices corresponding to the next video frame in the video to be encoded.
It should be added that one video frame to be encoded may produce one or more encoded output frames, the specific number of which is user-configurable. When one video frame to be encoded produces at least two encoded output frames, the encoding parameters of the at least two encoded output frames may or may not be identical.
As shown in fig. 3, an embodiment of the present application provides a video encoding apparatus, including:
an obtaining module 301, configured to obtain an original slice corresponding to a video frame in a video to be encoded stored in a memory, where one video frame includes at least one original slice;
the storage module 302 is configured to obtain N downsampling slices corresponding to the original slices after at least one original slice is obtained, and store the N downsampling slices to corresponding first caches respectively, where N is a positive integer greater than 1;
the encoding module 303 is configured to read the downsampled slices in the first buffer, encode the downsampled slices based on encoding parameters corresponding to each target code stream, and obtain encoding slices corresponding to each target code stream, where each target code stream is a code stream corresponding to an encoded video;
A writing module 304, configured to write the encoded slice into the memory.
As shown in fig. 4, an embodiment of the present application provides an electronic device, including: a processor 401, a memory 402, a transmitter 403, a downsampler 404, a target buffer 405 and an encoder 406;
The processor 401 is configured to process man-machine interaction to obtain a target code stream, and configure coding parameters corresponding to the target code stream;
the memory 402 is used for storing video to be encoded and encoded slices;
the transmitter 403 is configured to acquire an original slice corresponding to a video frame in the video to be encoded stored in the memory 402, and transmit the original slice to the downsampler 404, where one video frame includes at least one original slice;
the downsampler 404 is configured to obtain N downsampled slices corresponding to the original slice after obtaining at least one original slice, and store the N downsampled slices in the target buffer 405, where N is a positive integer greater than 1;
in one possible design, the electronic device further includes a splitter 407, where the transmitter 403 transmits the original slice to the splitter 407, and the splitter 407 is configured to send the original slice to N downsamplers synchronously, so that the N downsamplers downsample the original slice.
The target buffer 405 is configured to buffer the downsampled slice;
the encoder 406 is configured to read the downsampled slices in the target buffer 405, encode the downsampled slices in the target buffer 405 based on encoding parameters corresponding to each target code stream, and obtain encoding slices corresponding to each target code stream, where each target code stream is a code stream corresponding to an encoded video;
the target buffer 405 is further configured to buffer the encoded slice;
the transmitter 403 is further configured to write the encoded slice in the target buffer 405 to the memory 402.
When the downsampled slice and the coded slice are buffered by the same target buffer 405, the target buffer 405 includes a first logical partition and a second logical partition, the first logical partition is used for buffering the downsampled slice, and the second logical partition is used for buffering the coded slice;
when the number of target buffers 405 includes at least two and the downsampled slices and the encoded slices are buffered by different ones of the target buffers 405, the target buffers 405 include a first physical buffer for buffering the downsampled slices and a second physical buffer for buffering the encoded slices.
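The splitter and downsampler arrangement of the device described above can be pictured with the following sketch, in which one original slice is handed to N downsamplers. Plain decimation (keeping every k-th pixel) is used purely for illustration; the application does not specify the downsampling filter, and the function names are hypothetical.

```python
def decimate(slice_rows, factor):
    """Keep every `factor`-th pixel in both directions (no low-pass filtering)."""
    return [row[::factor] for row in slice_rows[::factor]]


def split_and_downsample(original_slice, factors):
    """Feed one original slice to len(factors) downsamplers.

    The slice is read from memory once; each downsampler gets the same data.
    """
    return [decimate(original_slice, factor) for factor in factors]


original_slice = [[row * 4 + col for col in range(4)] for row in range(4)]
downsampled = split_and_downsample(original_slice, factors=[1, 2])
```

Each element of `downsampled` would then be stored in its own region of the target buffer for the encoder to read.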
The embodiment of the application provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the video coding method provided by the embodiment of the application is realized when the processor executes the program.
The embodiment of the application provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the video encoding method provided by the embodiment of the application.
Some embodiments of the present application provide a video encoding method and apparatus, the method comprising: acquiring an original slice corresponding to a video frame in a video to be encoded stored in a memory, wherein one video frame comprises at least one original slice; after at least one original slice is obtained, obtaining N downsampling slices corresponding to the original slice through shunting and downsampling, and respectively storing the N downsampling slices into corresponding first caches, wherein N is a positive integer greater than 1; reading the downsampled slices in the first cache, and coding the downsampled slices based on coding parameters corresponding to each target code stream to obtain coding slices corresponding to each target code stream respectively, wherein the target code streams are code streams corresponding to coded videos; the encoded slice is written into the memory. When the source video is subjected to multi-channel video output coding, only one original video frame is required to be read from the memory and converted into a plurality of downsampled data, so that multi-channel video coding is finished, memory bandwidth consumption is reduced, and memory access efficiency of the memory is improved. Further, the embodiment of the application can start to split into a plurality of downsampled slices to start the multi-path video coding after at least one original slice is acquired, so that the waiting time required for reading the whole video frame is avoided, and the coding delay time is reduced.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments can be modified, or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present application.
The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.

Claims (13)

1. A video encoding method, comprising:
acquiring an original slice corresponding to a video frame in a video to be encoded stored in a memory, wherein one video frame comprises at least one original slice;
After at least one original slice is obtained, N downsampling slices corresponding to the original slice are obtained, the N downsampling slices are respectively stored in a corresponding first buffer, and N is a positive integer greater than 1;
reading the downsampled slices in the first cache, and encoding the downsampled slices based on encoding parameters corresponding to each target code stream to obtain encoding slices corresponding to each target code stream, wherein each target code stream is a code stream corresponding to an encoded video;
the encoded slice is written into the memory.
2. The method of claim 1, wherein the obtaining N downsampled slices corresponding to the original slice comprises:
and synchronously transmitting the original slice to N downsamplers, so that the N downsamplers perform downsampling processing on the original slice to obtain N downsampled slices.
3. The method according to claim 2, wherein the reading the downsampled slices in the first buffer and encoding the downsampled slices based on the encoding parameters corresponding to each target code stream comprises:
determining a target downsampled slice corresponding to each target code stream based on the resolution of each downsampled slice;
Reading the target downsampled slice from the first buffer;
and encoding the read target downsampled slice based on the encoding parameters respectively corresponding to the target code streams.
4. The method of claim 3, wherein the reading the target downsampled slice from the first buffer if the resolution requirements of the encoded video for different target code streams are the same comprises:
and reading the same downsampling slice from the first buffer in a time division multiplexing mode so as to finish the encoding of different target code streams aiming at the same downsampling slice.
5. The method according to claim 1, wherein the obtaining the original slice corresponding to the video frame in the video to be encoded stored in the memory includes:
reading pixel data in the video frame from the memory;
when the number of pixels for reading the pixel data reaches a preset threshold, determining that the pixel data read when the number reaches the preset threshold is the original slice, and the total number of pixels in the video frame is an integer multiple of the preset threshold.
6. The method of claim 5, further comprising, prior to the acquiring the original slice corresponding to the video frame in the video to be encoded stored in the memory:
Determining conversion coefficients of the resolution corresponding to each target code stream and the resolution of the original slice;
calculating a limit threshold according to the conversion coefficient and the available capacity of the first buffer, wherein the limit threshold is the maximum pixel number of the original slice supported by the first buffer;
and determining the preset threshold according to the limit threshold and the total number of pixels in the video frame.
7. The method of claim 1, wherein the writing the encoded slice into the memory comprises:
storing the coded slices obtained each time to a corresponding second cache;
for at least one of the original slices, after the encoding by the encoding parameters corresponding to each target code stream is completed, reading the encoding slices corresponding to each target code stream from the second buffer;
and writing the coding slices corresponding to the target code streams into the memory.
8. The method of claim 2, wherein:
if the first buffer is located in the target buffer, the total number of pixels of the N downsampled slices is not greater than the number of pixels that can be accommodated in the target buffer;
If the first buffer is located in a first logical partition of the target buffer, the total number of pixels of the N downsampled slices is not greater than the number of pixels that can be accommodated by the first logical partition.
9. A video encoding apparatus, comprising:
the acquisition module is used for acquiring an original slice corresponding to a video frame in the video to be encoded stored in the memory, wherein one video frame comprises at least one original slice;
the storage module is used for acquiring N downsampling slices corresponding to the original slices after at least one original slice is acquired, and storing the N downsampling slices into corresponding first caches respectively, wherein N is a positive integer greater than 1;
the coding module is used for reading the downsampling slices in the first buffer memory, coding the downsampling slices based on coding parameters corresponding to each target code stream, and obtaining coding slices corresponding to each target code stream respectively, wherein each target code stream is a code stream corresponding to a coded video;
and the writing module is used for writing the coded slice into the memory.
10. An electronic device, comprising: the device comprises a processor, a memory, a transmitter, a downsampler, a target buffer and an encoder;
The processor is used for processing man-machine interaction to obtain a target code stream and configuring coding parameters corresponding to the target code stream;
the memory is used for storing video to be coded and coded slices;
the transmitter is configured to acquire an original slice corresponding to a video frame in the video to be encoded stored in the memory, and transmit the original slice to the downsampler, where one video frame includes at least one original slice;
the downsampler is configured to obtain N downsampled slices corresponding to the original slice after at least one original slice is obtained, and store the N downsampled slices in the target buffer respectively, where N is a positive integer greater than 1;
the target buffer is used for buffering the downsampled slices;
the encoder is used for reading the downsampled slices in the target buffer, encoding the downsampled slices in the target buffer based on encoding parameters corresponding to each target code stream, and obtaining encoding slices corresponding to each target code stream respectively, wherein each target code stream is a code stream corresponding to an encoded video;
the target buffer is further used for buffering the coded slices;
The transmitter is further configured to write the encoded slice in the target buffer into the memory.
11. The electronic device of claim 10, wherein when the downsampled slice and the encoded slice are buffered by the same target buffer, the target buffer comprises a first logical partition for buffering the downsampled slice and a second logical partition for buffering the encoded slice;
when the number of target buffers includes at least two, and the downsampled slices and the encoded slices are buffered by different ones of the target buffers, the target buffers include a first physical buffer for buffering the downsampled slices and a second physical buffer for buffering the encoded slices.
12. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the video encoding method of any one of claims 1-8 when the program is executed.
13. A computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the video encoding method according to any of claims 1-8.
CN202310498132.0A 2023-05-06 2023-05-06 Video coding method and device Active CN116233453B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310498132.0A CN116233453B (en) 2023-05-06 2023-05-06 Video coding method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310498132.0A CN116233453B (en) 2023-05-06 2023-05-06 Video coding method and device

Publications (2)

Publication Number Publication Date
CN116233453A true CN116233453A (en) 2023-06-06
CN116233453B CN116233453B (en) 2023-07-14

Family

ID=86585861

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310498132.0A Active CN116233453B (en) 2023-05-06 2023-05-06 Video coding method and device

Country Status (1)

Country Link
CN (1) CN116233453B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116886932A (en) * 2023-09-07 2023-10-13 中移(杭州)信息技术有限公司 Video stream transmission method, device, terminal equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040032968A1 (en) * 2002-05-31 2004-02-19 Canon Kabushiki Kaisha Embedding a multi-resolution compressed thumbnail image in a compressed image file
US20140169467A1 (en) * 2012-12-14 2014-06-19 Ce Wang Video coding including shared motion estimation between multple independent coding streams
CN104506870A (en) * 2014-11-28 2015-04-08 北京奇艺世纪科技有限公司 Video coding processing method and device suitable for multiple code streams
CN105657426A (en) * 2016-01-08 2016-06-08 全时云商务服务股份有限公司 Video encoding system and method
CN115734004A (en) * 2021-08-27 2023-03-03 西安诺瓦星云科技股份有限公司 Video processing method, device, system and equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116886932A (en) * 2023-09-07 2023-10-13 中移(杭州)信息技术有限公司 Video stream transmission method, device, terminal equipment and storage medium
CN116886932B (en) * 2023-09-07 2023-12-26 中移(杭州)信息技术有限公司 Video stream transmission method, device, terminal equipment and storage medium

Also Published As

Publication number Publication date
CN116233453B (en) 2023-07-14

Similar Documents

Publication Publication Date Title
US8395634B2 (en) Method and apparatus for processing information
KR100772379B1 (en) External memory device, method for storing image date thereof, apparatus for processing image using the same
US9509992B2 (en) Video image compression/decompression device
WO2017133315A1 (en) Lossless compression method and system appled to video hard decoding
US8577165B2 (en) Method and apparatus for bandwidth-reduced image encoding and decoding
JPWO2006013690A1 (en) Image decoding device
WO2009142003A1 (en) Image coding device and image coding method
WO2017087052A1 (en) Method and system of reference frame caching for video coding
KR20130070574A (en) Video transmission system having reduced memory requirements
CN116233453B (en) Video coding method and device
JP5496047B2 (en) Image reproduction method, image reproduction apparatus, image reproduction program, imaging system, and reproduction system
JP2012085001A5 (en)
KR101611408B1 (en) Method and apparatus for bandwidth-reduced image encoding and image decoding
WO2023193701A1 (en) Image coding method and apparatus
JP2002112268A (en) Compressed image data decoding apparatus
WO2022206217A1 (en) Method and apparatus for performing image processing in video encoder, and medium and system
JP2950367B2 (en) Data output order conversion method and circuit in inverse discrete cosine converter
KR100891116B1 (en) Apparatus and method for bandwidth aware motion compensation
KR102267215B1 (en) Embedded codec (ebc) circuitry for position dependent entropy coding of residual level data
KR102171119B1 (en) Enhanced data processing apparatus using multiple-block based pipeline and operation method thereof
CN100576917C (en) The method and system of inversely scanning frequency efficiency
CN114339249B (en) Video decoding method, readable medium and electronic device thereof
WO2022206166A1 (en) Method and device for performing image processing in a video encoding device, and system
JP4214554B2 (en) Video decoding device
JP2009272948A (en) Moving image decoding apparatus and moving image decoding method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant