WO2022206166A1 - Method and device for performing image processing in a video encoding device, and system


Info

Publication number
WO2022206166A1
Authority
WO
WIPO (PCT)
Prior art keywords
memory
block
image data
read
encoded
Prior art date
Application number
PCT/CN2022/074533
Other languages
French (fr)
Chinese (zh)
Inventor
赵娟萍
Original Assignee
Oppo广东移动通信有限公司
Priority date
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2022206166A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/156 Availability of hardware or computational resources, e.g. encoding based on power-saving criteria
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation, characterised by memory arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/43 Hardware specially adapted for motion estimation or compensation
    • H04N19/433 Hardware specially adapted for motion estimation or compensation characterised by techniques for memory access

Definitions

  • the present application belongs to the technical field of electronic devices, and in particular, relates to a method, device, storage medium, electronic device and system for image processing in a video encoding device.
  • the video encoding apparatus may encode video images.
  • it is usually necessary to read the image data of multiple frames of encoded video images.
  • the power consumption of the video encoding device is relatively large.
  • Embodiments of the present application provide a method, device, storage medium, electronic device, and system for performing image processing in a video encoding device, which can reduce power consumption of the video encoding device.
  • an embodiment of the present application provides a method for image processing in a video encoding device, the method comprising:
  • the to-be-encoded block is encoded according to the relative relationship between the matched block and the to-be-encoded block.
  • an embodiment of the present application provides an apparatus for performing image processing in a video encoding apparatus, the apparatus comprising:
  • a first determining module configured to determine the block to be encoded from the current frame image
  • a second determining module configured to determine, from the reconstructed frame images of the historical frame images, a first area that needs to be read repeatedly, and to store the image data of the first area in a preset memory, wherein the power consumption of the preset memory is less than a preset power consumption threshold;
  • a reading module for reading the image data of the first area from the preset memory
  • a third determining module configured to determine a matching block matching the block to be encoded from the first region according to the read image data of the first region
  • An encoding module configured to encode the to-be-encoded block according to the relative relationship between the matched block and the to-be-encoded block.
  • embodiments of the present application provide a storage medium on which a computer program is stored, and when the computer program is executed on a computer, it causes the computer to execute the method for image processing in a video encoding device provided by the embodiments of the present application.
  • the embodiments of the present application further provide an electronic device, including a memory, a processor, and a video encoding apparatus.
  • the processor is configured to execute, by invoking the computer program stored in the memory, the method for image processing in a video encoding device provided in the embodiments of the present application.
  • an embodiment of the present application further provides an image processing system, including a video encoding device, a first memory, and a second memory, wherein the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory.
  • the video encoding device includes a third memory, and the read speed of the third memory is greater than a second preset multiple of the read speed of the first memory; the first memory and the second memory each store image data that is read repeatedly from the reconstructed frame images of the historical frame images, and during encoding the video encoding device reads the repeatedly read data from the first memory and the second memory according to a preset number of times.
  • the image data in the search window is read from the memory, and a matching block matching the block to be coded is determined, and coding is performed according to the motion vector and residual of the matching block and the block to be coded.
  • FIG. 1 is a first schematic flowchart of a method for performing image processing in a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 2 is a schematic structural diagram of a video compression system in the related art.
  • FIG. 3 is a schematic diagram of data storage in a video encoding apparatus in the related art.
  • FIG. 4 is a schematic diagram of increasing the number of channels of a dynamic random access memory (DRAM) for data access in the related art.
  • FIG. 5 is a schematic diagram of a comparison of square blocks provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a hierarchical search provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a non-hierarchical search provided by an embodiment of the present application.
  • FIG. 8 is a second schematic flowchart of a method for performing image processing in a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a scene of searching in a reconstructed frame image of a historical frame image provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram illustrating a comparison of the energy consumed when reading data from a static random-access memory (SRAM) and from a dynamic random-access memory, provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a video compression system using a system cache (Sys$) provided by an embodiment of the present application.
  • FIG. 12 is another schematic structural diagram of a video compression system using a system cache provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a video compression system using a system buffer memory (SysBuf) provided by an embodiment of the present application.
  • FIG. 14 is a schematic diagram of a scene when the reconstructed frame image of the historical frame image provided by the embodiment of the present application is moved down by one block line.
  • FIG. 15 is a schematic diagram of a power consumption curve when reading and writing data from a multi-channel DRAM according to an embodiment of the present application.
  • FIG. 16 is a schematic diagram of a power consumption curve when reading and writing data from Sys$ or SysBuf and DRAM, respectively, according to an embodiment of the present application.
  • FIG. 17 is a schematic diagram of a scene of a search range of a search window in a reconstructed frame image of a historical frame image provided by an embodiment of the present application.
  • FIG. 18 is a schematic diagram of a scene encoded by a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 19 is a third schematic flowchart of a method for performing image processing in a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 20 is a schematic structural diagram of an apparatus for performing image processing in a video encoding apparatus provided by an embodiment of the present application.
  • FIG. 21 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 22 is another schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 23 is a schematic structural diagram of an image processing system provided by an embodiment of the present application.
  • FIG. 24 is another schematic structural diagram of an image processing system provided by an embodiment of the present application.
  • An embodiment of the present application provides a method for image processing in a video encoding device, the method comprising:
  • the to-be-encoded block is encoded according to the relative relationship between the matched block and the to-be-encoded block.
  • the preset memory includes a first memory and a second memory
  • the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory
  • the first area includes a plurality of block rows, and the determining, from the reconstructed frame image of the historical frame image, of the first area that needs to be read repeatedly and the storing of the image data of the first area in the preset memory include:
  • the reading of the image data of the first region from the preset memory includes:
  • the image data of unread block lines in the first region is read block line by block line from the second memory.
  • the reading the image data of the first region from the second memory and storing the image data in the first memory includes:
  • the image data of the moved block line is read from the second memory and stored in the first memory;
  • the determining a matching block matching the block to be encoded from the first region according to the read image data of the first region includes:
  • the image data of the search window is read from the third memory, and according to the image data of the search window, the block with the least encoding cost of the block to be encoded is determined from the search window;
  • the block with the smallest coding cost of the block to be coded is used as the matching block.
  • the reading of the image data of the search window from the third memory and the determining, from the search window according to the image data of the search window, of the block with the smallest encoding cost relative to the block to be encoded include:
  • the reduced block with the smallest encoding cost of the block to be encoded is determined from the reduced search window;
  • a block with the smallest encoding cost to the block to be encoded is determined from the search window.
  • the relative relationship is a motion vector and a residual
  • the encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded includes:
  • the to-be-encoded block is encoded according to the motion vector and residual of the matched block and the to-be-encoded block.
  • the encoding the block to be encoded according to the motion vector and the residual of the matching block and the block to be encoded includes:
  • Entropy coding is performed on the motion vector of the matching block and the block to be coded and the forward transformed and quantized first residual data to obtain coded video stream data;
  • the to-be-coded block is reconstructed according to the second residual data.
  • the preset memory includes a first memory and a second memory
  • the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory
  • the first area includes a plurality of block rows, and the determining, from the reconstructed frame image of the historical frame image, of the first area that needs to be read repeatedly and the storing of the image data of the first area in the preset memory include:
  • the reading of the image data of the first region from the preset memory includes:
  • the image data of unread block lines in the plurality of first regions is read block line by block line from the second memory.
  • the determining a matching block matching the block to be encoded from the first region according to the read image data of the first region includes:
  • the image data of the plurality of search windows are read from the third memory, and according to the image data of the plurality of search windows, one or more blocks with the smallest coding cost relative to the block to be coded are respectively determined from the plurality of search windows;
  • the matching block is determined from a plurality of blocks with the smallest coding cost of the block to be coded.
  • the reading of the image data of the plurality of search windows from the third memory and the respective determining, from the plurality of search windows according to their image data, of one or more blocks with the smallest coding cost relative to the block to be coded include:
  • according to the image data of the plurality of reduced search windows, one or more reduced blocks with the smallest coding cost relative to the to-be-coded block are respectively determined from the plurality of reduced search windows;
  • one or more blocks with the smallest coding cost relative to the block to be coded are respectively determined from the plurality of search windows.
  • the first memory includes a system cache or system buffer memory disposed external to the video encoding device
  • the second memory includes dynamic random access memory disposed external to the video encoding device.
  • the third memory includes a cache or buffer provided inside the video encoding device.
  • FIG. 1 is a first schematic flowchart of a method for performing image processing in a video encoding apparatus provided by an embodiment of the present application.
  • the method for image processing in a video encoding device can be applied to a video encoding device.
  • the flow of the method for image processing in a video encoding device may include:
  • the video encoding apparatus may encode video images.
  • it is usually necessary to read the image data of multiple frames of encoded video images.
  • the power consumption of the video encoding device is relatively large.
  • FIG. 2 is a schematic structural diagram of a video compression system in the related art.
  • a central processing unit (CPU), a video encoding device, an image signal processor (ISP) and a neural network processing unit (NPU) read and write data from the DRAM through the bus and a dynamic random access memory controller (DRAMC). The central processing unit, the video encoding device, the image processor and the neural network processor share the bandwidth in a time-division manner.
  • the priorities of the central processing unit, the image processor and the neural network processor are higher than that of the video encoding device.
  • the video encoding apparatus needs to perform a search operation when encoding, which will occupy a large bandwidth.
  • FIG. 3 is a schematic diagram of data storage in a video encoding apparatus in the related art.
  • the current frame (Current Frame) image, the reference frame (Reference Frame) image, the reconstructed frame (Reconstructed Frame) image, the bit stream (Bitstreams) and the temporary data (Temporary data) are all stored in the DRAM in the video encoding device.
  • DRAM offers less bandwidth.
  • the temporary data may be Temporal Motion Vector (TMV), scaled frames, or other data.
  • High Efficiency Video Coding (H.265/HEVC)
  • Versatile Video Coding (H.266/VVC)
  • AOMedia Video 1 (AV1), the first-generation video coding standard of the Alliance for Open Media
  • Essential Video Coding (MPEG-5/EVC)
  • the usual approach is to increase the bandwidth of the DRAM or increase the frequency of the DRAM to raise data throughput.
  • FIG. 4 is a schematic diagram of increasing the number of DRAM channels for data access in the related art.
  • the bandwidth and frequency can be increased to increase the data throughput speed of the DRAM, but it will cause greater power consumption.
  • the bandwidth of the system DRAM consumes a large amount of energy. However, whether the video encoding device performs real-time or non-real-time operations, it is very important to maintain the highest efficiency.
  • if the video encoding apparatus is to complete the encoding within the expected time, it will cause great power consumption of the DRAM.
  • Video coding devices generally use a block (which can be considered as a pixel block) as a basic unit, and the block can be a rectangle, a square, or a trapezoid, or a triangle pieced together.
  • a block-based comparison algorithm appears.
  • FIG. 5 is a schematic diagram of a comparison of square blocks provided by an embodiment of the present application.
  • the block to be compressed in the current frame image is compared with the block of the reference frame image in the form of a square block, and the reference frame image is the reconstructed image of the historical frame image, that is, the encoded image of the historical frame image.
  • the block to be compressed and the block of the reference frame image are N×N blocks, where N is an integer greater than or equal to 4.
  • each block is N×N pixels in size, for example 4×4, 32×32 or 128×128, where 4×4, 32×32 and 128×128 refer to numbers of pixels.
  • For each block to be coded, the area surrounding the same position in the reconstructed frame image of the historical frame image is searched for the block that best matches it, that is, the matching block.
  • the displacement of the matching block relative to the block to be coded is called the motion vector (MV).
  • the block to be encoded is determined from the current frame image, where the block to be encoded is a block to be compressed in the current frame image, that is, a block to be encoded in the current frame image.
  • the block to be encoded may be a block of N×N size.
  • When coding the block to be coded it is usually necessary to compare it with the block in the reference frame image, so it is necessary to search for the block to be compared in the reference frame image.
  • the reference frame image is a reconstructed frame image of the historical frame image, that is, an encoded image of the historical frame image.
  • the block can be compared with the block to be encoded, and the reconstructed frame image of the historical frame image may be a plurality of historical frame images.
  • the encoding cost of the reconstructed frame image is the least. Before searching for a block, the area to be searched needs to be known in advance. Therefore, when searching for blocks in the reconstructed frame image of the historical frame image, it is necessary to know the search range (SRng) in that reconstructed frame image, that is, to determine the first area that needs to be read repeatedly. When searching for blocks in the first area, the data of the first area is read repeatedly many times.
  • the image data of the first area is stored in the preset memory, so that the image data of the first area can conveniently be read from the preset memory during subsequent searches.
  • the power consumption of the preset memory is less than the preset power consumption threshold.
  • the image data of the first area is stored in the preset memory
  • a matching block matching the block to be encoded is determined from the first region according to the read image data of the first region.
  • the search for the first area can be realized.
  • each block in the first area is respectively compared with the block to be encoded in the current frame image.
  • the block that best matches the block to be coded is found from the first area, and the best matching block is the matching block.
  • common block matching algorithms can use hierarchical search or non-hierarchical search; after searching, the motion vector and the pixel value residuals are obtained for subsequent further compression coding.
  • the pixel residual amount can be obtained by subtracting the predicted value from the actual value of the pixel.
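The residual computation described above can be sketched as follows. This is only a minimal illustration, assuming blocks are represented as nested lists of integer pixel values; the function names `residual_block` and `reconstruct_block` are hypothetical and do not come from the application.

```python
def residual_block(actual, predicted):
    """Per-pixel residual: actual pixel value minus predicted pixel value."""
    return [
        [a - p for a, p in zip(row_actual, row_pred)]
        for row_actual, row_pred in zip(actual, predicted)
    ]


def reconstruct_block(predicted, residual):
    """Inverse operation: predicted value plus residual recovers the actual value."""
    return [
        [p + r for p, r in zip(row_pred, row_res)]
        for row_pred, row_res in zip(predicted, residual)
    ]


# Hypothetical 2x2 example: the block to be encoded and its matching block.
actual = [[10, 12], [8, 7]]
predicted = [[9, 12], [10, 7]]
res = residual_block(actual, predicted)
```

Without quantization the residual is lossless, so adding it back to the predicted block recovers the original block exactly.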
  • FIG. 6 is a schematic diagram of a hierarchical search provided by an embodiment of the present application.
  • Hierarchical search reduces the block to be searched and the area to be searched by the same ratio, such as 1/2, 1/4 or 1/8.
  • The approximate position of the block is first determined on the reduced image (that is, the reduced area to be searched), and a finer block search is then performed on the unreduced image.
  • the reduction ratio of each layer may be the same or different, for example, the reduction ratio of each layer may be 1/2, 1/4, 1/8 and 1/16.
  • Figure 6 is an example of motion search in 3 levels. First, the 1/4 reduced image is searched. Then, the motion vector obtained from the 1/4 reduced image is used as the starting point for a finer search over a smaller range in the 1/2 reduced image. After that, the motion vector obtained from the 1/2 reduced image is used to search the original-size image to obtain the final motion vector.
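The three-level search of Figure 6 might be sketched as below. This is a simplified illustration under several assumptions that are not in the application: frames are nested lists, reduction is done by plain decimation (keeping every k-th pixel), the cost is the sum of absolute differences, and the names `downsample`, `search_around`, `hierarchical_search`, `coarse_radius` and `refine_radius` are hypothetical.

```python
def downsample(frame, factor):
    """Reduce a frame by keeping every `factor`-th pixel (simple decimation)."""
    return [row[::factor] for row in frame[::factor]]


def sad(cur, cy, cx, ref, ry, rx, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )


def search_around(cur, ref, by, bx, n, center, radius):
    """Best ((dy, dx), cost) within `radius` of the `center` motion vector."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            ry, rx = by + dy, bx + dx
            if 0 <= ry <= height - n and 0 <= rx <= width - n:
                cost = sad(cur, by, bx, ref, ry, rx, n)
                if best is None or cost < best[1]:
                    best = ((dy, dx), cost)
    if best is None:
        raise ValueError("no candidate block lies inside the reference frame")
    return best


def hierarchical_search(cur, ref, by, bx, n, coarse_radius=2, refine_radius=1):
    """Search at 1/4 scale, refine at 1/2 scale, then refine at full size."""
    mv, cost = (0, 0), None
    for factor in (4, 2, 1):
        radius = coarse_radius if factor == 4 else refine_radius
        mv, cost = search_around(
            downsample(cur, factor), downsample(ref, factor),
            by // factor, bx // factor, max(n // factor, 1), mv, radius,
        )
        if factor > 1:
            mv = (mv[0] * 2, mv[1] * 2)  # scale the vector up for the next level
    return mv, cost
```

The coarse level covers a wide area cheaply, and each finer level only examines a small neighbourhood around the scaled-up vector, which is why fewer pixels need to be read than in a full search over the same range.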
  • FIG. 7 is a schematic diagram of a non-hierarchical search provided by an embodiment of the present application.
  • Non-hierarchical search refers to performing block comparison tasks directly in unreduced images.
  • Common methods include full search and n-step search.
  • Figure 7 shows a motion search performed directly on the original-size image: within the search window in the reconstructed frame image of the previous frame image, the full search method is used to find which block in that reconstructed frame image has the smallest coding cost relative to the current block.
  • the minimum coding cost may take various forms, for example, the minimum coding cost may be the minimum sum of absolute values of residuals of each pixel of a certain block to be searched and the current block.
  • p in Fig. 7 is the horizontal search range.
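A full search using the sum of absolute residuals as the coding cost could look like the following sketch. It assumes frames are nested lists of pixel values and, as a simplification not stated in the application, uses the same range `p` vertically as horizontally; the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, cy, cx, ref, ry, rx, n):
    """Sum of absolute per-pixel residuals between two n x n blocks."""
    return sum(
        abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(cur, ref, by, bx, n, p):
    """Test every displacement within +/-p and return ((dy, dx), cost).

    (by, bx) is the top-left corner of the current block in `cur`;
    candidates that fall outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            ry, rx = by + dy, bx + dx
            if ry < 0 or rx < 0 or ry + n > height or rx + n > width:
                continue
            cost = sad(cur, by, bx, ref, ry, rx, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

Every candidate position reads an n×n block from the search window, so the same reference pixels are fetched many times; this is the repeated reading that motivates keeping the window in fast, low-power memory.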
  • motion estimation refers to block-based motion estimation.
  • the basic idea is to divide each frame of the image sequence into many non-overlapping blocks, and consider that the displacements of all pixels in the block are the same, and then calculate the value of each block.
  • the block most similar to the current block is found according to certain block matching criteria, that is, the matching block, and the relative displacement between the matching block and the current block is the motion vector.
  • Motion estimation searches for reconstructed pixels encoded at different previous time points, that is, reconstructed pixels in reconstructed frame images of historical frame images.
  • a preset number of historical frame images can be randomly selected from the historical frame images, and the reconstructed frame images of the selected historical frame images can be searched to obtain the displacement between blocks, after which the optimal motion vector is selected as the final search result. It can be understood that the position of the matching block in the reconstructed frame image of the historical frame image can be determined according to the motion vector.
  • the relative relationship includes the relative displacement relationship and the relative error relationship between the matching block and the block to be encoded; for example, the two-dimensional pixels at the corresponding positions of the matching block are subtracted from the two-dimensional pixels of the block to be encoded to obtain the relative error between the matching block and the block to be encoded
  • the to-be-encoded block can be encoded according to the relative displacement relation and the relative error relation between the matching block and the to-be-encoded block.
  • the video encoding apparatus may determine the block to be encoded from the current frame image, determine the first area that needs to be read repeatedly from the reconstructed frame image of the historical frame image, and store the image data of the first area in a preset memory, where the power consumption of the preset memory is less than a preset power consumption threshold. Then, the image data of the first area is read from the preset memory, and a matching block matching the block to be encoded is determined from the first area according to the read image data of the first area. Then, according to the relative relationship between the matching block and the block to be encoded, the block to be encoded is encoded.
  • the image data of the first region is stored in a preset memory with low power consumption, so as to achieve the purpose of reducing the power consumption of the video encoding apparatus. Therefore, the embodiments of the present application can reduce the power consumption of the video encoding apparatus.
  • FIG. 8 is a second schematic flowchart of a method for performing image processing in a video encoding apparatus provided by an embodiment of the present application.
  • the method for image processing in a video encoding device can be applied to a video encoding device.
  • the flow of the method for image processing in a video encoding device may include:
  • each frame of image can be divided into multiple block lines, and each block line can be divided into multiple blocks.
  • the block row to be encoded needs to be determined from the current frame image.
  • the block row to be encoded refers to the block row where the block to be encoded is located.
  • the block lines located before the block line to be encoded in the current frame image are all encoded block lines.
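The division of a frame into block lines and blocks can be illustrated with a short sketch. It assumes, for simplicity, that the frame height and width are multiples of the block size N; the function name `split_into_block_lines` is hypothetical.

```python
def split_into_block_lines(height, width, n):
    """Return the block lines of a frame.

    Each block line is a list of (top, left) origins of the n x n blocks
    arranged in that row. Assumes height and width are multiples of n.
    """
    return [
        [(top, left) for left in range(0, width, n)]
        for top in range(0, height, n)
    ]


# Hypothetical 8x12 frame with 4x4 blocks: 2 block lines of 3 blocks each.
block_lines = split_into_block_lines(8, 12, 4)
encoded_lines = block_lines[:1]   # block lines before the one being encoded
line_to_encode = block_lines[1]   # block line containing the block to be encoded
```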
  • FIG. 9 is a schematic diagram of a scene of searching in a reconstructed frame image of a historical frame image provided by an embodiment of the present application. It can be seen from FIG. 9 that the block to be encoded is located within the range of the search window in the vertical projection direction.
  • the first area may include a plurality of block rows.
  • After the block to be encoded is determined, it is necessary to determine a plurality of block lines that need to be read repeatedly from the reconstructed frame image of the historical frame image (which can be considered as the historical frame image with the strongest correlation with the current frame image). The multiple block lines that need to be read repeatedly are the block lines located in the first area.
  • each block row includes a plurality of blocks, and the plurality of blocks are arranged in a row.
  • the preset memory includes a first memory and a second memory. It should be noted that the reconstructed frame image of the historical frame image can be stored in the second memory in advance, and the first area that needs to be read repeatedly is then determined from the reconstructed frame image of the historical frame image stored in the second memory.
  • the image data of the first area can be read from the second memory.
  • the read image data of the first area is stored in the first memory, where it waits to be read when the video encoding apparatus performs encoding.
  • the power consumption of the second memory is greater than the first preset multiple of the power consumption of the first memory, and the sum of the power consumption of the first memory and the power consumption of the second memory is less than the preset power consumption threshold, which can reduce the power consumption when reading and writing data.
  • the preset power consumption threshold may be considered as the power consumption when all the image data in the first area is read and written by the second memory.
  • both the first memory and the second memory are memory external to the video encoding device.
  • the first memory may include a system cache (Sys$) or a system buffer memory (SysBuf) provided outside the video encoding device, and the second memory may include a dynamic random access memory (DRAM) disposed outside the video encoding device.
  • the first memory can also be other low-power memory, etc.
  • Sys$ or SysBuf is composed of multiple SRAMs, and the second memory can be DRAM. The power consumption of the DRAM is greater than the first preset multiple of the power consumption of the Sys$ or SysBuf outside the video encoding device, and the sum of the power consumption of the Sys$ or SysBuf outside the video encoding device and the power consumption of the DRAM is less than the preset power consumption threshold, which can reduce the power consumption when reading and writing data.
  • the preset power consumption threshold can be considered as the power consumption when all the image data in the first area is read and written by the DRAM.
  • FIG. 10 is a schematic diagram illustrating a comparison of the energy consumed when reading data in the static random access memory and the dynamic random access memory provided by the embodiment of the present application.
  • the difference in energy consumption between reading data in SRAM and reading data in DRAM is about 100 times, that is, the power consumption of reading data in SRAM is far less than the power consumption of reading data in DRAM.
  • the motion estimation step of the video encoding device requires a large bandwidth provided by the DRAM, because during the search process, certain associated regions (i.e., the first area) in the reconstructed frame images of the historical frame images are read for block search and comparison. Due to cost considerations, the block lines covered by the search range (that is, the first area) are usually not completely stored in the video encoding device; usually only the portion required within the search range (such as within the search window) is stored there, to meet the high-speed data access requirements during motion estimation.
  • the cache or buffer includes multiple SRAMs. If the image data of the first area were stored inside the hardware of the video encoding device, the SRAM inside the video encoding device would need to be divided into more units (each unit is a bank), which leads to a larger area for a single bank. As the area of a single bank becomes larger, the area of the SRAM also becomes larger while its storage capacity remains unchanged, resulting in higher costs.
  • an 8-bit (bit) luminance (luma) portion requires at least 1 megabyte (MB) of storage space.
  • the SRAM needs to be divided into more units to meet the data entry and exit requirements, resulting in a larger area of the SRAM.
  • in a motion estimation design that stores data in the form of a search window, compressing one block line requires reading multiple block lines of data from the reconstructed frame image of the historical frame image, which means that processing one frame of data requires reading multiple frames' worth of data.
  • the image data of the search window is stored in a cache or buffer inside the video encoding device.
  • the cache or buffer includes finely divided SRAM groups.
  • the finer division means that the area occupied by the same storage unit becomes larger. For example, the average area per bit in a finely divided bank is larger than that in an undivided SRAM, in exchange for providing sufficient data bandwidth to the motion estimation circuit. This not only results in a large SRAM area, but also makes layout and routing more difficult because the SRAM is divided into many banks. Therefore, the block rows of the entire first area are not stored using this method, which has a high area per storage unit (e.g., per bit).
  • the vertical search range will be multiple times the height of the block to be encoded, which results in that the bandwidth for reading the image data in the first area will be multiple times the bandwidth for writing the image data in the first area. And this situation is even more serious when the picture to be encoded reaches 4K or 8K.
  • the resolution of 4K pictures is 3840×2160 pixels, and the resolution of 8K pictures is 7680×4320 pixels.
  • the vertical search range must be larger than the 1080P resolution, otherwise the degree of picture compression will be greatly reduced.
  • the image data of the first region is read from the second memory and stored in the first memory, which may include:
  • the image data of the moved block line is read from the second memory and stored in the first memory;
  • FIG. 11 is a schematic structural diagram of a video compression system using a system cache provided by an embodiment of the present application.
  • FIG. 12 is another schematic structural diagram of a video compression system using a system cache provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a video compression system using a system buffer memory provided by an embodiment of the present application.
  • the image data of the first area is stored in Sys$ or SysBuf
  • the image data of the search window is stored in the cache or buffer inside the video encoding device.
  • n in Fig. 11 to Fig. 13 is a number indicating the size of the storage capacity.
  • the DRAM reads and writes data at a speed of 0.5GB/s to 2GB/s
  • the Sys$ or SysBuf reads and writes data at a speed of 3GB/s to 8GB/s
  • the cache or buffer reads and writes data at a speed of 10GB/s to 50GB/s.
  • the read/write speeds of the DRAM, the Sys$ or SysBuf, and the cache or buffer can also be other values, provided that the read/write speed of the cache or buffer is greater than both the read/write speed of the Sys$ or SysBuf and the read/write speed of the DRAM, and the read/write speed of the Sys$ or SysBuf is greater than the read/write speed of the DRAM.
  • Sys$ can read data from DRAM through DramC, and the data read by Sys$ from DRAM through DramC can be read by central processing unit, video encoding device, image processor and neural network processor.
  • the first area moves down a block line in the reconstructed frame image of the historical frame image
  • both Sys$ and DRAM store the new block line
  • Sys$ removes the unused block lines at the same time.
  • when the video encoding device needs to perform encoding, it can directly read the data in the first area stored in Sys$.
  • the image data in the first area is also read from DRAM through DramC, and then read by the video encoding device.
  • when storing, the Sys$ or SysBuf outside the video encoding device removes the block lines that will not be used for the next line to be encoded, so that the number of times the image data of the first area is read from the DRAM on behalf of the cache or buffer inside the video encoding device is reduced from multiple times to one. Because a DRAM access consumes about 100 times more energy than an SRAM access, this can greatly reduce power consumption.
  • the position at which the reconstructed frame image of the historical frame image is read (i.e., the first area) and the reading behavior (repeated reading) are both predictable
  • reading the reconstructed frame image of the historical frame image will be the same as reading the current frame image.
  • for the reconstructed frame image of the historical frame image compressed by the video encoding device, it can be determined how many relevant block rows of that reconstructed frame image are to be stored in such a low-power storage space. Whenever the block line to be encoded moves down one line, the uppermost block line stored in Sys$ or SysBuf is removed, the block line newly added to the first area by the move is read, and its image data is stored.
  • FIG. 14 is a schematic diagram of a scene when the reconstructed frame image of the historical frame image provided by the embodiment of the present application is moved down by one block line.
  • when the video encoding device encodes one block line downward, it evicts the unrelated block line originally stored in the upper part of Sys$ or SysBuf, and then sends the image data of the newly needed block line into Sys$ or SysBuf. That is, as the first area covered by the search range moves down with the block to be encoded, the area no longer used is evicted from Sys$ or SysBuf, and the block line that will be used is stored in Sys$ or SysBuf.
  • when the video encoding device encodes one block row downward, it removes the irrelevant block row above the first area stored in Sys$ or SysBuf, and then stores the block row newly required during encoding in the Sys$ or SysBuf outside the video encoding device.
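  • The eviction-and-refill behavior described above can be sketched as a sliding window of block rows. This is a minimal illustration, not the patented implementation: the function name `simulate_block_row_cache`, the use of a double-ended queue, and the example sizes (135 block rows, a 9-row first area) are assumptions chosen for illustration.

```python
from collections import deque

def simulate_block_row_cache(total_rows: int, window_rows: int):
    """Slide a window of block rows down a reference frame, one row per step.

    Returns (rows fetched from DRAM, rows evicted from the low-power memory).
    Hypothetical model: every block row enters the low-power memory once.
    """
    cache = deque()      # block-row indices currently held in Sys$ or SysBuf
    dram_fetches = 0
    evictions = 0
    for new_row in range(total_rows):
        if len(cache) == window_rows:
            cache.popleft()        # evict the uppermost row, no longer needed
            evictions += 1
        cache.append(new_row)      # the newly required row is read from DRAM once
        dram_fetches += 1
    return dram_fetches, evictions

# Example: 2160 pixel rows / 16-pixel blocks = 135 block rows, 9-row first area.
fetches, evicted = simulate_block_row_cache(total_rows=135, window_rows=9)
```

  • In this simplified model each block row is fetched from DRAM exactly once, and every later repeated read of that row is served from the low-power memory.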
  • the image data of the first region is read block by line from the first memory.
  • the image data of the first region may be read block row by block row from the first memory, for example from Sys$ or SysBuf; that is, the block rows are read in order from top to bottom.
  • the size of the macroblock (Macro block) in Figure 14 is 16×16 pixels, that is, 16 pixels in the horizontal direction by 16 pixels in the vertical direction; the size of the macroblock may also be 32×32 pixels, 64×64 pixels, or the like.
  • the macroblock is the block to be encoded in the current frame image.
  • the vertical search range is ±64.
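  • As a worked example of how these numbers relate, the sketch below counts the block rows spanned by the vertical search range. The function name `block_rows_in_first_area` is hypothetical, and the sketch assumes the search range is measured in pixels above and below the block to be encoded.

```python
def block_rows_in_first_area(block_size: int, search_up: int, search_down: int) -> int:
    """Number of block rows covered vertically, including the row of the block itself.

    Assumes search_up and search_down are pixel distances above and below the block.
    """
    total_height = search_up + block_size + search_down
    return -(-total_height // block_size)  # ceiling division

# 16x16 macroblock with a vertical search range of 64 pixels up and 64 pixels down.
rows = block_rows_in_first_area(block_size=16, search_up=64, search_down=64)
```

  • With these values the first area spans 9 block rows, which agrees with the 9 reads discussed in the text.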
  • the total number of reads originally made from the second memory can be divided into a number of reads from the first memory and the remaining reads from the second memory. If the minimum requirement of Sys$ or SysBuf is met, the 9 reads of data from DRAM are split into 1 read from DRAM and 8 reads from Sys$ or SysBuf.
  • the 9 reads of data from DRAM can also be split into 2 reads from DRAM and 7 reads from Sys$ or SysBuf, and so on.
  • under demanding power consumption conditions, all 9 reads of data from DRAM can instead be made from Sys$ or SysBuf. In this case the power consumption is the lowest, but the cost increases.
  • the original number of readings from DRAM can be split into several readings from SRAM, and the other several readings from DRAM, which can reduce the power consumption of reading data as a whole. And the number of reads from the SRAM and the number of reads from the DRAM can be adjusted to meet the needs of different power consumption.
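  • The effect of splitting the reads can be estimated with a simple energy model. The sketch below is an assumption-laden illustration: the energy units are arbitrary, only the roughly 100-fold ratio between DRAM and SRAM reads comes from the text, and the energy of writing data into Sys$ or SysBuf is ignored.

```python
SRAM_READ = 1.0     # hypothetical energy unit for one read from Sys$ or SysBuf
DRAM_READ = 100.0   # about 100 times the SRAM read energy, per the text

def total_read_energy(total_reads: int, dram_reads: int) -> float:
    """Energy of total_reads reads when dram_reads of them go to DRAM."""
    if not 0 <= dram_reads <= total_reads:
        raise ValueError("dram_reads must be between 0 and total_reads")
    sram_reads = total_reads - dram_reads
    return dram_reads * DRAM_READ + sram_reads * SRAM_READ

all_dram = total_read_energy(9, 9)   # all 9 reads from DRAM
one_dram = total_read_energy(9, 1)   # 1 read from DRAM, 8 from Sys$ or SysBuf
two_dram = total_read_energy(9, 2)   # 2 reads from DRAM, 7 from Sys$ or SysBuf
```

  • Under these assumed units, moving 8 of the 9 reads to SRAM lowers the read energy from 900 to 108 units.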
  • FIG. 15 is a schematic diagram of a power consumption curve when reading and writing data from a multi-channel DRAM according to an embodiment of the present application.
  • the abscissa is the position of the reconstructed frame image of the historical frame image, for example, the top position of the image, the middle position of the image, and the bottom position of the image
  • the ordinate is the power consumption of reading and writing data during video encoding.
  • the limited power budget provided by the video compression system can leave the video encoding device unable to meet its speed requirements, or cause the video compression system to overheat. If the upper limit of power consumption is taken into account, the speed of reading and writing data is limited and cannot reach the speed achievable when that upper limit is not considered.
  • FIG. 16 is a schematic diagram of a power consumption curve when reading and writing data from Sys$ or SysBuf and DRAM, respectively, according to an embodiment of the present application.
  • the video encoding device replaces a large amount of DRAM power consumption with the power consumption of Sys$ or SysBuf, which greatly reduces power consumption.
  • a search window may be determined from the first area, so that the search range may be narrowed, and matching blocks may be searched therefrom, thereby further reducing power consumption.
  • whether the reconstructed frame image of the historical frame image is not reduced (as in non-hierarchical search) or is reduced or not reduced (as in hierarchical search), the search window is applicable as long as the vertical position can be predicted.
  • FIG. 17 is a scene schematic diagram of a search range of a search window in a reconstructed frame image of a historical frame image provided by an embodiment of the present application. It can be seen from FIG. 17 that the search window is located in the first area, the area between the adjacent dotted lines in the first area is the block row, and the searched motion vector can point to any place in the search window.
  • L, R, T and B are respectively the search range located on the left side of the block to be coded, the search range on the right side, the upper search range and the lower search range in the search window.
  • R and B are positive numbers and L and T are negative numbers.
  • L is not necessarily equal to R, and T is not necessarily equal to B.
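  • The relationship between the block to be encoded and the search window can be sketched as follows. The function name `search_window` and the clipping of the window to the frame boundary are assumptions made for illustration; L, R, T and B follow the sign convention stated above.

```python
def search_window(block_x, block_y, block_size, L, R, T, B, frame_w, frame_h):
    """Return the search window as (x0, y0, x1, y1), with x1 and y1 exclusive.

    L and T are negative (left/up), R and B are positive (right/down).
    Clipping to the frame is an assumption of this sketch.
    """
    x0 = max(0, block_x + L)
    y0 = max(0, block_y + T)
    x1 = min(frame_w, block_x + block_size + R)
    y1 = min(frame_h, block_y + block_size + B)
    return x0, y0, x1, y1

# A 16x16 block at (128, 64) in a 3840x2160 frame, with an asymmetric horizontal range.
window = search_window(128, 64, 16, L=-32, R=48, T=-64, B=64, frame_w=3840, frame_h=2160)
```

  • A motion vector found by the search can then point to any block position that lies fully inside this rectangle.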
  • the image data of the search window is stored in a third memory
  • the third memory may be a memory inside the video encoding device, and may include a cache or buffer set inside the video encoding device. Since blocks within the search window are searched during motion estimation, and block search demands high bandwidth, the read/write speed of the third memory is higher than the read/write speeds of the first memory and the second memory, so as to meet the needs of search speed and high bandwidth. Specifically, the read/write speed of the third memory is greater than a second preset multiple of the read/write speed of the first memory.
  • the image data of the search window is read from the third memory, and the search can be carried out in a hierarchical or non-hierarchical manner.
  • the searched blocks are compared with the block to be encoded to determine the block with the smallest coding cost relative to the block to be encoded.
  • the coding cost may include a residual; in another embodiment, the coding cost may include a block vector and a residual, and so on. It follows that the block with the smallest coding cost may be the block with the smallest residual relative to the block to be coded, or the block with the smallest coding cost after comprehensively considering the block vector and the residual relative to the block to be coded.
  • the motion vector may be the relative displacement between the searched block and the block to be encoded.
  • the residual may be a difference obtained by subtracting the two-dimensional pixels at the corresponding positions of the searched blocks from the two-dimensional pixels of the block to be encoded.
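  • A minimal sketch of a non-hierarchical (exhaustive) search that yields a motion vector and a residual is given below. It assumes the coding cost is the sum of absolute differences (SAD), which the text does not specify; the helper names `sad`, `crop` and `full_search` are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def crop(img, x, y, size):
    """Extract the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in img[y:y + size]]

def full_search(cur_block, ref, bx, by, size, rng):
    """Search ref within +/- rng pixels of (bx, by).

    Returns (motion_vector, residual, cost); assumes at least one candidate fits.
    """
    best = None
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + size > len(ref[0]) or y + size > len(ref):
                continue  # candidate block would fall outside the reference image
            cost = sad(cur_block, crop(ref, x, y, size))
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    cost, (dx, dy) = best
    match = crop(ref, bx + dx, by + dy, size)
    # Residual: block to be encoded minus the searched block, pixel by pixel.
    residual = [[c - m for c, m in zip(cr, mr)] for cr, mr in zip(cur_block, match)]
    return (dx, dy), residual, cost
```

  • For example, if the block to be encoded is an exact copy of the reference block one pixel to the right of the search origin, the sketch returns the motion vector (1, 0), an all-zero residual and a cost of 0.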
  • reading the image data of the search window from the third memory and, according to the image data of the search window, determining from the search window the block with the smallest encoding cost relative to the block to be encoded may include:
  • the reduced block with the smallest encoding cost of the block to be encoded is determined from the reduced search window;
  • a block with the smallest encoding cost to the block to be encoded is determined from the search window.
  • when searching, a hierarchical search method can be used, and the number of search stages depends on the number of layers. For example, with two layers a two-layer search is performed, and with three layers a three-layer search is performed.
  • an appropriate number of layers can be set according to specific needs. It should be noted that the reduction ratio of each layer may be the same or different.
  • the search window is reduced according to the preset number of layers. For example, with two layers, the search window is reduced to obtain a reduced search window that is 1/2 of the original search window size. Then, according to the image data of the reduced search window, a reduced block with the smallest coding cost relative to the block to be encoded is determined from the reduced search window; the reduced block and the reduced search window have the same reduction ratio.
  • on the image of the reduced search window, the approximate range of the block to be searched is determined first, and the search then returns to the image of the unreduced search window for a finer block search. That is, according to the approximate range of the reduced block in the reduced search window, the original search window is searched more finely, and the block with the smallest encoding cost relative to the block to be encoded can be determined from the unreduced search window.
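  • The two-layer search described above can be sketched as a coarse search on a half-size image followed by a refinement on the original image. This is an illustrative sketch under stated assumptions: reduction is done by averaging 2×2 pixels, the cost is the sum of absolute differences, and the names `downsample2`, `best_match` and `two_layer_search` are hypothetical.

```python
def downsample2(img):
    """Halve width and height by averaging each 2x2 neighbourhood (integer mean)."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]

def sad(block, img, x, y):
    """Sum of absolute differences between block and the region of img at (x, y)."""
    return sum(abs(block[j][i] - img[y + j][x + i])
               for j in range(len(block)) for i in range(len(block[0])))

def best_match(block, img, cx, cy, rng):
    """Exhaustive search within +/- rng of (cx, cy); returns (cost, x, y)."""
    h, w = len(block), len(block[0])
    best = None
    for y in range(max(0, cy - rng), min(len(img) - h, cy + rng) + 1):
        for x in range(max(0, cx - rng), min(len(img[0]) - w, cx + rng) + 1):
            cost = sad(block, img, x, y)
            if best is None or cost < best[0]:
                best = (cost, x, y)
    return best

def two_layer_search(block, window, bx, by, coarse_rng, refine_rng=1):
    """Coarse search at 1/2 scale, then refinement at full scale."""
    # Layer 1: find the approximate position on the reduced search window.
    _, sx, sy = best_match(downsample2(block), downsample2(window),
                           bx // 2, by // 2, coarse_rng)
    # Layer 2: search finely around the scaled-up position in the original window.
    return best_match(block, window, 2 * sx, 2 * sy, refine_rng)
```

  • The coarse stage examines far fewer candidate positions than a full-resolution search over the same range, and the fine stage only needs a small range around the scaled-up result.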
  • the search window is reduced according to the preset number of layers. For example, with three layers, the search window is reduced to obtain a reduced search window that is 1/4 of the original search window size. Then, according to the image data of the 1/4-size search window, the reduced block with the smallest coding cost relative to the block to be coded is determined from the 1/4-size search window, and the corresponding block vector in the 1/4-size search window is obtained.
  • the search window is reduced according to the preset number of layers (for example, three layers) to obtain a reduced search window that is 1/6 of the original search window size. The reduced block with the smallest coding cost relative to the block to be coded is determined from the 1/6-size search window, and the corresponding block vector in the 1/6-size search window is obtained.
  • the block with the smallest coding cost (e.g., the smallest residual) relative to the block to be coded is used as the matching block.
  • the relative relationship between the matching block and the block to be coded may be a motion vector and a residual.
  • the block to be coded can be coded according to the motion vector and the residual of the matching block and the block to be coded.
  • encoding the block to be coded according to the motion vector and the residual of the matching block and the block to be coded may include:
  • Entropy coding (Entropy Coding, EC) is performed on the motion vector of the matching block and the block to be coded and the forward transformed and quantized first residual data to obtain video stream coded data; or
  • the to-be-coded block is reconstructed according to the second residual data.
  • FIG. 18 is a schematic diagram of a scene of encoding by a video encoding apparatus provided by an embodiment of the present application. FIG. 18 shows the position of motion estimation in the data flow between the video coding apparatus and other modules. For example, motion estimation (hierarchical search or non-hierarchical search can be used) searches the reconstructed frame images of multiple historical frame images and finds a matching block. The relative displacement between the matching block and the current block (i.e., the block to be encoded) is the motion vector, and the residual is obtained according to the error between the current block and the matching block.
  • the forward transformation adopts Fast Fourier Transformation (FFT) to transform the spectrum, the abscissa on the spectrum curve is the frequency, and the ordinate is the energy.
  • the pixels in the space are converted into spectral coefficients that are uncorrelated and energy-concentrated.
  • the data is only converted to the frequency domain, and the amount of data does not change, which can reduce distortion.
  • Quantization can be achieved by dividing the forward transformed matrix by the value of the corresponding position in the quantization matrix.
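  • The element-wise division described above can be sketched as follows. Rounding to the nearest integer and the matching `dequantize` step are assumptions of this sketch, and the coefficient and quantization values are made up for illustration.

```python
def quantize(coeffs, qmatrix):
    """Divide each transform coefficient by the value at the same position in qmatrix.

    Rounding to the nearest integer is an assumption of this sketch.
    """
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qmatrix)]

def dequantize(levels, qmatrix):
    """Approximate inverse: multiply each level by the same quantization value."""
    return [[lv * q for lv, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, qmatrix)]

coeffs = [[100, -30], [7, 3]]    # hypothetical forward-transformed coefficients
qmatrix = [[16, 11], [12, 14]]   # hypothetical quantization matrix
levels = quantize(coeffs, qmatrix)
restored = dequantize(levels, qmatrix)
```

  • The restored values differ from the original coefficients, which illustrates the information that quantization discards.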
  • the spectral coefficients are further compressed by quantization and entropy coding to obtain a compressed video stream. Among them, the quantization process removes some unimportant high-frequency information, which can compress the amount of image data, so quantization is the key to compression.
  • the first residual data is obtained after forward transformation and quantization.
  • the first residual data obtained after forward transformation and quantization is subjected to inverse quantization and inverse transformation (De-Quantization & Inv. Transform, DQIT) back to the spatial domain, which yields the second residual data of the matching block and the block to be encoded. The to-be-coded block of the current frame image is then reconstructed (Block Reconstruction, BlkRec) in the picture block area, to serve as the neighbor of the next to-be-coded block.
  • an in-loop filter (InF) is used to deal with the continuity problem between blocks, making block boundaries smoother.
  • a commonly used loop filter is a linear low-pass filter that filters out high frequency components and noise.
  • the embodiments of the present application are based on predictable data access behavior (ie, repeated reading behavior) during video encoding, so as to realize intelligent selection of a data storage mode, so as to reduce the power consumption of the video encoding apparatus.
  • whether the data to be read should be stored in the low-power Sys$ or SysBuf can be changed according to the frame reference reading strategy during encoding, so that the reconstructed frame images of some or all of the historical frame images stored in Sys$ or SysBuf are those that are read repeatedly the greatest number of times. This reduces power consumption to the greatest extent and ensures that the video encoding device can always maintain the lowest power consumption state when moving data in and out. If the Sys$ or SysBuf also has high-speed bandwidth, the bandwidth required of the DRAM can be further reduced, since the Sys$ or SysBuf can satisfy the bandwidth required for repeatedly reading data.
  • the embodiment of the present application can ensure that the power consumption of the video encoding device is controllable and that the hardware or software of the video encoding device can complete the encoding work as soon as possible. It makes full use of the predictable behavior that the video encoding device repeatedly reads the image data of the first area many times to change the storage characteristics of the read data, which allows the video encoding device to maintain its operating speed while reducing power consumption, because the data accesses themselves consume less power.
  • the speed of reading data is not limited by power consumption, so the video encoding device does not overheat.
  • the SRAM in Sys$ or SysBuf has low latency when reading and writing, which can improve the processing frame rate and reduce the response latency. Since the power consumption can be greatly reduced, the usage time of the battery in the video encoding device can be increased, and the user experience can be improved.
  • FIG. 19 is a third schematic flowchart of a method for performing image processing in a video encoding apparatus provided by an embodiment of the present application.
  • the method of performing image processing in a video encoding device can be applied to a video encoding device or the like.
  • the flow of the method for image processing in a video encoding device may include:
  • the block to be encoded is determined from the current frame image.
  • for the specific implementation of step 301, reference may be made to the embodiment of step 201, and details are not described herein again.
  • the first area may include a plurality of block rows, each block row includes a plurality of blocks, and the plurality of blocks are arranged in a row.
  • the preset memory includes a first memory and a second memory. It should be noted that the reconstructed frame images of multiple historical frame images can be stored in the second memory in advance, and a plurality of first regions that need to be read repeatedly can then be determined from the reconstructed frame images of the multiple historical frame images stored in the second memory.
  • the image data of the multiple first areas can be read from the second memory, and the read image data is stored in the first memory, where it waits to be read when the video encoding device performs encoding.
  • the power consumption of the second memory is greater than the first preset multiple of the power consumption of the first memory.
  • the image data of the multiple first regions is read from the second memory and stored in the first memory, and the total power consumption of reading and writing data from the first memory and the second memory is less than the preset power consumption threshold, which can reduce the power consumption when reading data.
  • the first memory may be Sys$ or SysBuf
  • the second memory may be DRAM
  • the power consumption of the DRAM is greater than the first preset multiple of the power consumption of Sys$ or SysBuf.
  • the energy of reading SRAM and the energy of reading DRAM differ by a factor of about 100, that is, the energy of reading SRAM is much smaller than that of reading DRAM.
  • the image data of a plurality of first regions may be read from the first memory. For example, the number of reads from the first memory may be greater than, less than, or equal to the number of reads from the second memory. The specific numbers of reads from the first memory and from the second memory should be set according to the specific scenario, which is not specifically limited in this embodiment of the present application.
  • when the video encoding device reads the image data of multiple first regions, it can read from Sys$ or SysBuf (Sys$ or SysBuf is composed of multiple SRAMs), and the data is read block row by block row, that is, the block rows in the first area are read in order from top to bottom.
  • the number of times read from Sys$ or SysBuf is greater than or equal to the preset times threshold, it switches to read the remaining data from the DRAM.
  • DRAM consumes 100 times more energy than SRAM. Therefore, by reading a part of the image data of the multiple first regions from Sys$ or SysBuf, and reading another part of the data from the DRAM, the power consumption of reading data can be reduced.
  • the video encoding device replaces a large amount of DRAM power consumption with the power consumption of Sys$ or SysBuf, which greatly reduces the power consumption.
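  • The switch from Sys$ or SysBuf to DRAM once the preset times threshold is reached can be sketched as a simple read plan. The function name `plan_read_sources` and the string labels are hypothetical; "SRAM" stands for Sys$ or SysBuf.

```python
def plan_read_sources(total_reads: int, sram_read_threshold: int):
    """Decide, read by read, whether data comes from Sys$/SysBuf ("SRAM") or DRAM.

    Reads go to the low-power memory until the number of such reads reaches the
    preset threshold; the remaining reads are served from DRAM.
    """
    sources = []
    sram_reads = 0
    for _ in range(total_reads):
        if sram_reads >= sram_read_threshold:
            sources.append("DRAM")
        else:
            sources.append("SRAM")
            sram_reads += 1
    return sources

plan = plan_read_sources(total_reads=9, sram_read_threshold=7)
```

  • With 9 reads and a threshold of 7, the plan contains 7 reads from the low-power memory followed by 2 reads from DRAM.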
  • the image data of a plurality of unread block lines in the first region is read block line by block line from the second memory.
  • for the specific implementation of step 305, reference may be made to the embodiment of step 205, and details are not described herein again.
  • the image data of a search window can be determined separately from the image data of each first region, that is, one search window can be determined in the first area of the reconstructed frame image of each historical frame image, so that multiple search windows can be determined, and each search window is located in the corresponding first area.
  • for the way in which each search window is determined from the first region of the reconstructed frame image of each historical frame image, reference may be made to the embodiment in step 206, which will not be repeated here.
  • the image data of multiple search windows may be stored in a third memory, where the read/write speed of the third memory is greater than the second preset multiple of the read/write speed of the first memory.
  • the third memory may be a cache or a buffer, and the read/write speed of the cache or buffer is greater than the second preset multiple of the read/write speed of Sys$ or SysBuf.
  • the blocks in each block row in the search window are compared with the block to be encoded, so that the coding cost between each block and the block to be encoded can be obtained. One or more blocks can then be selected in ascending order of coding cost, that is, one or more blocks with the smallest coding cost relative to the block to be coded are determined from each search window.
  • for example, the current search window is scanned block row by block row: blocks are searched in the current search window, each searched block is compared with the block to be coded, and the one or more blocks with the smallest coding cost relative to the block to be coded are found in the current search window.
  • reading the image data of multiple search windows from the third memory and, according to the image data of the multiple search windows, determining from each of the multiple search windows one or more blocks with the smallest coding cost relative to the block to be encoded may include:
  • the image data of the plurality of reduced search windows from the plurality of reduced search windows, respectively determine one or more reduced blocks with the smallest coding cost of the to-be-coded block;
  • one or more blocks with the smallest coding cost of the block to be coded are respectively determined from the plurality of search windows.
  • when searching, a hierarchical search method can be used, and the number of search stages depends on the number of layers. For example, with two layers a two-layer search is performed, and with three layers a three-layer search is performed.
  • an appropriate number of layers can be set according to specific needs. It should be noted that the reduction ratio of each layer may be the same or different.
  • the multiple search windows are reduced according to the preset number of layers to obtain multiple reduced search windows, each of which is 1/2 of the size of its original search window. Then, according to the image data of the plurality of reduced search windows, one or more reduced blocks with the smallest coding cost relative to the block to be coded are determined from each of the reduced search windows; each reduced block has the same reduction ratio as the reduced search window.
  • on the image of each reduced search window, the approximate range of the one or more blocks to be searched is determined first, and the search then returns to the image of the unreduced search window for a finer block search. That is, according to the approximate range of the reduced block in the reduced search window, the original search window is searched more finely, and one or more blocks with the smallest encoding cost relative to the block to be encoded can be determined from the unreduced search window.
  • As another example, the plurality of search windows are reduced according to the preset number of layers, for example three layers, and multiple reduced search windows are obtained, each 1/4 the size of the original search window.
  • According to the image data of the 1/4-reduced search windows, one or more reduced blocks with the smallest coding cost for the block to be coded are determined from each 1/4-reduced search window, giving one or more block vectors for each 1/4-reduced search window.
  • A finer search over a smaller range is then performed in the 1/2-reduced search windows. Finally, according to the one or more block vectors obtained from the 1/2-reduced search windows, the corresponding range of the original-size search windows is searched again to obtain the final one or more block vectors, so that the one or more blocks with the smallest coding cost for the block to be coded can be determined.
  • The original search window can thus be searched more precisely, and one or more blocks with the smallest encoding cost for the block to be encoded can be determined from the unreduced search window.
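
The coarse-to-fine idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text speaks of "coding cost" without defining it, so the sum of absolute differences (SAD) is used here as a stand-in; the 2x2 averaging used to halve the images, the `refine` radius, and all function names are hypothetical and not taken from the application. Only the two-layer case (1/2-size search window) is shown.

```python
def sad(block, window, top, left):
    """Sum of absolute differences between `block` and the same-sized patch of `window`."""
    return sum(
        abs(block[r][c] - window[top + r][left + c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )


def downscale2(img):
    """Halve width and height by taking the integer mean of each 2x2 neighbourhood."""
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) // 4
         for c in range(0, len(img[0]) - 1, 2)]
        for r in range(0, len(img) - 1, 2)
    ]


def full_search(block, window, rows, cols):
    """Return (cost, top, left) of the cheapest candidate among the given positions."""
    return min((sad(block, window, t, l), t, l) for t in rows for l in cols)


def hierarchical_search(block, window, refine=1):
    """Two-layer search: coarse match at 1/2 scale, then a finer search at full scale."""
    small_block, small_window = downscale2(block), downscale2(window)
    bh, bw = len(small_block), len(small_block[0])
    # Layer 1: exhaustive search on the reduced search window.
    _, ct, cl = full_search(
        small_block, small_window,
        range(len(small_window) - bh + 1),
        range(len(small_window[0]) - bw + 1),
    )
    # Layer 2: map the coarse hit back to full resolution and search only around it.
    max_top = len(window) - len(block)
    max_left = len(window[0]) - len(block[0])
    rows = range(max(0, 2 * ct - refine), min(max_top, 2 * ct + refine) + 1)
    cols = range(max(0, 2 * cl - refine), min(max_left, 2 * cl + refine) + 1)
    return full_search(block, window, rows, cols)
```

The fine stage only visits a small neighbourhood of the full-size window, which is the point of the layered search. The trade-off is that a poor coarse match can steer the fine stage away from the true optimum.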
  • At least a plurality of blocks with the smallest coding cost for the block to be coded can be found in each search window. From these blocks, one or more can be selected in ascending order of coding cost; usually the one or two blocks with the smallest coding cost are selected as matching blocks. The number of matching blocks can therefore be one, two, or more, depending on the required number of reference blocks. If two reference blocks are to be examined, two matching blocks need to be determined.
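
Selecting the matching blocks in ascending order of coding cost can be sketched as follows. The candidate representation, a `(coding_cost, position)` tuple, and the function name are assumptions made for illustration; `heapq.nsmallest` is from the Python standard library.

```python
import heapq


def select_matching_blocks(candidates, num_reference_blocks=1):
    """Pick the cheapest candidates.

    `candidates` is an iterable of (coding_cost, position) tuples collected
    from the search windows; the result is ordered from lowest to highest cost.
    """
    return heapq.nsmallest(num_reference_blocks, candidates, key=lambda c: c[0])


candidates = [(12, (0, 0)), (3, (4, 8)), (7, (2, 2)), (9, (1, 1))]
two_best = select_matching_blocks(candidates, num_reference_blocks=2)
```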
  • For the specific implementation of step 310, reference may be made to the embodiment of step 210; details are not repeated here.
  • The target location or attribute for data reading can be selected according to the long-duration shooting requirement of the photographing device, the requirement of low heat-dissipation cost, and the relatively large power consumption caused by the predictable behavior.
  • The data that needs to be read repeatedly is read partly from Sys$ or SysBuf and partly from DRAM, rather than entirely from DRAM. Because reading the same data from SRAM consumes far less power than reading it from DRAM, the embodiments of the present application can greatly reduce the power consumption of reading data.
  • This embodiment of the present application uses motion estimation as an example to describe in detail how to reduce the power consumption of reading data.
  • The approach can also be applied to all modules and applications that require high bandwidth but have predictable data access behavior, such as video decoders and frame rate up-conversion devices.
  • The behavior of these modules and applications, such as the number of repeated reads, is usually predictable.
  • the corresponding storage characteristics can be pre-allocated, that is, the repeatedly read data is stored in low-power memory, such as
  • The memory level, and hence its energy consumption, is chosen according to the number of times the image data of all or part of a frame is accessed.
  • The number of reads from Sys$ or SysBuf and from DRAM can then be allocated reasonably.
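
A back-of-the-envelope model makes the allocation argument concrete. The per-read energies and the fill cost below are hypothetical placeholders, not measurements from the application; the function name and parameters are likewise assumptions for illustration.

```python
def read_energy(total_reads, low_power_reads, e_low, e_high, fill_cost=0.0):
    """Energy for reading one region `total_reads` times.

    The first `low_power_reads` reads are served by the low-power memory
    (for example an SRAM system cache); the remaining reads go to the
    high-power memory (for example DRAM). `fill_cost` is the one-off cost
    of copying the region into the low-power memory, charged only if that
    memory is used at all.
    """
    low = min(total_reads, low_power_reads)
    fill = fill_cost if low > 0 else 0.0
    return fill + low * e_low + (total_reads - low) * e_high


# Hypothetical units: 1.0 per low-power read, 20.0 per high-power read.
all_dram = read_energy(10, 0, e_low=1.0, e_high=20.0)
split = read_energy(10, 8, e_low=1.0, e_high=20.0, fill_cost=21.0)
```

With these placeholder numbers, serving eight of ten reads from the low-power memory costs 69.0 units against 200.0 for reading everything from the high-power memory. The saving grows with the number of repeated reads, which is why predictable access counts matter.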
  • A video decoder can also determine its data access behavior by analyzing the code stream in advance, a frame rate up-conversion device can learn through simple analysis which areas will be used multiple times during processing, and so on. The approach can also be applied to fixed artificial intelligence (AI) network behavior.
  • The repeatedly read part of AI network behavior is the feature map part, and AI network behavior is predictable.
  • FIG. 20 is a schematic structural diagram of an apparatus for performing image processing in a video encoding apparatus according to an embodiment of the present application.
  • the apparatus 400 for performing image processing in a video encoding apparatus may include: a first determination module 401 , a second determination module 402 , a reading module 403 , a third determination module 404 , and an encoding module 405 .
  • the first determination module 401 is configured to determine the block to be encoded from the current frame image;
  • the second determination module 402 is configured to determine, from the reconstructed frame images of the historical frame images, a first area that needs to be read repeatedly multiple times, and to store the image data of the first area in a preset memory, the power consumption of the preset memory being less than the preset power consumption threshold;
  • a reading module 403 configured to read the image data of the first region from the preset memory
  • a third determining module 404 configured to determine a matching block matching the block to be encoded from the first region according to the read image data of the first region;
  • the encoding module 405 is configured to encode the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
  • the preset memory includes a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the first area includes a plurality of block lines; the second determining module 402 can be used for:
  • the reading module 403 can be used for:
  • the image data of unread block lines in the first region is read block line by block line from the second memory.
  • the second determining module 402 may be used to:
  • the image data of the moved block line is read from the second memory and stored in the first memory;
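
One way to picture this step is a fixed-size window of block lines held in the first memory. This is a sketch under assumptions: the dictionary standing in for the second memory, the eviction of the oldest block line, and all names are illustrative and not stated in the application.

```python
from collections import deque


def slide_down(cached_lines, next_line_id, second_memory):
    """Fetch the newly covered block line from the second memory into the cache.

    `cached_lines` is a deque with a fixed `maxlen`, standing in for the
    first memory; appending to a full deque drops the oldest block line.
    """
    cached_lines.append((next_line_id, second_memory[next_line_id]))
    return cached_lines


# Hypothetical second memory holding five block lines of a reconstructed frame.
second_memory = {i: [i] * 4 for i in range(5)}
first_memory = deque(((i, second_memory[i]) for i in range(3)), maxlen=3)
slide_down(first_memory, 3, second_memory)
cached_ids = [line_id for line_id, _ in first_memory]
```

Only the one new block line is read from the second memory; the block lines already cached stay where they are.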
  • the third determining module 404 may be used to:
  • the image data of the search window is stored in a third memory, and the read-write speed of the third memory is greater than the second preset multiple of the read-write speed of the first memory;
  • the image data of the search window is read from the third memory, and according to the image data of the search window, the block with the least encoding cost of the block to be encoded is determined from the search window;
  • the block with the smallest coding cost of the block to be coded is used as the matching block.
  • the third determining module 404 may be used to:
  • the reduced block with the smallest encoding cost of the block to be encoded is determined from the reduced search window;
  • a block with the smallest encoding cost to the block to be encoded is determined from the search window.
  • the relative relationship is a motion vector and a residual
  • the encoding module 405 can be used to:
  • the to-be-encoded block is encoded according to the motion vector and residual of the matched block and the to-be-encoded block.
  • the third determining module 404 may be used to:
  • Entropy coding is performed on the motion vector of the matching block and the block to be coded and the forward transformed and quantized first residual data to obtain coded video stream data;
  • the to-be-coded block is reconstructed according to the second residual data.
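
The relationship between motion vector, residual, quantization and reconstruction can be illustrated with a small sketch. It is deliberately simplified: the forward and inverse transforms and the entropy coder are omitted, uniform scalar quantization with a hypothetical step size stands in for the real quantizer, and all function names are assumptions.

```python
def motion_vector(block_pos, match_pos):
    """Displacement (dy, dx) from the block to be encoded to its matching block."""
    return (match_pos[0] - block_pos[0], match_pos[1] - block_pos[1])


def residual(block, match):
    """First residual data: block to be encoded minus matching block."""
    return [[b - m for b, m in zip(brow, mrow)] for brow, mrow in zip(block, match)]


def quantize(res, step):
    """Uniform scalar quantization (transform omitted in this sketch)."""
    return [[round(v / step) for v in row] for row in res]


def dequantize(levels, step):
    """Second residual data: what the decoder can recover from the levels."""
    return [[q * step for q in row] for row in levels]


def reconstruct(match, res):
    """Reconstructed block: matching block plus recovered residual."""
    return [[m + r for m, r in zip(mrow, rrow)] for mrow, rrow in zip(match, res)]


block = [[18, 6], [13, 10]]
match = [[10, 10], [10, 10]]
levels = quantize(residual(block, match), step=4)
rebuilt = reconstruct(match, dequantize(levels, step=4))
```

The reconstructed block differs from the original wherever quantization lost information (here one sample becomes 14 instead of 13). This is why the encoder reconstructs from the second residual data rather than reusing the original block as reference.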
  • the preset memory includes a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the first area includes a plurality of block lines; the second determining module 402 can be used for:
  • the image data of the plurality of first regions is read from the second memory and stored in the first memory.
  • the reading module 403 can be used for:
  • the image data of unread block lines in the plurality of first regions is read block line by block line from the second memory.
  • the third determining module 404 may be used to:
  • the image data of the plurality of search windows is read from the third memory, and according to the image data of the plurality of search windows, one or more blocks with the smallest coding cost for the block to be coded are determined from each of the plurality of search windows;
  • the matching block is determined from a plurality of blocks with the smallest coding cost of the block to be coded.
  • the third determining module 404 may be used to:
  • according to the image data of the plurality of reduced search windows, one or more reduced blocks with the smallest coding cost for the to-be-coded block are determined from each of the plurality of reduced search windows;
  • one or more blocks with the smallest coding cost for the block to be coded are determined from each of the plurality of search windows.
  • the first memory includes a system cache or a system buffer memory provided outside the video encoding device
  • the second memory includes a dynamic random access memory provided outside the video encoding device.
  • the third memory includes a cache or a buffer provided inside the video encoding device.
  • An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed on a computer, the computer is made to execute the flow of the method for performing image processing in a video encoding device provided in this embodiment.
  • An embodiment of the present application further provides an electronic device, including a memory, a processor, and a video encoding apparatus.
  • The processor is configured to execute, by calling a computer program stored in the memory, the flow of the method for performing image processing in a video encoding apparatus provided in this embodiment.
  • the above-mentioned electronic device may be a mobile terminal such as a tablet computer or a smart phone.
  • FIG. 21 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device 500 may include a video encoding apparatus 501, a memory 502, a processor 503 and other components.
  • The structure shown in FIG. 21 does not constitute a limitation on the electronic device, which may include more or fewer components than shown, combine some components, or arrange the components differently.
  • the video encoding device 501 may be used for encoding video images to compress the content of the video images.
  • Memory 502 may be used to store applications and data.
  • the application program stored in the memory 502 contains executable code.
  • Applications can be composed of various functional modules.
  • the processor 503 executes various functional applications and data processing by running the application programs stored in the memory 502 .
  • The processor 503 is the control center of the electronic device. It uses various interfaces and lines to connect the parts of the entire electronic device, and performs the various functions of the electronic device and processes data by running or executing the application programs stored in the memory 502 and calling the data stored in the memory 502, thereby monitoring the electronic device as a whole.
  • The processor 503 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and runs the application programs stored in the memory 502 so as to execute:
  • the to-be-encoded block is encoded according to the relative relationship between the matched block and the to-be-encoded block.
  • The electronic device 500 may include components such as a video encoding apparatus 501, a memory 502, a processor 503, a battery 504, an input unit 505, and an output unit 506.
  • The video encoding apparatus 501 may be used for encoding video images to compress the content of the video images.
  • Memory 502 may be used to store applications and data.
  • the application program stored in the memory 502 contains executable code.
  • Applications can be composed of various functional modules.
  • the processor 503 executes various functional applications and data processing by running the application programs stored in the memory 502 .
  • The processor 503 is the control center of the electronic device. It uses various interfaces and lines to connect the parts of the entire electronic device, and performs the various functions of the electronic device and processes data by running or executing the application programs stored in the memory 502 and calling the data stored in the memory 502, thereby monitoring the electronic device as a whole.
  • the battery 504 may be used to provide electrical support for various components of the electronic device, thereby ensuring the normal operation of the various components.
  • the input unit 505 can be used to receive an input video stream of video images, for example, can be used to receive a video stream that needs to be compressed.
  • the output unit 506 may be used to output the compressed video stream.
  • The processor 503 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and runs the application programs stored in the memory 502 so as to execute:
  • the to-be-encoded block is encoded according to the relative relationship between the matched block and the to-be-encoded block.
  • FIG. 23 is a schematic structural diagram of the image processing system provided by the embodiment of the present application.
  • FIG. 24 is another schematic structural diagram of an image processing system provided by an embodiment of the present application.
  • The image processing system 600 includes a video encoding apparatus 601, a first memory 602 and a second memory 603. The power consumption of the second memory 603 is greater than a first preset multiple of the power consumption of the first memory 602. The video encoding apparatus 601 may include a third memory whose reading speed is greater than a second preset multiple of the reading speed of the first memory. The first memory 602 and the second memory 603 each store reconstructed frame images of the historical frame images.
  • When encoding, the video encoding apparatus 601 reads the repeatedly read image data from the first memory 602 and the second memory 603 according to preset numbers of times, determines from it the image data in the search window, and stores the image data in the search window in the third memory.
  • The first area that needs to be read repeatedly multiple times can be determined from the reconstructed frame images of the historical frame images stored in the second memory 603; the image data of the first area is then read from the second memory 603 and stored in the first memory 602.
  • The video encoding apparatus 601 can read the image data of the first area from the first memory 602 block line by block line. If the number of reads from the first memory 602 is greater than or equal to the preset count threshold, the image data of the unread block lines in the first area is read block line by block line from the second memory 603.
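
The threshold rule for switching between the two memories can be sketched as below. The dictionaries standing in for the first and second memories, the per-call read counter, and the function name are assumptions for illustration only.

```python
def read_block_lines(line_ids, first_memory, second_memory, read_threshold):
    """Read block lines one by one, preferring the first (low-power) memory.

    Once the number of reads served by the first memory reaches
    `read_threshold`, the remaining unread block lines come from the
    second memory. Returns a list of (source, data) pairs.
    """
    reads_from_first = 0
    result = []
    for line_id in line_ids:
        if reads_from_first < read_threshold:
            result.append(("first", first_memory[line_id]))
            reads_from_first += 1
        else:
            result.append(("second", second_memory[line_id]))
    return result


# Hypothetical memories holding the same four block lines.
second_memory = {i: f"line{i}" for i in range(4)}
first_memory = dict(second_memory)
reads = read_block_lines(range(4), first_memory, second_memory, read_threshold=3)
sources = [source for source, _ in reads]
```

Here the first three block lines are served by the first memory and the fourth by the second, so the returned data is the same as if everything had been read from one place; only the source, and therefore the energy spent, changes.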
  • The video encoding apparatus 601 may read this image data directly from the second memory 603; alternatively, the first memory 602 may read the image data from the second memory 603 and store it, after which the video encoding apparatus 601 reads this part of the image data directly from the first memory 602.
  • The video encoding apparatus 601 can read the image data in the search window from the third memory, determine from the search window, according to that image data, the matching block that matches the block to be encoded, and perform encoding according to the motion vector and residual between the matching block and the block to be encoded.
  • The apparatus for performing image processing in a video encoding apparatus provided by the embodiments of the present application and the method for performing image processing in a video encoding apparatus in the above embodiments belong to the same concept. Any of the methods provided in the method embodiments can be run on the apparatus. For details of the specific implementation process, refer to the method embodiments, which are not repeated here.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM, Read Only Memory), a random access memory (RAM, Random Access Memory), and the like.
  • Each functional module may be integrated in one processing chip, each module may exist physically alone, or two or more modules may be integrated in one module.
  • The above-mentioned integrated modules can be implemented in the form of hardware or in the form of software function modules. If an integrated module is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.

Abstract

The present application discloses a method for performing image processing in a video encoding device, comprising: determining from the current image frame a block to be encoded; determining, from a reconstruction image frame of a historical image frame, a first area needing to be repeatedly read for multiple times, and storing image data in a preset memory; reading the image data of the first area from the preset memory; and determining, from the first area according to the image data, a matching block matched with the block to be encoded; and according to a relative relationship between the matching block and the block to be encoded, encoding the block to be encoded.

Description

Method, apparatus and system for performing image processing in a video encoding device
This application claims priority to Chinese patent application No. 202110358171.1, entitled "Method, Apparatus and System for Image Processing in a Video Encoding Device", filed with the China Patent Office on April 1, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application belongs to the technical field of electronic devices, and in particular relates to a method, apparatus, storage medium, electronic device and system for performing image processing in a video encoding device.
Background
With the continuous development of technology, video encoding devices are becoming more and more powerful. A video encoding device can encode video images. Encoding one frame of video usually requires reading the data of multiple frames of already-encoded video images. In the related art, however, the video encoding device consumes considerable power when reading the data of encoded video images.
Summary
Embodiments of the present application provide a method, apparatus, storage medium, electronic device, and system for performing image processing in a video encoding device, which can reduce the power consumption of the video encoding device.
In a first aspect, an embodiment of the present application provides a method for performing image processing in a video encoding device, the method comprising:
determining a block to be encoded (encoded block) from the current frame image;
determining, from the reconstructed frame images of historical frame images, a first area that needs to be read repeatedly multiple times, and storing the image data of the first area in a preset memory, the power consumption of the preset memory being less than a preset power consumption threshold;
reading the image data of the first area from the preset memory;
determining, from the first area according to the read image data of the first area, a matching block that matches the block to be encoded;
encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
In a second aspect, an embodiment of the present application provides an apparatus for performing image processing in a video encoding device, the apparatus comprising:
a first determining module, configured to determine a block to be encoded from the current frame image;
a second determining module, configured to determine, from the reconstructed frame images of historical frame images, a first area that needs to be read repeatedly multiple times, and to store the image data of the first area in a preset memory, the power consumption of the preset memory being less than a preset power consumption threshold;
a reading module, configured to read the image data of the first area from the preset memory;
a third determining module, configured to determine, from the first area according to the read image data of the first area, a matching block that matches the block to be encoded;
an encoding module, configured to encode the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
In a third aspect, an embodiment of the present application provides a storage medium on which a computer program is stored. When the computer program is executed on a computer, the computer is caused to execute the method for performing image processing in a video encoding device provided by the embodiments of the present application.
In a fourth aspect, an embodiment of the present application further provides an electronic device, including a memory, a processor, and a video encoding device. By calling a computer program stored in the memory, the processor executes the method for performing image processing in a video encoding device provided by the embodiments of the present application.
In a fifth aspect, an embodiment of the present application further provides an image processing system, including a video encoding device, a first memory, and a second memory. The power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory. The video encoding device includes a third memory whose read speed is greater than a second preset multiple of the read speed of the first memory. The first memory and the second memory each store image data that is repeatedly read multiple times from the reconstructed frame images of historical frame images. During encoding, the video encoding device reads the repeatedly read image data from the first memory and the second memory according to preset numbers of times, determines from it the image data within a search window (Search Window, SWin), and stores the image data within the search window in the third memory. The video encoding device then reads the image data within the search window from the third memory, determines a matching block that matches the block to be encoded, and performs encoding according to the motion vector and residual between the matching block and the block to be encoded.
Brief Description of the Drawings
The technical solutions of the present application and their beneficial effects will become apparent from the following detailed description of specific embodiments of the present application, taken in conjunction with the accompanying drawings.
FIG. 1 is a first schematic flowchart of a method for performing image processing in a video encoding device provided by an embodiment of the present application.
FIG. 2 is a schematic structural diagram of a video compression system in the related art.
FIG. 3 is a schematic diagram of data storage in a video encoding device in the related art.
FIG. 4 is a schematic diagram of data access with an increased number of dynamic random access memory (Dynamic Random Access Memory, DRAM) channels in the related art.
FIG. 5 is a schematic diagram of the comparison of square blocks provided by an embodiment of the present application.
FIG. 6 is a schematic diagram of a hierarchical search provided by an embodiment of the present application.
FIG. 7 is a schematic diagram of a non-hierarchical search provided by an embodiment of the present application.
FIG. 8 is a second schematic flowchart of a method for performing image processing in a video encoding device provided by an embodiment of the present application.
FIG. 9 is a schematic diagram of a scene of searching in a reconstructed frame image of a historical frame image provided by an embodiment of the present application.
FIG. 10 is a schematic diagram comparing the energy consumed when reading data from a static random-access memory (Static Random-Access Memory, SRAM) and from a dynamic random access memory, provided by an embodiment of the present application.
FIG. 11 is a schematic architecture diagram of a video compression system using a system cache (system cache, Sys$) provided by an embodiment of the present application.
FIG. 12 is another schematic architecture diagram of a video compression system using a system cache provided by an embodiment of the present application.
FIG. 13 is a schematic architecture diagram of a video compression system using a system buffer memory (System Buffer, SysBuf) provided by an embodiment of the present application.
FIG. 14 is a schematic diagram of a scene in which the reconstructed frame image of a historical frame image is moved down by one block line, provided by an embodiment of the present application.
FIG. 15 is a schematic diagram of the power consumption curve when reading and writing data from a multi-channel DRAM, provided by an embodiment of the present application.
FIG. 16 is a schematic diagram of the power consumption curve when reading and writing data from Sys$ or SysBuf and from DRAM, provided by an embodiment of the present application.
FIG. 17 is a schematic diagram of a scene showing the search range of a search window in a reconstructed frame image of a historical frame image, provided by an embodiment of the present application.
FIG. 18 is a schematic diagram of a scene of encoding by a video encoding device provided by an embodiment of the present application.
FIG. 19 is a third schematic flowchart of a method for performing image processing in a video encoding device provided by an embodiment of the present application.
FIG. 20 is a schematic structural diagram of an apparatus for performing image processing in a video encoding device provided by an embodiment of the present application.
FIG. 21 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
FIG. 22 is another schematic structural diagram of an electronic device provided by an embodiment of the present application.
FIG. 23 is a schematic structural diagram of an image processing system provided by an embodiment of the present application.
FIG. 24 is another schematic structural diagram of an image processing system provided by an embodiment of the present application.
Detailed Description
Please refer to the drawings, in which the same reference numerals denote the same components. The principles of the present application are illustrated by implementation in a suitable computing environment. The following description is based on the illustrated specific embodiments of the present application and should not be construed as limiting other specific embodiments of the present application not detailed herein.
An embodiment of the present application provides a method for performing image processing in a video encoding device, the method comprising:
determining a block to be encoded from the current frame image;
determining, from the reconstructed frame images of historical frame images, a first area that needs to be read repeatedly multiple times, and storing the image data of the first area in a preset memory, the power consumption of the preset memory being less than a preset power consumption threshold;
reading the image data of the first area from the preset memory;
determining, from the first area according to the read image data of the first area, a matching block that matches the block to be encoded;
encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
In some embodiments, the preset memory includes a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the first area includes a plurality of block rows. Determining, from the reconstructed frame image of the historical frame image, the first area that needs to be read repeatedly, and storing the image data of the first area in the preset memory, comprises:
determining, from the reconstructed frame image of the historical frame image stored in the second memory, the first area that needs to be read repeatedly; and
reading the image data of the first area from the second memory and storing it in the first memory.
Reading the image data of the first area from the preset memory comprises:
reading the image data of the first area from the first memory block row by block row; and
if the number of reads from the first memory is greater than or equal to a preset count threshold, reading the image data of the unread block rows of the first area from the second memory block row by block row.
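One possible reading of this read path can be sketched in Python. The sketch is illustrative only: the two memories are modeled as dictionaries keyed by block-row index, and `read_first_area` and `max_first_reads` (standing in for the preset count threshold) are hypothetical names that do not appear in the application.

```python
def read_first_area(first_mem, second_mem, row_ids, max_first_reads):
    """Read the first area block row by block row.

    first_mem / second_mem: hypothetical dict models of the low-power first
    memory and the higher-power second memory, keyed by block-row index.
    max_first_reads: hypothetical name for the preset count threshold.
    Returns a list of (row_id, source, data) tuples.
    """
    reads_from_first = 0
    result = []
    for row_id in row_ids:
        if reads_from_first < max_first_reads:
            # Below the threshold: serve the block row from the first memory.
            result.append((row_id, "first", first_mem[row_id]))
            reads_from_first += 1
        else:
            # Threshold reached: the remaining unread block rows come from
            # the second memory.
            result.append((row_id, "second", second_mem[row_id]))
    return result


rows = {i: f"row-{i}" for i in range(4)}
trace = read_first_area(rows, rows, row_ids=[0, 1, 2, 3], max_first_reads=2)
sources = [source for _, source, _ in trace]
```

With a threshold of 2, the first two block rows are served from the first memory and the last two from the second memory.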
In some embodiments, reading the image data of the first area from the second memory and storing it in the first memory comprises:
if the first area moves down by one block row in the reconstructed frame image of the historical frame image, reading the image data of the moved-down block row from the second memory and storing it in the first memory; and
removing from the first memory the block rows that will not be used when encoding the next block row to be encoded.
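The sliding update of the first memory can be pictured as a window of block rows that moves down the reconstructed frame. The following Python sketch is a simplified model under assumptions: `BlockRowCache`, `slide_to`, and the `window` size are hypothetical names, and a plain list stands in for the second memory.

```python
class BlockRowCache:
    """Hypothetical model of the first memory holding a window of block rows."""

    def __init__(self, frame_rows, window):
        self.frame_rows = frame_rows  # stands in for the second memory
        self.window = window          # number of block rows kept in the first memory
        self.rows = {}                # block-row index -> data held in the first memory
        self.fetches = 0              # block rows read from the second memory

    def slide_to(self, top):
        """Make the cache hold block rows top .. top + window - 1."""
        last = min(top + self.window, len(self.frame_rows))
        for idx in range(top, last):
            if idx not in self.rows:
                # Only block rows not yet cached are fetched from the second memory.
                self.rows[idx] = self.frame_rows[idx]
                self.fetches += 1
        # Evict block rows that fall outside the new window.
        for idx in [i for i in self.rows if i < top or i >= last]:
            del self.rows[idx]


frame_rows = [f"block-row-{i}" for i in range(6)]
cache = BlockRowCache(frame_rows, window=3)
cache.slide_to(0)  # initial fill: three block rows fetched
cache.slide_to(1)  # window moves down one block row: one more fetch
```

In this model, moving the window down by one block row costs a single fetch from the second memory, while the other block rows are reused from the first memory.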
In some embodiments, determining, from the first area and according to the read image data of the first area, the matching block that matches the block to be encoded comprises:
determining image data of a search window from the read image data of the first area, the search window being located within the first area;
storing the image data of the search window in a third memory, wherein the read-write speed of the third memory is greater than a second preset multiple of the read-write speed of the first memory;
reading the image data of the search window from the third memory and, according to the image data of the search window, determining from the search window the block with the smallest encoding cost relative to the block to be encoded; and
taking the block with the smallest encoding cost relative to the block to be encoded as the matching block.
In some embodiments, reading the image data of the search window from the third memory and, according to the image data of the search window, determining from the search window the block with the smallest encoding cost relative to the block to be encoded comprises:
reading the image data of the search window from the third memory;
reducing the search window according to a preset number of levels to obtain a reduced search window;
determining, from the reduced search window and according to its image data, the reduced block with the smallest encoding cost relative to the block to be encoded; and
determining, from the search window and according to the position of the reduced block in the reduced search window, the block with the smallest encoding cost relative to the block to be encoded.
In some embodiments, the relative relationship is a motion vector and a residual, and encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded comprises:
encoding the block to be encoded according to the motion vector and the residual between the matching block and the block to be encoded.
In some embodiments, encoding the block to be encoded according to the motion vector and the residual between the matching block and the block to be encoded comprises:
performing forward transform and quantization on the residual between the matching block and the block to be encoded;
performing entropy coding on the motion vector between the matching block and the block to be encoded and on the forward-transformed and quantized first residual data, to obtain encoded video stream data; or
performing inverse quantization and inverse transform on the forward-transformed and quantized first residual data, to obtain second residual data; and
reconstructing the block to be encoded according to the second residual data.
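The reconstruction branch can be illustrated with a small round trip. The Python sketch below is deliberately simplified and is not the application's implementation: the transform stage is omitted (treated as identity), quantization is a plain uniform scalar quantizer, and the names `quantize`, `dequantize`, `encode_and_reconstruct`, and `step` are hypothetical.

```python
def quantize(residual, step):
    """Uniform scalar quantization (the transform stage is omitted in this sketch)."""
    return [[round(v / step) for v in row] for row in residual]


def dequantize(levels, step):
    """Inverse quantization: scale the quantized levels back."""
    return [[v * step for v in row] for row in levels]


def encode_and_reconstruct(block, match, step):
    # Residual: block to be encoded minus the matching block, pixel by pixel.
    residual = [[b - m for b, m in zip(rb, rm)] for rb, rm in zip(block, match)]
    first = quantize(residual, step)    # "first residual data" (would be entropy coded)
    second = dequantize(first, step)    # "second residual data"
    # Reconstruction: matching block plus the second residual data.
    recon = [[m + r for m, r in zip(rm, rr)] for rm, rr in zip(match, second)]
    return first, recon


block = [[11, 20], [30, 41]]
match = [[8, 20], [33, 30]]
first, recon = encode_and_reconstruct(block, match, step=4)
```

Because quantization is lossy, `recon` approximates `block` rather than reproducing it exactly. This is why the encoder reconstructs from the second residual data: its reference then matches what a decoder would obtain.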
In some embodiments, the preset memory includes a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the first area includes a plurality of block rows. Determining, from the reconstructed frame image of the historical frame image, the first area that needs to be read repeatedly, and storing the image data of the first area in the preset memory, comprises:
determining, from the reconstructed frame images of a plurality of historical frame images stored in the second memory, a plurality of first areas that need to be read repeatedly; and
reading the image data of the plurality of first areas from the second memory and storing it in the first memory.
Reading the image data of the first area from the preset memory comprises:
reading the image data of the plurality of first areas from the first memory block row by block row; and
if the number of reads from the first memory is greater than or equal to a preset count threshold, reading the image data of the unread block rows of the plurality of first areas from the second memory block row by block row.
In some embodiments, determining, from the first area and according to the read image data of the first area, the matching block that matches the block to be encoded comprises:
determining image data of a plurality of search windows from the read image data of the plurality of first areas, each search window being located within the corresponding first area;
storing the image data of the plurality of search windows in a third memory, wherein the read-write speed of the third memory is greater than a second preset multiple of the read-write speed of the first memory;
reading the image data of the plurality of search windows from the third memory and, according to that image data, determining from each of the plurality of search windows one or more blocks with the smallest encoding cost relative to the block to be encoded; and
determining the matching block from among the plurality of blocks with the smallest encoding cost relative to the block to be encoded.
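Selecting the matching block across several search windows can be sketched as a two-stage minimum: the best candidate per window, then the best of those. The Python example below assumes the sum of absolute differences (SAD) as the encoding cost, which the application names as only one possible form; `best_in_window` and `best_across_windows` are hypothetical helper names.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def best_in_window(window, block):
    """Return (cost, (y, x)) of the lowest-cost candidate inside one search window."""
    n = len(block)
    best = None
    for y in range(len(window) - n + 1):
        for x in range(len(window[0]) - n + 1):
            candidate = [row[x:x + n] for row in window[y:y + n]]
            cost = sad(candidate, block)
            if best is None or cost < best[0]:
                best = (cost, (y, x))
    return best


def best_across_windows(windows, block):
    """Pick the overall best candidate among the per-window winners."""
    per_window = [(best_in_window(w, block), idx) for idx, w in enumerate(windows)]
    (cost, pos), idx = min(per_window, key=lambda item: item[0][0])
    return idx, pos, cost


block = [[5, 6], [7, 8]]
window_a = [[0, 0, 0], [0, 5, 5], [0, 7, 7]]
window_b = [[9, 9, 9], [5, 6, 9], [7, 8, 9]]
choice = best_across_windows([window_a, window_b], block)
```

Here `choice` holds the index of the winning search window, the position of the matching block inside it, and its cost.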
In some embodiments, reading the image data of the plurality of search windows from the third memory and, according to that image data, determining from each of the plurality of search windows one or more blocks with the smallest encoding cost relative to the block to be encoded comprises:
reading the image data of the plurality of search windows from the third memory;
reducing the plurality of search windows according to a preset number of levels to obtain a plurality of reduced search windows;
determining, from each of the plurality of reduced search windows and according to their image data, one or more reduced blocks with the smallest encoding cost relative to the block to be encoded; and
determining, from each of the plurality of search windows and according to the positions of the one or more reduced blocks in the reduced search windows, one or more blocks with the smallest encoding cost relative to the block to be encoded.
In some embodiments, the first memory includes a system cache or a system buffer memory disposed outside the video encoding device, and the second memory includes a dynamic random access memory disposed outside the video encoding device.
In some embodiments, the third memory includes a cache or a buffer disposed inside the video encoding device.
Refer to FIG. 1, which is a first schematic flowchart of the method for performing image processing in a video encoding device provided by an embodiment of the present application. The method can be applied in a video encoding device. The flow of the method may include:
101. Determine a block to be encoded from a current frame image.
With the continuous development of technology, video encoding devices are becoming increasingly capable. A video encoding device can encode video images. Encoding one frame of video image usually requires reading the data of multiple already-encoded video frames. In the related art, however, the video encoding device consumes considerable power when reading the data of the encoded video images.
Refer to FIG. 2, which is a schematic structural diagram of a video compression system in the related art. In this video compression system, a central processing unit (CPU), a video encoding device, an image signal processor (ISP), and a neural network processing unit (NPU) read and write DRAM data through a bus and a dynamic random access memory controller (DRAMC). The CPU, the video encoding device, the ISP, and the NPU share the bandwidth by time division, and the CPU, the ISP, and the NPU have higher priority than the video encoding device. The video encoding device needs to perform search operations when encoding, which occupies considerable bandwidth.
Video encoding devices are highly cost-sensitive. To achieve the lowest cost and the highest production yield, DRAM is usually used as the main storage space for frame buffering. Refer to FIG. 3, which is a schematic diagram of data storage in a video encoding device in the related art. The current frame image, the reference frame image, the reconstructed frame image, the bitstreams, and the temporary data are all stored in the DRAM of the video encoding device. However, the bandwidth provided by DRAM is relatively small.
It should be noted that the current frame image becomes a reconstructed frame image after encoding, and this reconstructed frame image can serve as the reference frame image for the next frame image. Likewise, the previous frame image becomes its reconstructed frame image after encoding, and that reconstructed frame image can serve as the reference frame image for the current frame image. The temporary data may be temporal motion vectors (TMV), scaled frames, or other data.
New video standards have emerged, such as High Efficiency Video Coding (H.265/HEVC), Versatile Video Coding (H.266/VVC), AOMedia Video 1 (AV1) from the Alliance for Open Media, and Essential Video Coding (MPEG-5/EVC). These standards target ever larger picture sizes and ever higher frame rates. For this reason, data throughput is usually accelerated by increasing the DRAM bandwidth or raising the DRAM frequency.
Even though motion estimation (ME) with hierarchical search alleviates the problem of reading reference frames many times over, large picture sizes and high frame rates still require high DRAM throughput. The throughput is usually raised by increasing the number of DRAM channels, which leads to excessive power consumption.
Refer to FIG. 4, which is a schematic diagram of data access with an increased number of DRAM channels in the related art. Increasing the number of DRAM channels increases the bandwidth and the frequency, and thus the speed at which the DRAM moves data, but it also causes higher power consumption. For example, to meet the read speed required by the video encoding device, the system DRAM bandwidth consumes a large amount of energy. Whether the video encoding device performs real-time or non-real-time operations, maintaining the highest efficiency is very important. With the methods of the related art, having the video encoding device complete encoding within the expected time causes very high DRAM power consumption.
Video encoding devices generally use the block (which can be regarded as a pixel block) as the basic unit. A block may be rectangular or square, or may be pieced together from trapezoids or triangles; hence block-based comparison algorithms arose. Refer to FIG. 5, which is a schematic diagram of square-block comparison provided by an embodiment of the present application. Using square blocks, the block to be compressed in the current frame image is compared with blocks of the reference frame image, where the reference frame image is the reconstructed image of a historical frame image, that is, the encoded image of the historical frame image. The block to be compressed and the blocks of the reference frame image are N×N blocks, where N is an integer greater than or equal to 4. Block comparison reduces information redundancy in the time domain as far as possible, thereby compressing the video data. FIG. 5 is an example based on square blocks, but the same comparison method can be used for blocks that are rectangular or pieced together from trapezoids or triangles.
In the embodiments of the present application, when motion estimation is performed, the image is divided into multiple non-overlapping blocks that form a rectangular array. Each block is N×N pixels in size, for example 4×4, 32×32, or 128×128, where these figures denote numbers of pixels. For each block to be encoded, the area around the same position in the reconstructed frame image of the historical frame image is searched for the block that best matches it, namely the matching block. The displacement of the matching block relative to the block to be encoded is called the motion vector (MV).
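The block partition and the definition of the motion vector can be made concrete with a short Python sketch. The function names `split_into_blocks` and `motion_vector`, and the `(y, x)` coordinate convention, are assumptions made for illustration.

```python
def split_into_blocks(width, height, n):
    """Return the top-left (y, x) of every N x N block, in raster order."""
    if width % n or height % n:
        # Real encoders handle partial blocks at frame borders; this sketch does not.
        raise ValueError("this sketch assumes dimensions divisible by n")
    return [(y, x) for y in range(0, height, n) for x in range(0, width, n)]


def motion_vector(block_pos, match_pos):
    """Displacement (dy, dx) of the matching block relative to the block to be encoded."""
    return (match_pos[0] - block_pos[0], match_pos[1] - block_pos[1])


blocks = split_into_blocks(width=8, height=8, n=4)
mv = motion_vector(block_pos=(4, 4), match_pos=(3, 6))
```

An 8×8 image with N = 4 yields four non-overlapping blocks. A matching block found one pixel up and two pixels to the right of the co-located position gives the motion vector (-1, 2) in this convention.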
In the embodiments of the present application, a block to be encoded is determined from the current frame image. The block to be encoded is the block in the current frame image that is about to be compressed, that is, encoded. It may be an N×N block. Encoding the block to be encoded usually requires comparing it with blocks in the reference frame image, so the blocks to be compared must be searched for in the reference frame image. The reference frame image is the reconstructed frame image of a historical frame image, that is, the encoded image of the historical frame image.
102. Determine, from the reconstructed frame image of a historical frame image, a first area that needs to be read repeatedly, and store the image data of the first area in a preset memory, where the power consumption of the preset memory is less than a preset power consumption threshold.
For example, a block can be compared with the block to be encoded only after it has been found by searching the reconstructed frame image of the historical frame image; this reconstructed frame image may be the one with the smallest encoding cost among the reconstructed frame images of multiple historical frame images. The area to be searched therefore needs to be known before the block search. Thus, when searching for blocks in the reconstructed frame image of the historical frame image, the search range (SRng) in that reconstructed frame image must be known, that is, the first area that is read repeatedly must be determined; searching for blocks in the first area requires reading its data many times. Therefore, in the embodiments of the present application, after the first area that needs to be read repeatedly is determined from the reconstructed frame image of the historical frame image, the image data of the first area is stored in the preset memory so that it can be read from the preset memory during the subsequent search. In addition, the power consumption of the preset memory is less than the preset power consumption threshold. Reading and writing data in a low-power preset memory reduces the power consumption of the video encoding device.
103. Read the image data of the first area from the preset memory.
For example, after the image data of the first area has been stored in the preset memory, searching the blocks in the first area requires reading the image data of the first area from the preset memory in order to find the block that best matches the block to be encoded.
104. Determine, from the first area and according to the read image data of the first area, a matching block that matches the block to be encoded.
For example, reading the image data of the first area stored in the preset memory makes it possible to search the first area. During the search, each block in the first area is compared with the block to be encoded in the current frame image, and the block in the first area that best matches the block to be encoded is found. This best-matching block is the matching block.
Common block matching algorithms may use hierarchical search or non-hierarchical search. The search yields the motion vector and the pixel value residues, which are used for further compression coding. The pixel residue is obtained by subtracting the predicted value of a pixel from its actual value.
Refer to FIG. 6, which is a schematic diagram of hierarchical search provided by an embodiment of the present application. In hierarchical search, the block to be searched for and the searched area are both reduced by the same ratio, for example 1/2, 1/4, or 1/8. The approximate location of the block is first determined on the reduced image (that is, the searched area), and a finer block search is then performed on the unreduced image. The reduction ratio may be the same or different for each level; for example, the per-level reduction ratios may be 1/2, 1/4, 1/8, and 1/16.
FIG. 6 takes a three-level motion search as an example. The image reduced to 1/4 is searched first. The motion vector obtained there then guides a finer search over a smaller range in the image reduced to 1/2. Finally, the motion vector obtained from the 1/2-reduced image is used to search the original-size image, yielding the final motion vector.
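A three-level search of this kind can be sketched in Python as follows. The sketch rests on several assumptions that are not taken from the application: reduction is done by plain subsampling, the cost is the sum of absolute differences (SAD), refinement at each finer level looks only one pixel around the scaled-up coarse position, and image dimensions are divisible by the largest reduction factor. All function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def crop(img, top, left, n):
    """Cut an n x n block out of img at (top, left)."""
    return [row[left:left + n] for row in img[top:top + n]]


def downscale(img, factor):
    """Reduce by keeping every factor-th pixel (plain subsampling, an assumption)."""
    return [row[::factor] for row in img[::factor]]


def hierarchical_search(window, block, factors=(4, 2, 1)):
    """Coarse-to-fine search; returns the (y, x) of the best match in the window."""
    pos, prev = None, None
    for f in factors:
        win, blk = downscale(window, f), downscale(block, f)
        n = len(blk)
        max_y, max_x = len(win) - n, len(win[0]) - n
        if pos is None:
            # Coarsest level: examine every candidate position.
            ys, xs = range(max_y + 1), range(max_x + 1)
        else:
            # Finer level: refine within one pixel of the scaled-up coarse position.
            scale = prev // f
            cy, cx = pos[0] * scale, pos[1] * scale
            ys = range(max(0, cy - 1), min(max_y, cy + 1) + 1)
            xs = range(max(0, cx - 1), min(max_x, cx + 1) + 1)
        pos = min(((y, x) for y in ys for x in xs),
                  key=lambda p: sad(crop(win, p[0], p[1], n), blk))
        prev = f
    return pos


window = [[y * 16 + x for x in range(16)] for y in range(16)]
block = crop(window, 8, 4, 8)  # the true match sits at (8, 4)
found = hierarchical_search(window, block)
```

Only the coarsest level examines every candidate. Each finer level checks at most nine positions, which is where the saving in reads comes from.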
Refer to FIG. 7, which is a schematic diagram of non-hierarchical search provided by an embodiment of the present application. Non-hierarchical search means performing block comparison directly on the unreduced image; common methods include full search and n-step search. FIG. 7 shows a motion search performed directly on the original-size image: within the search window in the reconstructed frame image of the previous frame image, full search is used to find which block in that reconstructed frame image has the smallest encoding cost relative to the current block. The smallest encoding cost can take various forms; for example, it may mean that the sum of the absolute values of the per-pixel residuals between a searched block and the current block is the smallest. In FIG. 7, p is the horizontal search range.
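A full search with the sum of absolute differences (SAD) as the cost can be sketched as below. This is an illustrative Python sketch, not the application's implementation: it applies the same range `p` vertically as well as horizontally, which is an assumption, and `full_search`, `top`, and `left` are hypothetical names.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def full_search(ref, cur_block, top, left, p):
    """Try every offset within +/- p of the co-located position (top, left).

    Returns (cost, (dy, dx)) for the lowest-cost candidate, where (dy, dx)
    is the motion vector relative to the co-located position.
    """
    n = len(cur_block)
    best = None
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference image.
            if 0 <= y <= len(ref) - n and 0 <= x <= len(ref[0]) - n:
                candidate = [row[x:x + n] for row in ref[y:y + n]]
                cost = sad(candidate, cur_block)
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx))
    return best


ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur_block = [row[4:6] for row in ref[3:5]]  # content located at (3, 4) in ref
result = full_search(ref, cur_block, top=2, left=2, p=2)
```

Full search examines up to (2p + 1)² candidates per block, and the candidates overlap heavily. This overlap is why the same reference pixels are read many times.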
It should be noted that motion estimation here refers to block-based motion estimation. The basic idea is to divide each frame of the image sequence into many non-overlapping blocks and to assume that all pixels within a block have the same displacement. Then, for each block, the block most similar to the current block, namely the matching block, is found within a given search range of the reference frame according to a block matching criterion. The relative displacement between the matching block and the current block is the motion vector. Motion estimation searches the reconstructed pixels encoded at earlier points in time, that is, the reconstructed pixels in the reconstructed frame images of the historical frame images.
In inter prediction mode, a preset number of historical frame images can be selected arbitrarily from the historical frame images, and the reconstructed frame images of the selected historical frame images are searched. Each search result is a motion vector, that is, the displacement between the matching block and the block to be encoded, and the optimal motion vector is then selected as the final search result. It can be understood that the position of the matching block in the reconstructed frame image of the historical frame image can be determined from the motion vector.
105. Encode the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
For example, the block to be encoded can be encoded according to the relative displacement relationship and the relative error relationship between the matching block and the block to be encoded. The relative error relationship can be obtained, for instance, by subtracting the two-dimensional pixels at the corresponding positions of the matching block from the two-dimensional pixels of the block to be encoded.
It can be understood that, in the embodiments of the present application, the video encoding device can determine the block to be encoded from the current frame image, determine from the reconstructed frame image of a historical frame image the first area that needs to be read repeatedly, and store the image data of the first area in a preset memory whose power consumption is less than a preset power consumption threshold. The image data of the first area is then read from the preset memory, and a matching block that matches the block to be encoded is determined from the first area according to the read image data. Finally, the block to be encoded is encoded according to the relative relationship between the matching block and the block to be encoded. That is, by storing the image data of the first area in a preset memory with lower power consumption, the embodiments of the present application can reduce the power consumption of the video encoding device.
Refer to FIG. 8, which is a second schematic flowchart of the method for performing image processing in a video encoding device provided by an embodiment of the present application. The method can be applied in a video encoding device. The flow of the method may include:
201. Determine a block to be encoded from a current frame image.
For example, each frame image can be divided into multiple block rows (block lines), and each block row can be divided into multiple blocks. Before the block to be encoded in the current frame image is determined, the block row to be encoded must be determined from the current frame image. The block row to be encoded is the block row in which the block to be encoded is located. All block rows that precede the block row to be encoded in the current frame image have already been encoded.
After the block row to be encoded has been determined, the block to be encoded is determined from it. Within the block row to be encoded, all blocks to the left of the block to be encoded have already been encoded. Refer to FIG. 9, which is a schematic diagram of a scenario of searching in the reconstructed frame image of a historical frame image provided by an embodiment of the present application. As FIG. 9 shows, in the vertical projection direction the block to be encoded lies within the range of the search window.
202. Determine, from the reconstructed frame image of the historical frame image stored in the second memory, a first area that needs to be read repeatedly.
For example, the first area may include multiple block lines. After the block to be encoded is determined, the multiple block lines that need to be read repeatedly are determined from the reconstructed frame image of the historical frame image (which can be regarded as the historical frame image most strongly correlated with the current frame image). These block lines are the block lines located in the first area. Each block line includes multiple blocks arranged in a row.
The preset memory includes a first memory and a second memory. It should be noted that the reconstructed frame image of the historical frame image may be stored in the second memory in advance, and the first area that needs to be read repeatedly is then determined from the reconstructed frame image stored in the second memory.
203. Read the image data of the first area from the second memory and store it in the first memory.
For example, after the first area that needs to be read repeatedly is determined from the reconstructed frame image of the historical frame image stored in the second memory, the image data of the first area can be read from the second memory and stored in the first memory, where it waits to be read when the video encoding apparatus performs encoding.
It should be noted that, in this embodiment of the present application, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the sum of the power consumption of the first memory and that of the second memory is less than the preset power consumption threshold, which reduces the power consumed when reading and writing data. The preset power consumption threshold may be regarded as the power consumed when all the image data of the first area is read and written through the second memory.
For example, both the first memory and the second memory are memories external to the video encoding apparatus. The first memory may include a system cache (Sys$) or a system buffer (SysBuf) disposed outside the video encoding apparatus, and the second memory may include a dynamic random access memory (DRAM) disposed outside the video encoding apparatus. The first memory may also be another low-power memory; the embodiments of the present application take Sys$ or SysBuf, which is composed of multiple SRAMs, as an example. The power consumption of the DRAM is greater than a first preset multiple of the power consumption of the Sys$ or SysBuf, and the sum of the power consumption of the Sys$ or SysBuf and that of the DRAM is less than the preset power consumption threshold, which reduces the power consumed when reading and writing data. The preset power consumption threshold may be regarded as the power consumed when all the image data of the first area is read and written through the DRAM.
Please refer to FIG. 10, which is a schematic diagram comparing the energy consumed when reading data from a static random access memory (SRAM) and from a dynamic random access memory (DRAM), provided by an embodiment of the present application. The energy consumed by reading data from SRAM and from DRAM differs by a factor of about 100; that is, reading data from SRAM consumes far less power than reading data from DRAM. By storing the image data of the multiple block lines in Sys$ or SysBuf as well as in DRAM, power consumption is reduced whenever the image data is read from Sys$ or SysBuf.
The motion estimation step of the video encoding apparatus requires the DRAM to provide large bandwidth, because the search reads certain associated areas (that is, the first area) of the reconstructed frame image of the historical frame image for block matching. For cost reasons, the block lines covered by the search range (that is, the first area) are usually not stored in full inside the video encoding apparatus. Usually only the portion needed within the search range (for example, the search window) is stored internally, to meet the high-speed data access requirements of motion estimation.
If all the image data of the first area were stored inside the hardware of the video encoding apparatus, that is, in a cache or buffer composed of multiple SRAMs, the internal SRAM would have to be divided into more units, each unit being a bank, and the area of each bank would grow. As the area of each bank grows, the total SRAM area grows while its storage capacity stays the same, which raises cost. For example, with a picture width of 8192 pixels and a vertical search range of ±64, the 8-bit luma component alone requires at least 1 megabyte (MB) of storage. In addition, the motion estimation algorithm requires the SRAM to be divided into more units to meet its data access needs, which further increases the SRAM area.
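The 1 MB figure can be checked with a short calculation. The sketch below is illustrative only: the function name and the optional `block_height` parameter are assumptions not found in the application, and it counts one byte per 8-bit luma sample.

```python
def search_band_bytes(width_px, search_up, search_down,
                      block_height=0, bytes_per_sample=1):
    """Bytes needed to hold every luma row covered by the vertical search range.

    block_height is a hypothetical optional parameter: set it to also count
    the rows co-located with the block being encoded.
    """
    rows = search_up + block_height + search_down
    return width_px * rows * bytes_per_sample


# 8192-pixel-wide picture, vertical search range of +/-64, 8-bit luma.
size = search_band_bytes(8192, 64, 64)
print(size, "bytes =", size / 2**20, "MiB")
```

Counting the rows of a 16-pixel-high block as well (`block_height=16`) raises the requirement further, which is consistent with the text's "at least".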
It should be noted that, in a motion estimation design that stores data in the form of a search window, compressing one block line requires the data of multiple block lines of the reconstructed frame image of the historical frame image. This means that processing one frame of data requires reading several frames' worth of data.
For example, when the video encoding apparatus performs a motion vector search (see FIG. 9), the bandwidth demanded within the search window is large, so the image data of the search window is usually stored in a cache or buffer inside the video encoding apparatus. This cache or buffer consists of a finely divided group of SRAMs, which can supply enough data bandwidth to the motion estimation circuit. Finer division means a larger area per unit of storage; for example, the average area occupied by 1 bit in a bank is larger than the average area occupied by 1 bit in the SRAM. This not only makes the SRAM area large, but the large number of banks also makes layout routing more difficult. Therefore the block lines of the entire first area are not all implemented with this approach, whose area per storage unit (for example, per bit) is high.
That is, each time the block lines covered by the first area move down by one block line during encoding, the first area is fetched again. The vertical search range is usually several times the height of the block to be encoded, so the bandwidth for reading the image data of the first area is several times the bandwidth for writing it. The situation becomes more severe when the picture to be encoded reaches 4K or 8K. The resolution of a 4K picture is 3840×2160 pixels, and the resolution of an 8K picture is 7680×4320 pixels. When encoding 4K and 8K pictures, the vertical search range must be somewhat larger than for 1080P; otherwise the degree of picture compression is greatly reduced.
In this embodiment of the present application, reading the image data of the first area from the second memory and storing it in the first memory in 203 may include:
if the first area moves down by one block line in the reconstructed frame image of the historical frame image, reading the image data of the block line newly covered by the move from the second memory and storing it in the first memory;
removing from the first memory the block lines that are not used when encoding the next block line to be encoded.
For example, the block-line area (that is, the first area) of the reconstructed frame image of the historical frame image that needs to be read repeatedly may be stored in advance in Sys$ or SysBuf outside the video encoding apparatus. Please refer to FIG. 11 to FIG. 13 together. FIG. 11 is a schematic structural diagram of a video compression system using a system cache provided by an embodiment of the present application. FIG. 12 is another schematic structural diagram of a video compression system using a system cache provided by an embodiment of the present application. FIG. 13 is a schematic structural diagram of a video compression system using a system buffer provided by an embodiment of the present application. The Sys$ or SysBuf stores the image data of the first area, and the cache or buffer inside the video encoding apparatus stores the image data of the search window. In FIG. 11 to FIG. 13, n is a number indicating storage capacity. In one embodiment, the DRAM reads and writes data at 0.5 GB/s to 2 GB/s, the Sys$ or SysBuf at 3 GB/s to 8 GB/s, and the cache or buffer at 10 GB/s to 50 GB/s.
It should be noted that, in other embodiments, the read/write speeds of the DRAM, of the Sys$ or SysBuf, and of the cache or buffer may take other values, provided that the cache or buffer is faster than both the Sys$ or SysBuf and the DRAM, and the Sys$ or SysBuf is faster than the DRAM.
Taking FIG. 11 as an example, Sys$ can read data from DRAM through DramC, and the data so read can be accessed by the central processing unit, the video encoding apparatus, the image processor and the neural network processor. When the first area moves down by one block line in the reconstructed frame image of the historical frame image, both Sys$ and DRAM store the new block line, and Sys$ simultaneously removes the block lines that are no longer needed. When the video encoding apparatus needs to encode, it can directly read the data of the first area stored in Sys$. In addition, Sys$ can also read image data of the first area from DRAM through DramC, which is then read by the video encoding apparatus.
During storage, the Sys$ or SysBuf outside the video encoding apparatus removes the block lines that will not be used when encoding the next block line to be encoded. As a result, the number of times the cache or buffer inside the video encoding apparatus causes the image data of the first area to be read from DRAM drops from several times to once. Because a DRAM access consumes about 100 times the energy of an SRAM access, this greatly reduces power consumption.
For example, the location at which the reconstructed frame image of the historical frame image is read (that is, the first area) and the access behaviour (repeated reading) are both predictable, and reading the reconstructed frame image takes several times the bandwidth needed to read the current frame image. Storing the repeatedly read image data of the first area in a low-power storage space such as Sys$ or SysBuf therefore keeps the video encoding apparatus operating effectively while greatly reducing the power consumption of the whole system, which improves the user experience. The number of relevant block lines of the reconstructed frame image to place in such low-power storage can be decided according to the structure of the reconstructed frame image of the historical frame image compressed by the video encoding apparatus. Each time the block line to be encoded moves down by one line, the topmost block line stored in Sys$ or SysBuf is removed, and the block line newly added to the first area by the move is read and its image data stored.
Please refer to FIG. 14, which is a schematic diagram of a scene in which the reconstructed frame image of the historical frame image moves down by one block line, provided by an embodiment of the present application. Each time the video encoding apparatus encodes one block line further down, it evicts the no-longer-relevant block line at the top of the first area stored in Sys$ or SysBuf and sends the image data of the newly needed block line into Sys$ or SysBuf. That is, as the first area covered by the search range moves down together with the block to be encoded, the area that is no longer used is evicted from the Sys$ or SysBuf outside the video encoding apparatus, and the block line about to be used is stored there.
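The evict-then-store behaviour described above can be sketched as a fixed-capacity sliding window. This is a minimal illustration, not the application's implementation: the class name `BlockLineWindow`, the `fetch` callback standing in for a read from the second memory, and the nine-line capacity are all assumptions.

```python
from collections import deque


class BlockLineWindow:
    """Sliding window of reference block lines held in a low-power memory."""

    def __init__(self, capacity):
        self.capacity = capacity      # block lines covered by the first area
        self.lines = deque()          # (index, data) pairs, oldest (topmost) first
        self.second_memory_reads = 0  # fetches from the higher-power memory

    def move_down(self, new_index, fetch):
        """Evict the topmost line if the window is full, then store the new line."""
        evicted = None
        if len(self.lines) == self.capacity:
            evicted = self.lines.popleft()[0]
        self.lines.append((new_index, fetch(new_index)))
        self.second_memory_reads += 1
        return evicted

    def resident(self):
        """Indices of the block lines currently held, top to bottom."""
        return [index for index, _ in self.lines]


# Hypothetical usage: a first area nine block lines tall, twelve lines processed.
window = BlockLineWindow(capacity=9)
for i in range(12):
    window.move_down(i, fetch=lambda idx: f"block-line-{idx}")
print(window.resident(), window.second_memory_reads)
```

In this sketch each block line is fetched from the higher-power memory exactly once, however many times it is later read from the window.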
204. Read the image data of the first area from the first memory block line by block line.
For example, when the video encoding apparatus needs to perform encoding, it can read the image data of the first area from the first memory, such as Sys$ or SysBuf, block line by block line, that is, in order from top to bottom.
Taking Advanced Video Coding (H.264/AVC) as an example, assume that the macroblock in FIG. 14 is 16×16 pixels, that is, 16 pixels horizontally by 16 pixels vertically; the macroblock may also be 32×32 pixels, 64×64 pixels, and so on. The macroblock is the block to be encoded in the current frame image. With a vertical search range of ±64, the reconstructed frame image of the historical frame image is read 9 (= (64+16+64)/16) times as often as the current frame image.
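The factor of 9 follows directly from the search geometry. The helper below is a hypothetical sketch (its name and the rounding up for non-integer ratios are assumptions) that reproduces the calculation for other block heights.

```python
import math


def reference_read_factor(search_up, search_down, block_height):
    """How many times each reference block line is read, relative to the
    current frame, when the search window slides down one block line at a time."""
    rows = search_up + block_height + search_down
    # Rounding up for non-integer ratios is an assumption of this sketch.
    return math.ceil(rows / block_height)


print(reference_read_factor(64, 64, 16))
```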
205. If the number of reads from the first memory is greater than or equal to a preset count threshold, read the image data of the not-yet-read block lines of the first area from the second memory block line by block line.
For example, when the image data of the first area is read, the reads that would originally all have gone to the second memory can be divided so that the first several are served from the first memory and the remaining ones from the second memory. Provided the minimum required Sys$ or SysBuf capacity is available, the 9 reads from DRAM can be split into 1 read from DRAM and 8 reads from Sys$ or SysBuf. With the assistance of Sys$ or SysBuf, the power consumed reading data falls to 11.81% of the unassisted figure, that is, (1×640+8×5)/(9×640) ≈ 11.81%. Depending on requirements, the 9 reads can instead be split into 2 reads from DRAM and 7 reads from Sys$ or SysBuf, and so on. Where power requirements are stringent, all 9 reads can be served from Sys$ or SysBuf, which gives the lowest power consumption but raises cost.
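The trade-off between the different splits can be tabulated with a few lines of code. This is an illustrative sketch: the function name is hypothetical, and the per-read energies 640 and 5 are simply the relative values that appear in the formula in the text, whose units the application does not state.

```python
def relative_read_energy(second_reads, first_reads, e_second=640, e_first=5):
    """Energy of a split read plan as a fraction of reading everything
    from the second memory (DRAM in the example).

    e_second and e_first are relative per-read energies taken from the
    formula in the text; their units are not stated there.
    """
    total_reads = second_reads + first_reads
    split = second_reads * e_second + first_reads * e_first
    baseline = total_reads * e_second
    return split / baseline


# The three splits mentioned in the text: (DRAM reads, Sys$/SysBuf reads).
for dram, sram in [(1, 8), (2, 7), (0, 9)]:
    print(dram, sram, f"{relative_read_energy(dram, sram):.2%}")
```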
Because SRAM is relatively expensive and DRAM relatively cheap, SRAM is generally not made very large when cost is a consideration, whereas DRAM can be made fairly large. To reduce the power consumed when reading data, the embodiments of the present application therefore split the reads that would originally all have gone to DRAM into some reads from SRAM and some reads from DRAM, which reduces the overall power consumed reading data. The number of reads from SRAM and the number of reads from DRAM can be adjusted to suit different power requirements.
For example, the image data of the first area can first be read from Sys$ or SysBuf; when the number of reads is greater than or equal to the preset count threshold, reading switches to DRAM for the image data of the block lines of the first area that have not yet been read. For the same image data, DRAM consumes about 100 times the energy that SRAM consumes. Reading part of the image data of the first area from Sys$ or SysBuf and the rest from DRAM therefore reduces the power consumed reading data.
Please refer to FIG. 15, which is a schematic diagram of the power consumption curve when reading and writing data from a multi-channel DRAM, provided by an embodiment of the present application. In FIG. 15, the abscissa is the position within the reconstructed frame image of the historical frame image (for example, the top, middle or bottom of the image), and the ordinate is the power consumed reading and writing data during video encoding. When the video encoding apparatus relies too heavily on DRAM or other cheap but power-hungry storage at high bandwidth, the upper limit of the power that the video compression system can supply is finite, so either the video encoding apparatus cannot meet its speed requirement or the video compression system overheats. If the power limit is respected, the read/write speed is constrained and cannot reach the speed attainable without the limit.
Please refer to FIG. 16, which is a schematic diagram of the power consumption curve when reading and writing data from Sys$ or SysBuf and from DRAM respectively, provided by an embodiment of the present application. The video encoding apparatus replaces a large share of DRAM power consumption with Sys$ or SysBuf power consumption, which greatly reduces power consumption.
206. Determine the image data of a search window from the image data read for the first area, the search window being located within the first area.
For example, to narrow the search range further, a search window can be determined within the first area. The matching block is then searched for within this smaller range, which further reduces power consumption. For motion estimation, a non-hierarchical search uses the non-downscaled reconstructed frame image of the historical frame image, and a hierarchical search uses a downscaled or non-downscaled reconstructed frame image of the historical frame image. The approach applies to any search window whose vertical position can be predicted.
Please refer to FIG. 17, which is a schematic diagram of the search range of the search window in the reconstructed frame image of the historical frame image, provided by an embodiment of the present application. As FIG. 17 shows, the search window is located within the first area, each region between adjacent dashed lines in the first area is a block line, and the searched motion vector can point anywhere within the search window. L, R, T and B are the search ranges of the search window to the left of, to the right of, above and below the block to be encoded, respectively. R and B are positive and L and T are negative. L is not necessarily equal to R in magnitude, and T is not necessarily equal to B in magnitude.
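The sign convention for L, R, T and B can be made concrete with a small helper that turns a block position into a search-window rectangle. This is a sketch under assumptions: the function name, the pixel-coordinate convention (origin at the top-left, exclusive right and bottom edges) and the clamping to the frame boundary are not taken from the application.

```python
def search_window(x, y, block_w, block_h, L, R, T, B, frame_w, frame_h):
    """Return (left, top, right, bottom) of the search window in pixels.

    (x, y) is the top-left corner of the block to be encoded. L and T are
    negative offsets, R and B positive, as in the text. Clamping to the
    frame is an assumption of this sketch; right/bottom are exclusive.
    """
    left = max(0, x + L)
    top = max(0, y + T)
    right = min(frame_w, x + block_w + R)
    bottom = min(frame_h, y + block_h + B)
    return left, top, right, bottom


# Asymmetric horizontal range (|L| != R) on a 1920x1080 frame.
print(search_window(64, 64, 16, 16, L=-32, R=48, T=-64, B=64,
                    frame_w=1920, frame_h=1080))
```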
207. Store the image data of the search window in a third memory, the read/write speed of the third memory being greater than a second preset multiple of the read/write speed of the first memory.
For example, after the image data of the search window is determined, it is stored in a third memory. The third memory may be a memory inside the video encoding apparatus and may include a cache or buffer disposed inside the video encoding apparatus. Because motion estimation searches the blocks within the search window and block search demands high bandwidth, the read/write speed of the third memory is greater than those of both the first memory and the second memory, to meet the search speed and bandwidth requirements. Specifically, the read/write speed of the third memory is greater than a second preset multiple of the read/write speed of the first memory.
208. Read the image data of the search window from the third memory and, according to that image data, determine from the search window the block with the smallest coding cost relative to the block to be encoded.
For example, during the search, the image data of the search window is read from the third memory, and the search can be hierarchical or non-hierarchical. According to the image data of the search window, the blocks read from the search window are compared with the block to be encoded to determine the block with the smallest coding cost relative to the block to be encoded. In one embodiment the coding cost may include the residual; in another embodiment it may include the block vector and the residual, and so on. Thus the block with the smallest coding cost may be the block with the smallest residual relative to the block to be encoded, or the block with the smallest coding cost once both the block vector and the residual relative to the block to be encoded are taken into account.
For example, for motion estimation, the search window is scanned block line by block line and each searched block is compared with the block to be encoded, so that the block with the smallest residual relative to the block to be encoded can be found in the search window. The motion vector may be the relative displacement between the found block and the block to be encoded. The residual may be the difference obtained by subtracting the two-dimensional pixels at the corresponding positions of the found block from the two-dimensional pixels of the block to be encoded.
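A minimal sketch of such a search is given below. It is not the application's implementation: it assumes the residual is summarized as a sum of absolute differences (SAD), and the optional vector term `lam * (|mv_x| + |mv_y|)` is a common simplification introduced here to illustrate the "block vector and residual" embodiment. All function and parameter names are hypothetical, and frames are plain lists of rows of luma values.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry)."""
    return sum(abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))


def full_search(cur, ref, bx, by, n, window, lam=0):
    """Scan the search window row by row and return (cost, mv_x, mv_y)
    for the candidate block with the smallest cost.

    window is (left, top, right, bottom) in pixels, right/bottom exclusive.
    lam weights the vector length; lam=0 gives a residual-only cost.
    """
    left, top, right, bottom = window
    best = None
    for ry in range(top, bottom - n + 1):
        for rx in range(left, right - n + 1):
            mv_x, mv_y = rx - bx, ry - by
            cost = sad(cur, ref, bx, by, rx, ry, n) + lam * (abs(mv_x) + abs(mv_y))
            if best is None or cost < best[0]:
                best = (cost, mv_x, mv_y)
    return best
```

The returned vector is the displacement from the block to be encoded to the best candidate, matching the definition of the motion vector in the text; ties are resolved in favour of the first candidate met in scan order.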
For example, in one embodiment, reading the image data of the search window from the third memory and determining from the search window, according to that image data, the block with the smallest coding cost relative to the block to be encoded in 208 may include:
reading the image data of the search window from the third memory;
downscaling the search window according to a preset number of levels to obtain a downscaled search window;
determining from the downscaled search window, according to the image data of the downscaled search window, the downscaled block with the smallest coding cost relative to the block to be encoded;
determining from the search window, according to the position of the downscaled block in the downscaled search window, the block with the smallest coding cost relative to the block to be encoded.
For example, the search can be hierarchical, and the number of search stages depends on the number of levels: a 2-level search performs two stages of searching, and a 3-level search performs three. The more levels there are, the more accurate the search result, but the more system computing resources are consumed. In practice, a suitable number of levels can be set according to specific requirements. It should be noted that the downscaling ratio of each level may be the same or different.
For example, after the image data of the search window is read from the third memory, the search window is downscaled according to the preset number of levels. With 2 levels, the downscaled search window is 1/2 the size of the original search window. Then, according to the image data of the downscaled search window, the downscaled block with the smallest coding cost relative to the block to be encoded is determined from the downscaled search window; the downscaled block and the downscaled search window share the same downscaling ratio. The approximate location of the block is thus first decided on the downscaled search-window image, and a finer block search is then performed on the non-downscaled search-window image around that location, so that the block with the smallest coding cost relative to the block to be encoded can be determined from the non-downscaled search window.
As another example, after the image data of the search window is read from the third memory, the search window is reduced according to the preset number of layers. With three layers, the most reduced search window is 1/4 the size of the original search window. According to the image data of the 1/4-size search window, the reduced block with the smallest coding cost relative to the block to be encoded is determined from the 1/4-size search window, which gives the block vector at that level. A finer search over a smaller range is then performed in the 1/2-size search window. Finally, starting from the block vector obtained in the 1/2-size search window, the original-size search window is searched to obtain the final block vector, so that the block with the smallest coding cost relative to the block to be encoded can be determined.
As yet another example, after the image data of the search window is read from the third memory, the search window is reduced according to the preset number of layers. With three layers, the most reduced search window may be 1/6 the size of the original search window. According to the image data of the 1/6-size search window, the reduced block with the smallest coding cost relative to the block to be encoded is determined from the 1/6-size search window, which gives the block vector at that level. A finer search over a smaller range is then performed in the 1/3-size search window. Finally, starting from the block vector obtained in the 1/3-size search window, the original-size search window is searched to obtain the final block vector, so that the block with the smallest coding cost relative to the block to be encoded can be determined.
It can be seen that the approximate range of the block is first determined on the image of the reduced search window, and a finer block search is then performed on the image of the unreduced search window. In other words, the original search window is searched more finely around the approximate position that the reduced block occupies in the reduced search window, so that the block with the smallest coding cost relative to the block to be encoded can be determined from the unreduced search window.
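As an illustrative, non-limiting sketch, the coarse-to-fine search described above might be written as follows. The application does not specify the cost function, the downscaling filter, or the refinement radius; here the coding cost is assumed to be a sum of absolute differences (SAD), downscaling is assumed to be tile averaging, and all function and parameter names (`downscale`, `sad`, `hierarchical_search`, `factors`, `refine`) are hypothetical. Each reduction factor is assumed to divide the previous one and the image dimensions.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]


def downscale(img: Image, factor: int) -> Image:
    """Shrink an image by averaging each factor x factor tile (assumed filter)."""
    h, w = len(img) // factor, len(img[0]) // factor
    return [[sum(img[y * factor + dy][x * factor + dx]
                 for dy in range(factor) for dx in range(factor)) // (factor * factor)
             for x in range(w)] for y in range(h)]


def sad(window: Image, block: Image, top: int, left: int) -> int:
    """Assumed coding cost: sum of absolute differences at offset (top, left)."""
    return sum(abs(window[top + y][left + x] - block[y][x])
               for y in range(len(block)) for x in range(len(block[0])))


def hierarchical_search(window: Image, block: Image,
                        factors: Sequence[int] = (2, 1),
                        refine: int = 1) -> Tuple[int, int]:
    """Search coarse to fine; `factors` lists the reduction per level, ending at 1."""
    pos = None
    prev = None
    for f in factors:
        win_f = downscale(window, f) if f > 1 else window
        blk_f = downscale(block, f) if f > 1 else block
        max_top = len(win_f) - len(blk_f)
        max_left = len(win_f[0]) - len(blk_f[0])
        if pos is None:
            # Coarsest level: examine every position of the reduced window.
            cands = [(t, l) for t in range(max_top + 1) for l in range(max_left + 1)]
        else:
            # Finer level: scale the previous result up and search only nearby.
            scale = prev // f
            ct, cl = pos[0] * scale, pos[1] * scale
            r = refine * scale
            cands = [(t, l)
                     for t in range(max(0, ct - r), min(max_top, ct + r) + 1)
                     for l in range(max(0, cl - r), min(max_left, cl + r) + 1)]
        pos = min(cands, key=lambda p: sad(win_f, blk_f, p[0], p[1]))
        prev = f
    return pos
```

Passing `factors=(2, 1)` corresponds to the two-layer case and `factors=(4, 2, 1)` or `factors=(6, 3, 1)` to the three-layer cases; only the coarsest level is searched exhaustively, which is where the saving in comparisons comes from.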
209. Use the block with the smallest coding cost relative to the block to be encoded as the matching block.
For example, once the block with the smallest coding cost (for example, the smallest residual) relative to the block to be encoded has been found in the search window, that block is used as the matching block.
210. Encode the block to be encoded according to the motion vector and the residual between the matching block and the block to be encoded.
For example, the relative relationship between the matching block and the block to be encoded may be expressed as a motion vector and a residual. After the matching block is found, the block to be encoded can be encoded according to the motion vector and the residual between the matching block and the block to be encoded.
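As a minimal sketch of these two quantities, assuming blocks are stored as row-major lists of integer samples and positions are (row, column) pairs, the motion vector and residual might be computed as follows. The function names are hypothetical and are not taken from the application.

```python
from typing import List, Tuple

Block = List[List[int]]


def motion_vector(cur_pos: Tuple[int, int], match_pos: Tuple[int, int]) -> Tuple[int, int]:
    """Displacement (dy, dx) from the block to be encoded to the matching block."""
    return (match_pos[0] - cur_pos[0], match_pos[1] - cur_pos[1])


def residual(cur_block: Block, match_block: Block) -> Block:
    """Per-sample difference: block to be encoded minus matching block."""
    return [[c - m for c, m in zip(cur_row, match_row)]
            for cur_row, match_row in zip(cur_block, match_block)]
```

For a block at (8, 8) whose matching block lies at (6, 11), the motion vector under this convention is (-2, 3).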
In one implementation, encoding the block to be encoded according to the motion vector and the residual between the matching block and the block to be encoded in 210 may include:
performing forward transform and quantization (Forward Transform & Quantization, FTQ) on the residual between the matching block and the block to be encoded;
performing entropy coding (Entropy Coding, EC) on the motion vector between the matching block and the block to be encoded and on the first residual data obtained after forward transform and quantization, to obtain encoded video stream data; or
performing inverse quantization and inverse transform on the first residual data obtained after forward transform and quantization, to obtain second residual data;
reconstructing the block to be encoded according to the second residual data.
Please refer to FIG. 18, which is a schematic diagram of an encoding scenario of the video encoding device provided by an embodiment of the present application. FIG. 18 shows the data-flow relationship between motion estimation and the other modules of the video encoding device. For example, motion estimation (using either hierarchical or non-hierarchical search) searches the reconstructed frame images of multiple historical frame images and finds a matching block. The relative displacement between the matching block and the current block (that is, the block to be encoded) is the motion vector, and the residual is obtained from the error between the current block and the matching block. The residual then undergoes forward transform and quantization. The forward transform uses a fast Fourier transform (FFT) to obtain a spectrum, in which the abscissa is frequency and the ordinate is energy. The forward transform converts spatial-domain pixels into spectral coefficients that are uncorrelated and energy-concentrated; it only moves the data into the frequency domain without changing the amount of data, which can reduce distortion. Quantization is performed by dividing each element of the forward-transformed matrix by the value at the corresponding position of the quantization matrix. The spectral coefficients are further compressed by quantization and entropy coding to obtain the compressed video stream. Because quantization discards some unimportant high-frequency information and thereby reduces the amount of image data, it is the key step in compression. The first residual data is obtained after forward transform and quantization.
The first residual data obtained after forward transform and quantization is converted back to the spatial domain by inverse quantization and inverse transform (De-Quantization & Inv. Transform, DQIT), which gives the second residual data between the matching block and the block to be encoded. The block to be encoded of the current frame image then undergoes block reconstruction (Block Reconstruction, BlkRec) and serves as a neighbor of the next block to be encoded. The in-loop filter (In-loop Filter, InF) handles continuity between blocks to make the picture smoother. A commonly used loop filter is a linear low-pass filter, which removes high-frequency components and noise. Forward transform and quantization remove spatial redundancy in the video image, and entropy coding removes coding redundancy.
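The quantization, inverse quantization, and reconstruction steps can be illustrated with a short sketch. This is a simplification under stated assumptions: the forward and inverse transforms are omitted (treated as the identity), entropy coding is not shown, reconstruction is assumed to add the second residual data to the matching block and clip to an 8-bit range, and the quantization matrix values are invented for illustration. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def quantize(coeffs: Matrix, qmatrix: Matrix) -> Matrix:
    """Divide each coefficient by the co-located quantization step and round."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, qmatrix)]


def dequantize(levels: Matrix, qmatrix: Matrix) -> Matrix:
    """Multiply each quantized level by the co-located quantization step."""
    return [[lv * q for lv, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, qmatrix)]


def reconstruct(prediction: Matrix, second_residual: Matrix, max_value: int = 255) -> Matrix:
    """Add the dequantized residual to the matching block and clip to [0, max_value]."""
    return [[min(max_value, max(0, p + r)) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, second_residual)]


# Illustrative values only; the transform step is skipped in this sketch.
coeffs = [[50, -7], [3, 1]]
qmatrix = [[8, 10], [10, 16]]
first_residual = quantize(coeffs, qmatrix)             # what would be entropy coded
second_residual = dequantize(first_residual, qmatrix)  # what the decoder would also see
rebuilt = reconstruct([[100, 100], [100, 100]], second_residual)
```

The difference between `coeffs` and `second_residual` is the information discarded by quantization, which is why the encoder reconstructs from the second residual data rather than from the original residual.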
It can be understood that the embodiments of the present application rely on the predictable data access behavior of video encoding (that is, repeated reads) to select the data storage location intelligently and thereby reduce the power consumption of the video encoding device. Whether the data to be read is first stored in the low-power Sys$ or SysBuf can be decided according to the frame reference reading strategy used during encoding, so that the reconstructed frame images of some or all historical frame images stored in Sys$ or SysBuf are the ones read repeatedly the most often. This minimizes power consumption and keeps the video encoding device in its lowest-power state whenever data moves in or out. If the Sys$ or SysBuf also offers high bandwidth, it can supply the bandwidth needed for the repeated reads, which further reduces the bandwidth demanded from the DRAM.
The embodiments of the present application keep the power consumption of the video encoding device controllable and allow its hardware or software to finish encoding as quickly as possible. They exploit the predictable behavior that the video encoding device repeatedly reads the image data of the first region in order to change the storage characteristics of the data being read. Because data access consumes less power, the video encoding device can maintain its operating speed while reducing power consumption. The data read speed is not throttled by power limits, so the device does not overheat. In addition, the SRAM in Sys$ or SysBuf has inherently low read/write latency, which raises the processing frame rate and lowers response latency. The large reduction in power consumption also extends the battery life of the video encoding device and improves the user experience.
Please refer to FIG. 19, which is a third schematic flowchart of the method for performing image processing in a video encoding device provided by an embodiment of the present application. The method can be applied to a video encoding device or the like, and its flow may include:
301. Determine the block to be encoded from the current frame image.
For the specific implementation of step 301, refer to the embodiment of step 201; details are not repeated here.
302. Determine, from the reconstructed frame images of multiple historical frame images stored in the second memory, multiple first regions that need to be read repeatedly.
For example, after the block to be encoded is determined, multiple first regions that need to be read repeatedly are determined from the reconstructed frame images of multiple historical frame images; that is, a first region that needs to be read repeatedly is determined in the reconstructed frame image of each historical frame image. A first region may include multiple block rows, and each block row includes multiple blocks arranged in a row.
The preset memory includes a first memory and a second memory. It should be noted that the reconstructed frame images of the multiple historical frame images may be stored in the second memory in advance, and the multiple first regions that need to be read repeatedly are then determined from the reconstructed frame images stored in the second memory.
303. Read the image data of the multiple first regions from the second memory and store it in the first memory.
For example, after the multiple first regions that need to be read repeatedly are determined from the reconstructed frame images of the multiple historical frame images stored in the second memory, the image data of the multiple first regions can be read from the second memory and stored in the first memory, where it waits to be read when the video encoding device performs encoding.
It should be noted that, in the embodiments of the present application, the power consumption of the second memory is, for example, greater than a first preset multiple of the power consumption of the first memory. After the multiple first regions that need to be read repeatedly are determined from the reconstructed frame images of the multiple historical frame images stored in the second memory, the image data of the multiple first regions is read from the second memory and stored in the first memory. The total power consumed by reading and writing data in the first memory and the second memory is then less than the preset power consumption threshold, which reduces the power consumed when reading data.
For example, the first memory may be Sys$ or SysBuf and the second memory may be DRAM, where the energy consumption of the DRAM is greater than the first preset multiple of the energy consumption of Sys$ or SysBuf. Referring to FIG. 10, reading SRAM and reading DRAM differ in energy by a factor of about 100; that is, reading SRAM takes far less energy than reading DRAM. By storing the image data of the multiple first regions in Sys$ or SysBuf (which is composed of multiple SRAMs) as well as in DRAM, and reading the image data of the first regions from Sys$ or SysBuf and from DRAM respectively, the overall power consumed when reading data can be reduced.
304. Read the image data of the multiple first regions from the first memory, block row by block row.
For example, the image data of the multiple first regions can be read from the first memory. The number of reads from the first memory may be greater than, less than, or equal to the number of reads from the second memory. How many reads are made from each memory is set according to the specific scenario and is not specifically limited in the embodiments of the present application.
For example, in one implementation, the image data of the multiple first regions is first read from Sys$ or SysBuf (which is composed of multiple SRAMs). The data is read block row by block row, that is, the block rows in a first region are read in order from top to bottom. When the number of reads from Sys$ or SysBuf is greater than or equal to a preset count threshold, reading switches to the DRAM for the remaining data. For the same data access, DRAM consumes about 100 times the energy of SRAM. Therefore, reading part of the image data of the multiple first regions from Sys$ or SysBuf and the rest from DRAM reduces the power consumed by reading data. As can be seen from FIG. 16, the video encoding device replaces a large share of DRAM power consumption with the power consumption of Sys$ or SysBuf, which greatly reduces power consumption.
305. If the number of reads from the first memory is greater than or equal to the preset count threshold, read the image data of the unread block rows of the multiple first regions from the second memory, block row by block row.
For the specific implementation of step 305, refer to the embodiment of step 205; details are not repeated here.
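As a rough sketch of steps 304 and 305, the switch between memories and its effect on read energy might be modeled as follows. The relative costs of 1 and 100 follow the approximate SRAM-to-DRAM ratio stated above; the function names, the `"first"`/`"second"` labels, and the idea of counting one read per block row are assumptions made for illustration.

```python
from typing import List, Tuple


def plan_reads(num_block_rows: int, first_mem_threshold: int) -> List[Tuple[int, str]]:
    """Assign each block-row read, top to bottom, to the first or second memory.

    Rows are read from the first memory (Sys$ or SysBuf) until the number of
    reads reaches the threshold; remaining rows are read from the second (DRAM).
    """
    plan = []
    reads_from_first = 0
    for row in range(num_block_rows):
        if reads_from_first < first_mem_threshold:
            plan.append((row, "first"))
            reads_from_first += 1
        else:
            plan.append((row, "second"))
    return plan


def relative_energy(plan: List[Tuple[int, str]],
                    first_cost: int = 1, second_cost: int = 100) -> int:
    """Total read energy in arbitrary units, using the assumed per-read costs."""
    return sum(first_cost if mem == "first" else second_cost for _, mem in plan)


mixed = plan_reads(num_block_rows=5, first_mem_threshold=3)
all_dram = plan_reads(num_block_rows=5, first_mem_threshold=0)
```

With these assumed costs, reading three of five block rows from the first memory costs 3 + 200 = 203 units, against 500 units when every row comes from DRAM.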
306. Determine the image data of multiple search windows from the read image data of the multiple first regions, where each search window is located within its corresponding first region.
For example, after the video encoding device reads the image data of the multiple first regions from the first memory and the second memory, it can determine the image data of a search window from the image data of each first region. That is, one search window can be determined in the first region of the reconstructed frame image of each historical frame image, giving multiple search windows, each located within its corresponding first region. For how a search window is determined from the first region of the reconstructed frame image of each historical frame image, refer to the embodiment of step 206; details are not repeated here.
307. Store the image data of the multiple search windows in a third memory, where the read/write speed of the third memory is greater than a second preset multiple of the read/write speed of the first memory.
For example, after the image data of the multiple search windows is determined, it can be stored in the third memory, whose read/write speed is greater than the second preset multiple of the read/write speed of the first memory. For example, the third memory may be a cache or a buffer whose read/write speed is greater than the second preset multiple of the read/write speed of Sys$ or SysBuf.
308. Read the image data of the multiple search windows from the third memory and, according to that image data, determine from each of the multiple search windows one or more blocks with the smallest coding cost relative to the block to be encoded.
For example, for the image data of each search window that has been read, the blocks in each block row of the search window are compared with the block to be encoded to obtain the coding cost of each block relative to the block to be encoded. One or more blocks can then be selected in ascending order of coding cost; that is, one or more blocks with the smallest coding cost relative to the block to be encoded are determined from each search window. For motion estimation, for instance, the current search window is scanned block row by block row, each block found is compared with the block to be encoded, and the one or more blocks with the smallest coding cost relative to the block to be encoded are found in the current search window.
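The per-window search can be sketched as an exhaustive scan that keeps the lowest-cost candidates. This is an illustration under assumptions: the coding cost is taken to be a sum of absolute differences (SAD), which the application does not mandate, every sample offset is treated as a candidate, and the names `sad` and `k_best_blocks` are hypothetical.

```python
import heapq
from typing import List, Tuple

Image = List[List[int]]


def sad(window: Image, block: Image, top: int, left: int) -> int:
    """Assumed coding cost: sum of absolute differences at offset (top, left)."""
    return sum(abs(window[top + y][left + x] - block[y][x])
               for y in range(len(block)) for x in range(len(block[0])))


def k_best_blocks(window: Image, block: Image, k: int = 1) -> List[Tuple[int, Tuple[int, int]]]:
    """Scan the window row by row and return the k lowest-cost (cost, position) pairs."""
    bh, bw = len(block), len(block[0])
    scored = ((sad(window, block, t, l), (t, l))
              for t in range(len(window) - bh + 1)
              for l in range(len(window[0]) - bw + 1))
    # nsmallest returns the candidates sorted from lowest to highest cost.
    return heapq.nsmallest(k, scored)


window = [[0, 0, 0, 0],
          [0, 5, 6, 0],
          [0, 7, 9, 0],
          [0, 0, 0, 0]]
block = [[5, 6],
         [7, 8]]
best_two = k_best_blocks(window, block, k=2)
```

Here `best_two` holds the two candidates in ascending order of cost, with the block at offset (1, 1) first.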
For example, in one implementation, reading the image data of the multiple search windows from the third memory and determining, from each of the multiple search windows, one or more blocks with the smallest coding cost relative to the block to be encoded in 308 may include:
reading the image data of the multiple search windows from the third memory;
reducing the multiple search windows according to the preset number of layers to obtain multiple reduced search windows;
determining, according to the image data of the multiple reduced search windows, one or more reduced blocks with the smallest coding cost relative to the block to be encoded from each of the multiple reduced search windows;
determining, according to the positions of the one or more reduced blocks in the reduced search windows, one or more blocks with the smallest coding cost relative to the block to be encoded from each of the multiple search windows.
For example, the search may be hierarchical, with the number of search levels equal to the number of layers: a two-layer search performs two levels of search, and a three-layer search performs three. More layers give a more accurate search result but also consume more computing resources, so in practice the number of layers is set according to specific requirements. It should be noted that the reduction ratio of each layer may be the same or different.
For example, after the image data of the multiple search windows is read from the third memory, the search windows are reduced according to the preset number of layers. With two layers, each reduced search window is 1/2 the size of its original search window. Then, according to the image data of the reduced search windows, one or more reduced blocks with the smallest coding cost relative to the block to be encoded are determined from each reduced search window; the reduced blocks have the same reduction ratio as the reduced search windows. On the image of each reduced search window, the approximate range of the one or more reduced blocks is determined first, and a finer block search is then performed within that range on the image of the unreduced search window, so that one or more blocks with the smallest coding cost relative to the block to be encoded can be determined from the unreduced search window.
As another example, after the image data of the multiple search windows is read from the third memory, the search windows are reduced according to the preset number of layers. With three layers, each of the most reduced search windows is 1/4 the size of its original search window. According to the image data of the 1/4-size search windows, one or more reduced blocks with the smallest coding cost relative to the block to be encoded are determined from each 1/4-size search window, which gives one or more block vectors for each of them. A finer search over a smaller range is then performed in the 1/2-size search windows. Finally, starting from the one or more block vectors obtained in the 1/2-size search windows, the original-size search windows are searched to obtain the final one or more block vectors, so that one or more blocks with the smallest coding cost relative to the block to be encoded can be determined.
It can be seen that the approximate range of the reduced blocks is first determined on the images of the multiple reduced search windows, and a finer block search is then performed on the images of the multiple unreduced search windows. In other words, each original search window is searched more finely around the approximate position that the reduced block occupies in the reduced search window, so that one or more blocks with the smallest coding cost relative to the block to be encoded can be determined from the multiple unreduced search windows.
309. Determine the matching block from the multiple blocks with the smallest coding cost relative to the block to be encoded.
After one or more blocks with the smallest coding cost relative to the block to be encoded are determined from each search window, the coding costs of these blocks are compared again, and one or more blocks are selected from them in ascending order of coding cost to further refine the search result. It should be noted that the relative displacement between a selected block and the block to be encoded can be used as the motion vector, and the difference between the block to be encoded and the selected block can be used as the residual.
For example, because at least one block with the smallest coding cost relative to the block to be encoded can be found in each search window, searching multiple search windows yields multiple such blocks. From these blocks, one or more can be selected in ascending order of coding cost; usually the one or two blocks with the smallest coding cost are selected as matching blocks. The number of matching blocks may therefore be one, two, or more, depending on the number of reference blocks required. For example, if two reference blocks are required, two matching blocks need to be determined.
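The second-stage comparison across search windows can be sketched as a merge followed by a sort. The data layout is an assumption: each search window is taken to contribute a list of (cost, position) pairs, and the function name `select_matching_blocks` and its parameters are hypothetical.

```python
from typing import List, Tuple

Candidate = Tuple[int, Tuple[int, int]]  # (coding cost, (top, left))


def select_matching_blocks(candidates_per_window: List[List[Candidate]],
                           num_reference_blocks: int = 1
                           ) -> List[Tuple[int, int, Tuple[int, int]]]:
    """Merge per-window candidates and keep the lowest-cost ones overall.

    Returns (cost, window_index, position) tuples in ascending order of cost.
    """
    merged = [(cost, win_idx, pos)
              for win_idx, cands in enumerate(candidates_per_window)
              for cost, pos in cands]
    merged.sort()
    return merged[:num_reference_blocks]


per_window = [
    [(12, (0, 1)), (30, (2, 2))],   # candidates from search window 0
    [(7, (4, 4))],                  # candidates from search window 1
    [(9, (1, 3)), (10, (1, 4))],    # candidates from search window 2
]
matches = select_matching_blocks(per_window, num_reference_blocks=2)
```

Keeping the window index alongside each candidate records which historical frame the matching block came from, which the encoder needs in order to signal the reference.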
310. Encode the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
For the specific implementation of step 310, refer to the embodiment of step 210; details are not repeated here.
It can be understood that the embodiments of the present application can select the target location or attribute of data reads in view of the photographing device's need for long-duration shooting, the need for low heat-dissipation cost, and the large power consumption caused by predictable access behavior. For example, data that needs to be read repeatedly is read partly from Sys$ or SysBuf and partly from DRAM instead of entirely from DRAM. Because SRAM consumes far less power than DRAM when reading the same data, the embodiments of the present application can greatly reduce the power consumed when reading data.
The embodiments of the present application use motion estimation as an example to explain in detail how to reduce the power consumed by reading data. In other implementations, the approach also applies to any module or application that requires high bandwidth but has predictable data access behavior, such as a video decoder or a frame rate up-conversion device. The behavior of these modules and applications, such as the number of repeated reads, is usually predictable. Based on this predictable behavior, storage characteristics can be allocated in advance; that is, data that is read repeatedly is placed in low-power memory. For example, the energy-consumption level of the memory used is chosen according to how many times the image data of all or some frames needs to be accessed, and the numbers of reads from Sys$ or SysBuf and from DRAM are allocated accordingly.
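One possible way to pre-allocate storage from predicted access counts is a greedy placement, sketched below. This policy is an assumption for illustration and is not stated in the application: regions are placed in the low-power memory in descending order of predicted reads while capacity remains, and data read only once stays in DRAM. The region names, sizes, and the function name `assign_storage` are hypothetical.

```python
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]  # (name, size in bytes, predicted number of reads)


def assign_storage(regions: List[Region], low_power_capacity: int) -> Dict[str, str]:
    """Place the most frequently re-read regions in low-power memory first."""
    placement = {}
    free = low_power_capacity
    for name, size, reads in sorted(regions, key=lambda r: r[2], reverse=True):
        if reads > 1 and size <= free:
            placement[name] = "low_power"   # e.g. Sys$ or SysBuf
            free -= size
        else:
            placement[name] = "dram"
    return placement


placement = assign_storage(
    [("ref0_rows", 64, 4), ("ref1_rows", 64, 3),
     ("bitstream", 32, 1), ("ref2_rows", 64, 2)],
    low_power_capacity=128,
)
```

In this example the two most frequently read regions fill the low-power memory, and the remaining regions are left in DRAM.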
如,视频译码器事先解析码流也可以确定存取数据的行为,帧频提升装置可以通过简单分析得知哪些区域在处理时会被用到多次,等等。还可以适用于固定的人工智能(Artificial Intelligence,AI)网络行为,AI网络行为重复读取的部分是特征图(feature map)部分,该AI网络行是可预期的。For example, the video decoder can also determine the behavior of accessing data by analyzing the code stream in advance, and the frame rate boosting device can know which areas will be used multiple times during processing through simple analysis, and so on. It can also be applied to fixed artificial intelligence (AI) network behavior. The repeated reading part of AI network behavior is the feature map part, and the AI network behavior is predictable.
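The trade-off described above can be illustrated with a rough energy model. The following is a minimal sketch only: the function name and the per-read energy figures are hypothetical, the application does not give numeric values, and the cost of first copying the data from DRAM into SRAM is ignored.

```python
def read_energy(total_reads, sram_reads, e_sram, e_dram):
    """Energy spent serving `total_reads` reads of the same data when
    `sram_reads` of them come from SRAM (Sys$ or SysBuf) and the rest
    come from DRAM. Energies are in arbitrary, hypothetical units."""
    if not 0 <= sram_reads <= total_reads:
        raise ValueError("sram_reads must be between 0 and total_reads")
    dram_reads = total_reads - sram_reads
    return sram_reads * e_sram + dram_reads * e_dram


# Hypothetical figures: one SRAM read costs 1 unit, one DRAM read costs 10.
all_from_dram = read_energy(total_reads=6, sram_reads=0, e_sram=1, e_dram=10)
mostly_sram = read_energy(total_reads=6, sram_reads=5, e_sram=1, e_dram=10)
print(all_from_dram, mostly_sram)  # prints "60 15"
```

Under these assumed figures, serving five of six repeated reads from SRAM cuts the read energy to a quarter of the all-DRAM case; the actual ratio depends on the memories used.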
Please refer to FIG. 20, which is a schematic structural diagram of an apparatus for performing image processing in a video encoding apparatus according to an embodiment of the present application. The apparatus 400 for performing image processing in a video encoding apparatus may include a first determination module 401, a second determination module 402, a reading module 403, a third determination module 404, and an encoding module 405.
The first determination module 401 is configured to determine a block to be encoded from a current frame image;
The second determination module 402 is configured to determine, from a reconstructed frame image of a historical frame image, a first region that needs to be read repeatedly, and to store the image data of the first region in a preset memory, where the power consumption of the preset memory is less than a preset power consumption threshold;
The reading module 403 is configured to read the image data of the first region from the preset memory;
The third determination module 404 is configured to determine, according to the read image data of the first region, a matching block in the first region that matches the block to be encoded;
The encoding module 405 is configured to encode the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
In one implementation, the preset memory includes a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the first region includes a plurality of block rows. The second determination module 402 may be configured to:
determine, from the reconstructed frame image of the historical frame image stored in the second memory, the first region that needs to be read repeatedly;
read the image data of the first region from the second memory and store it in the first memory;
The reading module 403 may be configured to:
read the image data of the first region from the first memory block row by block row;
if the number of reads from the first memory is greater than or equal to a preset count threshold, read the image data of the not-yet-read block rows of the first region from the second memory block row by block row.
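The reading behavior above can be sketched as follows. This is an illustrative model only: the class name `TieredReader`, the use of dictionaries to stand in for the two memories, and the counters are all assumptions made for the example, not structures defined by the application.

```python
class TieredReader:
    """Serves block-row reads from a low-power first memory until a preset
    read-count threshold is reached, then from the second memory."""

    def __init__(self, first_memory, second_memory, max_first_reads):
        self.first = first_memory      # block-row index -> data (e.g. Sys$ or SysBuf)
        self.second = second_memory    # block-row index -> data (e.g. DRAM)
        self.max_first_reads = max_first_reads  # hypothetical preset count threshold
        self.first_reads = 0
        self.second_reads = 0

    def read_row(self, row):
        # Below the threshold, read from the first memory if the row is there.
        if self.first_reads < self.max_first_reads and row in self.first:
            self.first_reads += 1
            return self.first[row]
        # Threshold reached (or row not cached): read from the second memory.
        self.second_reads += 1
        return self.second[row]


second = {0: "row0", 1: "row1", 2: "row2", 3: "row3"}
first = dict(second)  # the first region has been copied into the first memory
reader = TieredReader(first, second, max_first_reads=3)
rows = [reader.read_row(r) for r in range(4)]
print(reader.first_reads, reader.second_reads)  # prints "3 1"
```

Choosing the threshold is what lets the number of reads be split between the two memories according to the desired energy consumption level.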
In one implementation, the second determination module 402 may be configured to:
if the first region moves down by one block row in the reconstructed frame image of the historical frame image, read the image data of the block row into which the first region has moved from the second memory and store it in the first memory;
remove from the first memory the block rows that will not be used when encoding the next row of blocks to be encoded.
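The update of the first memory when the first region moves down can be sketched as a sliding window over block rows. The function name `slide_down`, the dictionary representation, and the `rows_needed` parameter are hypothetical and serve only to illustrate the two steps above.

```python
def slide_down(first_memory, second_memory, new_row, rows_needed):
    """Move the cached first region down by one block row.

    first_memory / second_memory: dicts mapping block-row index -> row data.
    new_row: index of the block row the region has just moved onto.
    rows_needed: set of block-row indices used to encode the next row of blocks.
    """
    # Step 1: fetch only the newly covered block row from the second memory.
    first_memory[new_row] = second_memory[new_row]
    # Step 2: evict block rows the next row of blocks will not use.
    for row in [r for r in first_memory if r not in rows_needed]:
        del first_memory[row]


second = {i: f"row{i}" for i in range(6)}
first = {0: "row0", 1: "row1", 2: "row2"}
slide_down(first, second, new_row=3, rows_needed={1, 2, 3})
print(sorted(first))  # prints "[1, 2, 3]"
```

Only one block row is transferred from the second memory per step; the rows that overlap with the previous position stay in the first memory.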
In one implementation, the third determination module 404 may be configured to:
determine the image data of a search window from the read image data of the first region, where the search window is located within the first region;
store the image data of the search window in a third memory, where the read-write speed of the third memory is greater than a second preset multiple of the read-write speed of the first memory;
read the image data of the search window from the third memory and, according to the image data of the search window, determine from the search window the block with the smallest encoding cost relative to the block to be encoded;
use the block with the smallest encoding cost relative to the block to be encoded as the matching block.
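A minimal sketch of the search step follows. It assumes the sum of absolute differences (SAD) as the encoding cost and an exhaustive search over every candidate position; both are illustrative choices, since this passage does not fix a particular cost function or search order. The function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_match(search_window, target):
    """Return (cost, x, y) of the candidate block in `search_window`
    with the smallest SAD relative to `target`."""
    n, m = len(target), len(target[0])  # block height and width
    best = None
    for y in range(len(search_window) - n + 1):
        for x in range(len(search_window[0]) - m + 1):
            candidate = [row[x:x + m] for row in search_window[y:y + n]]
            cost = sad(candidate, target)
            if best is None or cost < best[0]:
                best = (cost, x, y)
    return best


window = [[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]
target = [[6, 7],
          [10, 11]]
print(best_match(window, target))  # prints "(0, 2, 1)"
```

The returned position, taken relative to the position of the block to be encoded, gives the motion vector used later in encoding.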
In one implementation, the third determination module 404 may be configured to:
read the image data of the search window from the third memory;
downscale the search window by a preset number of levels to obtain a downscaled search window;
according to the image data of the downscaled search window, determine from the downscaled search window the downscaled block with the smallest encoding cost relative to the block to be encoded;
according to the position of the downscaled block in the downscaled search window, determine from the search window the block with the smallest encoding cost relative to the block to be encoded.
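The hierarchical search above can be sketched as a coarse search on a downscaled window followed by a small refinement at full resolution. The sketch makes several assumptions that the text does not state: each level halves the resolution by plain subsampling, SAD is the cost, and the refinement radius `refine` is a hypothetical parameter.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def downscale(img, levels):
    """Subsample by a factor of 2 per level (a simple stand-in for filtering)."""
    step = 2 ** levels
    return [row[::step] for row in img[::step]]


def search(window, target, xs, ys):
    """Exhaustive search over the given x and y positions; returns (cost, x, y)."""
    n, m = len(target), len(target[0])
    best = None
    for y in ys:
        for x in xs:
            cost = sad([row[x:x + m] for row in window[y:y + n]], target)
            if best is None or cost < best[0]:
                best = (cost, x, y)
    return best


def hierarchical_match(window, target, levels=1, refine=1):
    step = 2 ** levels
    small_w, small_t = downscale(window, levels), downscale(target, levels)
    # Coarse search over the whole downscaled window.
    _, cx, cy = search(small_w, small_t,
                       range(len(small_w[0]) - len(small_t[0]) + 1),
                       range(len(small_w) - len(small_t) + 1))
    # Map the coarse position back and refine within +/- `refine` pixels.
    x0, y0 = cx * step, cy * step
    max_x = len(window[0]) - len(target[0])
    max_y = len(window) - len(target)
    xs = range(max(0, x0 - refine), min(max_x, x0 + refine) + 1)
    ys = range(max(0, y0 - refine), min(max_y, y0 + refine) + 1)
    return search(window, target, xs, ys)


window = [[y * 8 + x for x in range(8)] for y in range(8)]
target = [row[3:7] for row in window[2:6]]  # the 4x4 block at x=3, y=2
print(hierarchical_match(window, target))  # prints "(0, 3, 2)"
```

The coarse pass evaluates far fewer candidates than a full-resolution search, at the price of possibly missing the global minimum when the downscaled data is ambiguous.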
In one implementation, the relative relationship is a motion vector and a residual, and the encoding module 405 may be configured to:
encode the block to be encoded according to the motion vector and the residual between the matching block and the block to be encoded.
In one implementation, the third determination module 404 may be configured to:
perform forward transform and quantization on the residual between the matching block and the block to be encoded;
perform entropy encoding on the motion vector between the matching block and the block to be encoded and on the forward-transformed and quantized first residual data to obtain encoded video stream data; or
perform inverse quantization and inverse transform on the forward-transformed and quantized first residual data to obtain second residual data;
reconstruct the block to be encoded according to the second residual data.
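The transform, quantization, and reconstruction path can be sketched on a 2x2 block. This is a toy example under stated assumptions: a 2x2 Hadamard transform stands in for the forward transform (practical codecs use larger DCT-like integer transforms), quantization is a uniform division by a hypothetical step `qstep`, and entropy encoding is omitted.

```python
H = [[1, 1], [1, -1]]  # 2x2 Hadamard matrix; H * H = 2 * I


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def encode_residual(current, predicted, qstep):
    """Residual -> forward transform -> quantization (first residual data)."""
    residual = [[c - p for c, p in zip(rc, rp)]
                for rc, rp in zip(current, predicted)]
    coeffs = matmul(matmul(H, residual), H)
    return [[round(v / qstep) for v in row] for row in coeffs]


def reconstruct(levels, predicted, qstep):
    """Inverse quantization -> inverse transform -> add to the prediction."""
    coeffs = [[v * qstep for v in row] for row in levels]
    scaled = matmul(matmul(H, coeffs), H)  # equals 4 * second residual data
    return [[p + round(r / 4) for p, r in zip(rp, rr)]
            for rp, rr in zip(predicted, scaled)]


current = [[10, 12], [9, 11]]     # block to be encoded
predicted = [[8, 12], [10, 10]]   # matching block
levels = encode_residual(current, predicted, qstep=1)
print(reconstruct(levels, predicted, qstep=1))  # prints "[[10, 12], [9, 11]]"
```

With `qstep=1` the round trip is lossless; larger steps discard precision, which is why the encoder reconstructs the block from the second residual data rather than reusing the original pixels as a reference.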
In one implementation, the preset memory includes a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the first region includes a plurality of block rows. The second determination module 402 may be configured to:
determine, from the reconstructed frame images of a plurality of historical frame images stored in the second memory, a plurality of first regions that need to be read repeatedly;
read the image data of the plurality of first regions from the second memory and store it in the first memory.
The reading module 403 may be configured to:
read the image data of the plurality of first regions from the first memory block row by block row;
if the number of reads from the first memory is greater than or equal to a preset count threshold, read the image data of the not-yet-read block rows of the plurality of first regions from the second memory block row by block row.
In one implementation, the third determination module 404 may be configured to:
determine the image data of a plurality of search windows from the read image data of the plurality of first regions, where each search window is located within the corresponding first region;
store the image data of the plurality of search windows in a third memory, where the read-write speed of the third memory is greater than a second preset multiple of the read-write speed of the first memory;
read the image data of the plurality of search windows from the third memory and, according to the image data of the plurality of search windows, determine from each of the plurality of search windows one or more blocks with the smallest encoding cost relative to the block to be encoded;
determine the matching block from the plurality of blocks with the smallest encoding cost relative to the block to be encoded.
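The final selection across several reference frames can be sketched as taking the overall minimum of the per-window results. The tuple layout `(ref_index, cost, x, y)` and the function name are hypothetical; the text does not specify how ties or additional selection criteria are handled.

```python
def pick_matching_block(candidates):
    """candidates: list of (ref_index, cost, x, y), one best block per
    search window. Returns the candidate with the smallest cost."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    return min(candidates, key=lambda c: c[1])


per_window_best = [
    (0, 40, 3, 2),  # best block found in the search window of reference frame 0
    (1, 12, 4, 0),  # best block found in the search window of reference frame 1
    (2, 25, 1, 1),  # best block found in the search window of reference frame 2
]
print(pick_matching_block(per_window_best))  # prints "(1, 12, 4, 0)"
```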
In one implementation, the third determination module 404 may be configured to:
read the image data of the plurality of search windows from the third memory;
downscale the plurality of search windows by a preset number of levels to obtain a plurality of downscaled search windows;
according to the image data of the plurality of downscaled search windows, determine from each of the plurality of downscaled search windows one or more downscaled blocks with the smallest encoding cost relative to the block to be encoded;
according to the positions of the one or more downscaled blocks in the downscaled search windows, determine from each of the plurality of search windows one or more blocks with the smallest encoding cost relative to the block to be encoded.
In one implementation, the first memory includes a system cache or a system buffer memory disposed outside the video encoding apparatus, and the second memory includes a dynamic random access memory disposed outside the video encoding apparatus.
In one implementation, the third memory includes a cache or a buffer disposed inside the video encoding apparatus.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed on a computer, the computer is caused to execute the flow of the method for performing image processing in a video encoding apparatus provided in this embodiment.
An embodiment of the present application further provides an electronic device including a memory, a processor, and a video encoding apparatus. By invoking a computer program stored in the memory, the processor executes the flow of the method for performing image processing in a video encoding apparatus provided in this embodiment.
For example, the electronic device may be a mobile terminal such as a tablet computer or a smartphone. Please refer to FIG. 21, which is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
The electronic device 500 may include components such as a video encoding apparatus 501, a memory 502, and a processor 503. Those skilled in the art will understand that the structure shown in FIG. 21 does not limit the electronic device, which may include more or fewer components than shown, combine certain components, or use a different arrangement of components.
The video encoding apparatus 501 may be used to encode video images so as to compress their content.
The memory 502 may be used to store application programs and data. The application programs stored in the memory 502 contain executable code and may form various functional modules. The processor 503 executes various functional applications and data processing by running the application programs stored in the memory 502.
The processor 503 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the application programs stored in the memory 502 and invoking the data stored in the memory 502, thereby monitoring the electronic device as a whole.
In this embodiment, the processor 503 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and runs the application programs stored in the memory 502 so as to execute:
determining a block to be encoded from a current frame image;
determining, from a reconstructed frame image of a historical frame image, a first region that needs to be read repeatedly, and storing the image data of the first region in a preset memory, where the power consumption of the preset memory is less than a preset power consumption threshold;
reading the image data of the first region from the preset memory;
determining, according to the read image data of the first region, a matching block in the first region that matches the block to be encoded;
encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
Please refer to FIG. 22. The electronic device 500 may include components such as a video encoding apparatus 501, a memory 502, a processor 503, a battery 504, an input unit 505, and an output unit 506.
The video encoding apparatus 501 may be used to encode video images so as to compress their content.
The memory 502 may be used to store application programs and data. The application programs stored in the memory 502 contain executable code and may form various functional modules. The processor 503 executes various functional applications and data processing by running the application programs stored in the memory 502.
The processor 503 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device and processes data by running or executing the application programs stored in the memory 502 and invoking the data stored in the memory 502, thereby monitoring the electronic device as a whole.
The battery 504 may be used to supply power to the components of the electronic device so as to ensure their normal operation.
The input unit 505 may be used to receive an input video stream of video images, for example a video stream that needs to be compressed.
The output unit 506 may be used to output the compressed video stream.
In this embodiment, the processor 503 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 502 according to the following instructions, and runs the application programs stored in the memory 502 so as to execute:
determining a block to be encoded from a current frame image;
determining, from a reconstructed frame image of a historical frame image, a first region that needs to be read repeatedly, and storing the image data of the first region in a preset memory, where the power consumption of the preset memory is less than a preset power consumption threshold;
reading the image data of the first region from the preset memory;
determining, according to the read image data of the first region, a matching block in the first region that matches the block to be encoded;
encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
An embodiment of the present application further provides an image processing system. Please refer to FIG. 23 and FIG. 24. FIG. 23 is a schematic structural diagram of the image processing system provided by an embodiment of the present application, and FIG. 24 is another schematic structural diagram of the image processing system. The image processing system 600 includes a video encoding apparatus 601, a first memory 602, and a second memory 603. The power consumption of the second memory 603 is greater than a first preset multiple of the power consumption of the first memory 602. The video encoding apparatus 601 may include a third memory whose read speed is greater than a second preset multiple of the read speed of the first memory. The first memory 602 and the second memory 603 each store image data, from the reconstructed frame image of a historical frame image, that needs to be read repeatedly. When encoding, the video encoding apparatus 601 reads the repeatedly read image data from the first memory 602 and from the second memory 603 a preset number of times each, determines from it the image data within the search window, and stores the image data within the search window in the third memory.
For example, after the reconstructed frame image of the historical frame image has been stored in the second memory 603, the first region that needs to be read repeatedly can be determined from that reconstructed frame image. The image data of the first region is then read from the second memory 603 and stored in the first memory 602. When encoding, the video encoding apparatus 601 may read the image data of the first region from the first memory 602 block row by block row. If the number of reads from the first memory 602 is greater than or equal to the preset count threshold, the image data of the not-yet-read block rows of the first region is read from the second memory 603 block row by block row.
It should be noted that, when image data is read from the second memory 603, the video encoding apparatus 601 may read the image data directly from the second memory 603; alternatively, the first memory 602 may read the image data from the second memory 603 and store it, after which the video encoding apparatus 601 reads that part of the image data directly from the first memory 602.
The video encoding apparatus 601 may read the image data within the search window from the third memory, determine from the search window, according to that image data, a matching block that matches the block to be encoded, and perform encoding according to the motion vector and the residual between the matching block and the block to be encoded.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a given embodiment, refer to the detailed description above of the method for performing image processing in a video encoding apparatus; details are not repeated here.
The apparatus for performing image processing in a video encoding apparatus provided by the embodiments of the present application is based on the same concept as the method for performing image processing in a video encoding apparatus in the above embodiments. Any of the methods provided in the method embodiments can be run on the apparatus; for the specific implementation process, see the method embodiments, which are not repeated here.
It should be noted that, for the method for performing image processing in a video encoding apparatus described in the embodiments of the present application, those of ordinary skill in the art will understand that all or part of the flow of the method can be completed by a computer program controlling the relevant hardware. The computer program may be stored in a computer-readable storage medium, for example in a memory, and executed by at least one processor; its execution may include the flow of the embodiments of the method for performing image processing in a video encoding apparatus. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
For the apparatus for performing image processing in a video encoding apparatus described in the embodiments of the present application, the functional modules may be integrated in one processing chip, each module may exist physically on its own, or two or more modules may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The method, apparatus, storage medium, electronic device, and system for performing image processing in a video encoding apparatus provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those skilled in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (20)

  1. A method for performing image processing in a video encoding apparatus, wherein the method comprises:
    determining a block to be encoded from a current frame image;
    determining, from a reconstructed frame image of a historical frame image, a first region that needs to be read repeatedly, and storing the image data of the first region in a preset memory, wherein the power consumption of the preset memory is less than a preset power consumption threshold;
    reading the image data of the first region from the preset memory;
    determining, according to the read image data of the first region, a matching block in the first region that matches the block to be encoded;
    encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
  2. The method for performing image processing in a video encoding apparatus according to claim 1, wherein the preset memory comprises a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, the first region comprises a plurality of block rows, and the determining, from a reconstructed frame image of a historical frame image, a first region that needs to be read repeatedly, and storing the image data of the first region in a preset memory comprises:
    determining, from the reconstructed frame image of the historical frame image stored in the second memory, the first region that needs to be read repeatedly;
    reading the image data of the first region from the second memory and storing it in the first memory;
    the reading the image data of the first region from the preset memory comprises:
    reading the image data of the first region from the first memory block row by block row;
    if the number of reads from the first memory is greater than or equal to a preset count threshold, reading the image data of the not-yet-read block rows of the first region from the second memory block row by block row.
  3. The method for performing image processing in a video encoding apparatus according to claim 2, wherein the reading the image data of the first region from the second memory and storing it in the first memory comprises:
    if the first region moves down by one block row in the reconstructed frame image of the historical frame image, reading the image data of the block row into which the first region has moved from the second memory and storing it in the first memory;
    removing from the first memory the block rows that will not be used when encoding the next row of blocks to be encoded.
  4. The method for performing image processing in a video encoding apparatus according to claim 3, wherein the determining, according to the read image data of the first region, a matching block in the first region that matches the block to be encoded comprises:
    determining the image data of a search window from the read image data of the first region, wherein the search window is located within the first region;
    storing the image data of the search window in a third memory, wherein the read-write speed of the third memory is greater than a second preset multiple of the read-write speed of the first memory;
    reading the image data of the search window from the third memory and, according to the image data of the search window, determining from the search window the block with the smallest encoding cost relative to the block to be encoded;
    using the block with the smallest encoding cost relative to the block to be encoded as the matching block.
  5. The method for performing image processing in a video encoding apparatus according to claim 4, wherein the reading the image data of the search window from the third memory and, according to the image data of the search window, determining from the search window the block with the smallest encoding cost relative to the block to be encoded comprises:
    reading the image data of the search window from the third memory;
    downscaling the search window by a preset number of levels to obtain a downscaled search window;
    according to the image data of the downscaled search window, determining from the downscaled search window the downscaled block with the smallest encoding cost relative to the block to be encoded;
    according to the position of the downscaled block in the downscaled search window, determining from the search window the block with the smallest encoding cost relative to the block to be encoded.
  6. The method for performing image processing in a video encoding device according to claim 1, wherein the relative relationship is a motion vector and a residual, and the encoding the block to be encoded according to the relative relationship between the matching block and the block to be encoded comprises:
    encoding the block to be encoded according to the motion vector and the residual between the matching block and the block to be encoded.
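As a rough illustration of the two quantities named in claim 6, the sketch below computes a motion vector as the displacement between block positions and a residual as the pixel-wise difference. The `(row, column)` coordinate convention and the function names are assumptions made for the example.

```python
def motion_vector(block_pos, match_pos):
    """Displacement (dy, dx) from the block to be encoded to its matching block."""
    return (match_pos[0] - block_pos[0], match_pos[1] - block_pos[1])


def residual(block, match):
    """Pixel-wise difference: block to be encoded minus matching block."""
    return [[b - m for b, m in zip(row_b, row_m)] for row_b, row_m in zip(block, match)]


# Toy data: positions are (row, column) of each block's top-left corner.
mv = motion_vector(block_pos=(8, 8), match_pos=(6, 11))
res = residual([[10, 12], [14, 16]], [[9, 12], [15, 13]])
```

A good match yields a residual with small values, which is what makes the subsequent transform and quantization effective.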
  7. The method for performing image processing in a video encoding device according to claim 6, wherein the encoding the block to be encoded according to the motion vector and the residual between the matching block and the block to be encoded comprises:
    performing forward transform and quantization on the residual between the matching block and the block to be encoded;
    entropy-encoding the motion vector between the matching block and the block to be encoded together with the forward-transformed and quantized first residual data to obtain encoded video stream data; or
    performing inverse quantization and inverse transform on the forward-transformed and quantized first residual data to obtain second residual data;
    reconstructing the block to be encoded according to the second residual data.
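The transform, quantization and reconstruction path of claim 7 can be sketched on a 2x2 residual. The claims do not say which transform or quantizer is used, so the 2x2 Hadamard transform and the uniform quantizer with step 4 below are stand-ins chosen only to keep the arithmetic easy to follow; entropy coding is left out, and all names are hypothetical.

```python
def hadamard_2x2(block):
    """2x2 Hadamard transform (assumed stand-in for the codec's real transform)."""
    (a, b), (c, d) = block
    return [[a + b + c + d, a - b + c - d],
            [a + b - c - d, a - b - c + d]]


def quantize(coeffs, step):
    """Uniform quantization: keep only the nearest multiple of `step`."""
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(levels, step):
    """Inverse quantization: scale the levels back to coefficient range."""
    return [[v * step for v in row] for row in levels]


def inverse_hadamard_2x2(coeffs):
    """Applying the same transform again and dividing by 4 inverts it."""
    return [[round(v / 4) for v in row] for row in hadamard_2x2(coeffs)]


def reconstruct(match, second_residual):
    """Reconstructed block = matching block + decoded residual."""
    return [[m + r for m, r in zip(row_m, row_r)] for row_m, row_r in zip(match, second_residual)]


match_block = [[100, 100], [100, 100]]
residual = [[5, 3], [1, 0]]  # block to be encoded minus matching block
step = 4

first_residual = quantize(hadamard_2x2(residual), step)
second_residual = inverse_hadamard_2x2(dequantize(first_residual, step))
reconstructed = reconstruct(match_block, second_residual)
```

Here the second residual differs from the original residual in one pixel (`-1` instead of `0`), which shows why the encoder reconstructs from the quantized data: the reference it keeps must match what a decoder would obtain.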
  8. The method for performing image processing in a video encoding device according to claim 1, wherein the preset memory comprises a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, the first region comprises a plurality of block rows, and the determining, from a reconstructed frame image of a historical frame image, a first region that needs to be read repeatedly, and storing the image data of the first region in a preset memory, comprises:
    determining, from reconstructed frame images of a plurality of historical frame images stored in the second memory, a plurality of first regions that need to be read repeatedly;
    reading the image data of the plurality of first regions from the second memory and storing it in the first memory;
    the reading the image data of the first region from the preset memory comprises:
    reading the image data of the plurality of first regions from the first memory block row by block row;
    if the number of reads from the first memory is greater than or equal to a preset count threshold, reading, block row by block row from the second memory, the image data of the block rows of the plurality of first regions that have not yet been read.
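The read policy in the last two steps of claim 8 can be sketched as a counter with a fallback. The dictionaries standing in for the two memories, the string payloads and the name `read_block_rows` are all hypothetical; how a real device counts reads is not specified in the claim.

```python
def read_block_rows(first_memory, second_memory, row_ids, max_first_reads):
    """Read block rows in order, switching to the second memory once the
    number of reads from the first memory reaches `max_first_reads`."""
    reads_from_first = 0
    result = []
    for row_id in row_ids:
        if reads_from_first < max_first_reads:
            result.append((row_id, "first", first_memory[row_id]))
            reads_from_first += 1
        else:
            # Threshold reached: the remaining unread block rows come from the second memory.
            result.append((row_id, "second", second_memory[row_id]))
    return result


# Toy data: the second memory (e.g. DRAM) holds every block row; the first memory holds copies.
second_memory = {r: f"block-row-{r}" for r in range(4)}
first_memory = dict(second_memory)

reads = read_block_rows(first_memory, second_memory, row_ids=[0, 1, 2, 3], max_first_reads=3)
sources = [source for _, source, _ in reads]
```

With a threshold of 3, the first three block rows are served by the first memory and the fourth by the second memory.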
  9. The method for performing image processing in a video encoding device according to claim 8, wherein the determining, from the first region and according to the read image data of the first region, a matching block that matches the block to be encoded comprises:
    determining image data of a plurality of search windows from the read image data of the plurality of first regions, each search window being located within the corresponding first region;
    storing the image data of the plurality of search windows in a third memory, the read/write speed of the third memory being greater than a second preset multiple of the read/write speed of the first memory;
    reading the image data of the plurality of search windows from the third memory, and determining, from each of the plurality of search windows and according to the image data of the plurality of search windows, one or more blocks having the lowest encoding cost with respect to the block to be encoded;
    determining the matching block from among the plurality of blocks having the lowest encoding cost with respect to the block to be encoded.
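The final selection step of claim 9 can be illustrated as picking the overall cheapest candidate among the per-window winners. The claim does not state the selection criterion, so choosing the minimum cost is an assumption, as are the window labels and the cost values in the toy data.

```python
def pick_matching_block(per_window_best):
    """Given {window_id: (cost, position)}, return the overall cheapest candidate."""
    window_id, (cost, position) = min(per_window_best.items(), key=lambda item: item[1][0])
    return window_id, cost, position


# Toy data: one best candidate per search window (one window per reference frame).
candidates = {
    "ref0": (42, (3, 1)),
    "ref1": (17, (0, 2)),
    "ref2": (25, (5, 5)),
}
chosen = pick_matching_block(candidates)
```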
  10. The method for performing image processing in a video encoding device according to claim 9, wherein the reading the image data of the plurality of search windows from the third memory, and determining, from each of the plurality of search windows and according to the image data of the plurality of search windows, one or more blocks having the lowest encoding cost with respect to the block to be encoded, comprises:
    reading the image data of the plurality of search windows from the third memory;
    downscaling the plurality of search windows according to a preset number of levels to obtain a plurality of downscaled search windows;
    determining, from each of the plurality of downscaled search windows and according to the image data of the plurality of downscaled search windows, one or more downscaled blocks having the lowest encoding cost with respect to the block to be encoded;
    determining, from each of the plurality of search windows and according to the positions of the one or more downscaled blocks in the downscaled search windows, one or more blocks having the lowest encoding cost with respect to the block to be encoded.
  11. The method for performing image processing in a video encoding device according to claim 2, wherein the first memory comprises a system cache or a system buffer memory arranged outside the video encoding device, and the second memory comprises a dynamic random access memory arranged outside the video encoding device.
  12. The method for performing image processing in a video encoding device according to claim 4, wherein the third memory comprises a cache or a buffer arranged inside the video encoding device.
  13. A device for performing image processing in a video encoding device, wherein the device comprises:
    a first determining module, configured to determine a block to be encoded from a current frame image;
    a second determining module, configured to determine, from a reconstructed frame image of a historical frame image, a first region that needs to be read repeatedly, and to store the image data of the first region in a preset memory, the power consumption of the preset memory being less than a preset power consumption threshold;
    a reading module, configured to read the image data of the first region from the preset memory;
    a third determining module, configured to determine, from the first region and according to the read image data of the first region, a matching block that matches the block to be encoded;
    an encoding module, configured to encode the block to be encoded according to the relative relationship between the matching block and the block to be encoded.
  14. The device for performing image processing in a video encoding device according to claim 13, wherein the preset memory comprises a first memory and a second memory, the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory, and the first region comprises a plurality of block rows; the second determining module is further configured to determine, from the reconstructed frame image of the historical frame image stored in the second memory, the first region that needs to be read repeatedly;
    and to read the image data of the first region from the second memory and store it in the first memory;
    the reading module is further configured to read the image data of the first region from the first memory block row by block row;
    and, if the number of reads from the first memory is greater than or equal to a preset count threshold, to read, block row by block row from the second memory, the image data of the block rows of the first region that have not yet been read.
  15. The device for performing image processing in a video encoding device according to claim 14, wherein the reading module is further configured to: if the first region moves down by one block row in the reconstructed frame image of the historical frame image, read the image data of the moved-down block row from the second memory and store it in the first memory;
    and remove from the first memory the block rows that are not needed when encoding the next block row to be encoded.
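The update described in claim 15 behaves like a sliding window of block rows held in the first memory. The sketch below assumes that the block row no longer needed is always the topmost one and that exactly one new block row is fetched per move; the class `BlockRowCache` and the dictionary standing in for the second memory are hypothetical.

```python
class BlockRowCache:
    """Sliding window of block rows kept in the (low-power) first memory."""

    def __init__(self, second_memory, top_row, num_rows):
        self.second_memory = second_memory
        self.top_row = top_row
        self.num_rows = num_rows
        # Initial fill: copy the first region's block rows from the second memory.
        self.rows = {r: second_memory[r] for r in range(top_row, top_row + num_rows)}

    def move_down(self):
        """Shift the first region down by one block row."""
        new_row = self.top_row + self.num_rows
        # Fetch only the newly covered block row from the second memory.
        self.rows[new_row] = self.second_memory[new_row]
        # Drop the block row that the next block row to be encoded no longer needs
        # (assumed here to be the topmost one).
        del self.rows[self.top_row]
        self.top_row += 1


# Toy data: six block rows of a reconstructed frame held in the second memory.
second_memory = {r: f"block-row-{r}" for r in range(6)}
cache = BlockRowCache(second_memory, top_row=0, num_rows=3)
cache.move_down()
cached_rows = sorted(cache.rows)
```

Only one block row crosses from the second memory to the first memory per move, so the overlapping rows of consecutive first regions are not fetched again.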
  16. The device for performing image processing in a video encoding device according to claim 15, wherein the third determining module is further configured to determine image data of a search window from the read image data of the first region, the search window being located within the first region;
    store the image data of the search window in a third memory, the read/write speed of the third memory being greater than a second preset multiple of the read/write speed of the first memory;
    read the image data of the search window from the third memory, and determine, from the search window and according to the image data of the search window, the block having the lowest encoding cost with respect to the block to be encoded;
    and use the block having the lowest encoding cost with respect to the block to be encoded as the matching block.
  17. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed on a computer, causes the computer to perform the method according to any one of claims 1 to 12.
  18. An electronic device, comprising a memory, a processor and a video encoding device, wherein the processor performs the method according to any one of claims 1 to 12 by invoking a computer program stored in the memory.
  19. An image processing system, comprising a video encoding device, a first memory and a second memory, wherein the power consumption of the second memory is greater than a first preset multiple of the power consumption of the first memory; the video encoding device comprises a third memory, the read speed of the third memory being greater than a second preset multiple of the read speed of the first memory; the first memory and the second memory each store image data, from a reconstructed frame image of a historical frame image, that is read repeatedly; during encoding, the video encoding device reads the repeatedly read image data from the first memory and the second memory respectively according to a preset number of times, determines from it the image data within a search window, and stores the image data within the search window in the third memory; and the video encoding device reads the image data within the search window from the third memory, determines a matching block that matches a block to be encoded, and performs encoding according to the motion vector and the residual between the matching block and the block to be encoded.
  20. The image processing system according to claim 19, wherein a first region that needs to be read repeatedly is determined from the reconstructed frame image of the historical frame image stored in the second memory, and the image data of the first region is read from the second memory and stored in the first memory; during encoding, the video encoding device reads the image data of the first region from the first memory block row by block row, and, if the number of reads from the first memory is greater than or equal to a preset count threshold, reads, block row by block row from the second memory, the image data of the block rows of the first region that have not yet been read.
PCT/CN2022/074533 2021-04-01 2022-01-28 Method and device for performing image processing in a video encoding device, and system WO2022206166A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110358171.1 2021-04-01
CN202110358171.1A CN115190307A (en) 2021-04-01 2021-04-01 Method, device and system for processing image in video coding device

Publications (1)

Publication Number Publication Date
WO2022206166A1 true WO2022206166A1 (en) 2022-10-06

Family

ID=83455572

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/074533 WO2022206166A1 (en) 2021-04-01 2022-01-28 Method and device for performing image processing in a video encoding device, and system

Country Status (2)

Country Link
CN (1) CN115190307A (en)
WO (1) WO2022206166A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006270683A (en) * 2005-03-25 2006-10-05 Sanyo Electric Co Ltd Coding device and method
US20150172706A1 (en) * 2013-12-17 2015-06-18 Megachips Corporation Image processor
CN104935933A (en) * 2015-06-05 2015-09-23 广东中星电子有限公司 Video coding and decoding method
CN110800301A (en) * 2018-09-30 2020-02-14 深圳市大疆创新科技有限公司 Control method and device of coding equipment and storage medium

Also Published As

Publication number Publication date
CN115190307A (en) 2022-10-14

Similar Documents

Publication Publication Date Title
US8395634B2 (en) Method and apparatus for processing information
US9380312B2 (en) Encoding blocks in video frames containing text using histograms of gradients
JP7492051B2 (en) Chroma block prediction method and apparatus
JP2016508327A (en) Content adaptive bitrate and quality management using frame hierarchy responsive quantization for highly efficient next generation video coding
US10757440B2 (en) Motion vector prediction using co-located prediction units
JP2014123830A (en) Moving image compression/expansion device
CN112333446B (en) Intra-frame block copy reference block compression method
US20220279224A1 (en) Systems and methods for video processing
JP5260757B2 (en) Moving picture encoding method, moving picture decoding method, moving picture encoding apparatus, and moving picture decoding apparatus
WO2022206217A1 (en) Method and apparatus for performing image processing in video encoder, and medium and system
WO2020143585A1 (en) Video encoder, video decoder, and corresponding method
TWI816684B (en) Video encoding device and encoder
JP4675383B2 (en) Image decoding apparatus and method, and image encoding apparatus
CN116233453B (en) Video coding method and device
WO2022206166A1 (en) Method and device for performing image processing in a video encoding device, and system
US20110249959A1 (en) Video storing method and device based on variable bit allocation and related video encoding and decoding apparatuses
US9363524B2 (en) Method and apparatus for motion compensation reference data caching
CN113767626B (en) Video enhancement method and device
US7764845B2 (en) Signal processing method and device and video system
US20230022526A1 (en) Video processing method and apparatus, device, and storage medium
TWI418219B (en) Data-mapping method and cache system for use in a motion compensation system
JP5580541B2 (en) Image decoding apparatus and image decoding method
JP5053774B2 (en) Video encoding device
KR100891116B1 (en) Apparatus and method for bandwidth aware motion compensation
WO2022206199A1 (en) Method and apparatus for performing image processing in video decoding apparatus, and system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22778354

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22778354

Country of ref document: EP

Kind code of ref document: A1