CN115190302A - Method, device and system for processing image in video decoding device

Info

Publication number
CN115190302A
Authority
CN
China
Prior art keywords
decoded
block
memory
blocks
power consumption
Legal status
Pending
Application number
CN202110357601.8A
Other languages
Chinese (zh)
Inventor
赵娟萍
Current Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Guangdong Oppo Mobile Telecommunications Corp Ltd
Application filed by Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority to CN202110357601.8A
Priority to PCT/CN2022/076367 (WO2022206199A1)
Publication of CN115190302A

Classifications

    • H04N19/139: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding; analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H04N19/156: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding; availability of hardware or computational resources, e.g. encoding based on power-saving criteria
    • H04N19/423: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression; characterised by memory arrangements

Abstract

The application discloses a method, an apparatus, a storage medium, an electronic device, and a system for processing images in a video decoding apparatus. The method includes: acquiring a video code stream; determining one or more reference positions from the video code stream; determining the reference times of the one or more reference positions; determining, according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions, and storing it in the preset memory, where the power consumption generated by storing the reference position in the preset memory and reading it from the preset memory is less than or equal to the preset power consumption threshold; and decoding an object to be decoded according to the reference position. The power consumption of the video decoding apparatus can thereby be reduced.

Description

Method, device and system for processing image in video decoding device
Technical Field
The present application relates to the field of electronic devices, and in particular, to a method, an apparatus, a storage medium, an electronic device, and a system for processing an image in a video decoding apparatus.
Background
With the continuous development of technology, video decoding apparatuses have become increasingly capable. A video decoding apparatus can decode video images, and decoding one frame of a video usually requires referring to the data of several already decoded video images. In the related art, however, reading the data of the decoded video images that need to be referenced causes the video decoding apparatus to consume a large amount of power.
Disclosure of Invention
Embodiments of the present application provide a method, an apparatus, a storage medium, and an electronic device for performing image processing in a video decoding apparatus, which can reduce power consumption of the video decoding apparatus.
In a first aspect, an embodiment of the present application provides a method for image processing in a video decoding apparatus, where the method includes:
acquiring a video code stream;
determining one or more reference positions from the video code stream;
determining the reference times of the one or more reference positions;
determining, according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions, and storing the reference position in the preset memory, where the power consumption generated by storing the reference position in the preset memory and reading the reference position from the preset memory is less than or equal to the preset power consumption threshold; and
decoding an object to be decoded according to the reference position.
In a second aspect, an embodiment of the present application provides a method for image processing in a video decoding device, where the method includes:
acquiring a video code stream;
acquiring one or more reference Motion Vectors (MVs) according to the video code stream;
acquiring one or more corresponding reference blocks from one or more image frames of the video code stream according to the one or more reference motion vectors;
determining the reference times of the one or more reference blocks;
determining, according to a preset power consumption threshold and the reference times of the one or more reference blocks, one or more reference blocks that need to be stored in a preset memory from the one or more reference blocks, and storing them in the preset memory, where the power consumption generated by storing the reference blocks in the preset memory and reading the reference blocks from the preset memory is less than or equal to the preset power consumption threshold; and
decoding a block to be decoded or sub-blocks of the block to be decoded according to the reference blocks.
In a third aspect, an embodiment of the present application provides an apparatus for performing image processing in a video decoding apparatus, the apparatus including:
the acquisition module is used for acquiring a video code stream;
the first determining module is used for determining one or more reference positions from the video code stream;
a second determination module for determining a reference number of times of the one or more reference positions;
a third determining module, configured to determine, according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions, and to store the reference position in the preset memory, where the power consumption generated by storing the reference position in the preset memory and reading the reference position from the preset memory is less than or equal to the preset power consumption threshold; and
a decoding module, configured to decode the object to be decoded according to the reference position.
In a fourth aspect, an embodiment of the present application provides an image processing apparatus, including:
the first acquisition module is used for acquiring a video code stream;
the second acquisition module is used for acquiring one or more reference motion vectors according to the video code stream;
a third obtaining module, configured to obtain one or more corresponding reference blocks from one or more image frames of the video bitstream according to the one or more reference motion vectors;
a first determining module for determining a reference number of times of the one or more reference blocks;
a second determining module, configured to determine, according to a preset power consumption threshold and the reference times of the one or more reference blocks, one or more reference blocks that need to be stored in a preset memory from the one or more reference blocks, and to store them in the preset memory, where the power consumption generated by storing the reference blocks in the preset memory and reading the reference blocks from the preset memory is less than or equal to the preset power consumption threshold; and
a decoding module, configured to decode the block to be decoded or sub-blocks of the block to be decoded according to the reference blocks.
In a fifth aspect, an embodiment of the present application provides a storage medium having a computer program stored thereon, where the computer program, when run on a computer, causes the computer to execute the method for image processing in a video decoding apparatus provided in the embodiments of the present application.
In a sixth aspect, an embodiment of the present application further provides an electronic device, which includes a memory, a processor, and a video decoding apparatus, where the processor is configured to execute the method for processing an image in the video decoding apparatus, provided by the embodiment of the present application, by calling a computer program stored in the memory.
In a seventh aspect, an embodiment of the present application further provides an image processing system including a video decoding apparatus, a first memory, and a second memory, where the power consumption of the first memory is greater than that of the second memory. The first memory stores reference positions whose reference times are one, or stores reference positions whose reference times are one together with reference positions whose reference times are multiple; the second memory stores reference positions whose reference times are multiple. When decoding, the video decoding apparatus reads the reference positions whose reference times are one from the first memory, reads the reference positions whose reference times are multiple from the second memory, and decodes the object to be decoded according to the reference positions.
In the embodiment of the application, the video decoding apparatus may obtain the video code stream and determine one or more reference positions from it. The reference times of the one or more reference positions are then determined, and according to the preset power consumption threshold and those reference times, the reference position that needs to be stored in the preset memory is determined from the one or more reference positions and stored in the preset memory, where the power consumption generated by storing the reference position in the preset memory and reading it from the preset memory is less than or equal to the preset power consumption threshold. The object to be decoded is then decoded according to the reference position. That is, in the embodiment of the present application, the image data of the reference positions that need to be stored in the preset memory is kept in a preset memory with lower power consumption, thereby reducing the power consumption of the video decoding apparatus.
Drawings
The technical solutions and advantages of the present application will be apparent from the following detailed description of specific embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a first flowchart illustrating a method for image processing in a video decoding apparatus according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a video decoding system in the related art.
Fig. 3 is a schematic diagram of data storage in a video decoding apparatus in the related art.
Fig. 4 is a diagram illustrating a related-art approach of increasing the number of channels of a Dynamic Random Access Memory (DRAM) to perform data access.
Fig. 5 is a diagram illustrating a power consumption curve when data is read from and written to a multi-channel DRAM in video decoding according to the related art.
Fig. 6 is a schematic view of a scene of a reference relationship between image frames in an image group in a current video bitstream according to an embodiment of the present application.
Fig. 7 is a second flowchart illustrating a method for image processing in a video decoding apparatus according to an embodiment of the present application.
Fig. 8 is a schematic diagram comparing the energy consumed by a Static Random Access Memory (SRAM) and a Dynamic Random Access Memory (DRAM) according to an embodiment of the present disclosure.
Fig. 9 is a schematic view of a scenario in which image data to be referenced multiple times is stored in a system cache (Sys$) or a system buffer memory (SysBuf) according to an embodiment of the present application.
Fig. 10 is a scene schematic diagram for roughly analyzing reference relationships of a plurality of image frames in a video code stream according to an embodiment of the present application.
Fig. 11 is an architectural diagram of a video decoding system using a system cache according to an embodiment of the present application.
Fig. 12 is a schematic diagram of another architecture of a video decoding system using a system cache according to an embodiment of the present application.
Fig. 13 is a schematic diagram of an architecture of a video decoding system using a system buffer memory according to an embodiment of the present application.
Fig. 14 is a schematic diagram of power consumption curves when data is read from or written to the Sys$ or SysBuf provided in the embodiment of the present application.
Fig. 15 is a schematic flowchart of a third method for image processing in a video decoding apparatus according to an embodiment of the present application.
Fig. 16 is a schematic view of a scene of reference relationships between blocks in image frames of an image group in a video bitstream according to an embodiment of the present application.
Fig. 17 is a fourth flowchart illustrating a method for image processing in a video decoding apparatus according to an embodiment of the present application.
Fig. 18 is a schematic view of a scene for fine analyzing reference relationships of blocks in multiple image frames in a video bitstream according to an embodiment of the present application.
Fig. 19 is a schematic view of a scene decoded by the video decoding apparatus according to the embodiment of the present application.
Fig. 20 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
Fig. 21 is another schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
Fig. 22 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Fig. 23 is another schematic structural diagram of an electronic device provided in an embodiment of the present application.
Fig. 24 is a schematic structural diagram of an image processing system according to an embodiment of the present application.
Fig. 25 is another schematic structural diagram of an image processing system according to an embodiment of the present application.
Detailed Description
Reference is now made to the drawings, in which like reference numerals refer to like elements throughout, and in which the described principles are illustrated as being implemented in a suitable computing environment. The following description is based on the illustrated embodiments of the application and should not be taken as limiting the application with respect to other embodiments that are not detailed herein.
Referring to fig. 1, fig. 1 is a first flowchart illustrating a method for image processing in a video decoding apparatus according to an embodiment of the present disclosure. The method for processing the image in the video decoding device can be applied to the video decoding device. The flow of the method for image processing in the video decoding device may include:
101. A video code stream is acquired.
With the continuous development of technology, video decoding apparatuses have become increasingly capable. A video decoding apparatus can decode video images, and decoding one frame of a video usually requires referring to the data of several already decoded video images. In the related art, however, reading the data of the decoded video images that need to be referenced causes the video decoding apparatus to consume a large amount of power.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a video decoding system in the related art. In this video decoding system, a Central Processing Unit (CPU), the video decoding apparatus, and a Display Processor (DISP) read and write data from a DRAM through a bus and a Dynamic Random Access Memory Controller (DRAMC). The central processing unit, the video decoding apparatus, and the display processor share the bandwidth in a time-sharing manner, and the priority of the central processing unit and the display processor is higher than that of the video decoding apparatus. It should be noted that, according to specific requirements, the video decoding system may or may not be provided with a display processor. The video decoding apparatus needs Motion Compensation (MC) for decoding, which occupies a large bandwidth.
Cost is a major consideration for the video decoding apparatus, so DRAM is usually used as the main storage space for frame buffering to achieve the lowest cost and the highest production yield. Referring to fig. 3, fig. 3 is a schematic diagram illustrating data storage in a video decoding apparatus in the related art. Bitstreams, image frames to be buffered, and temporal data are stored in the DRAM of the video decoding apparatus; the temporal data may be Temporal Motion Vectors (TMVs) and other data. However, the bandwidth provided by the DRAM is limited.
Although the video decoding apparatus may employ its own internal cache policy for various video bitstreams (for example, bitstreams conforming to the Moving Picture Experts Group standards MPEG-1, MPEG-2 and MPEG-4, Essential Video Coding (MPEG-5/EVC), the low-bit-rate video coding standard H.263 established by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), and other video coding standards),
new video standards such as H.265/HEVC, H.266/VVC, AV1 and MPEG-5 target larger and larger picture sizes and higher frame rates. For this reason, increasing the bandwidth of the DRAM or increasing its operating frequency is often used to accelerate data throughput.
Referring to fig. 4, fig. 4 is a diagram illustrating data access by increasing the number of channels of the DRAM in the related art. Increasing the number of DRAM channels raises the bandwidth, and raising the frequency increases the data throughput speed of the DRAM, but both result in larger power consumption.
For example, to meet the decoding speed required by the video decoding apparatus, the bandwidth of the system DRAM consumes a large amount of energy. It is very important to maintain the highest efficiency regardless of whether the video decoding apparatus is performing an on-demand operation or a non-on-demand operation. With the related-art approach, the DRAM consumes a great deal of power in order for the video decoding apparatus to complete decoding within the desired time.
Referring to fig. 5, fig. 5 is a diagram illustrating a power consumption curve when reading and writing data from and to a multi-channel DRAM during video decoding in the related art. In fig. 5, the abscissa is the position of the reference image frame, for example the top, middle, or bottom of the reference image frame, and the ordinate is the power consumed by reading and writing data during video decoding. During video decoding, data must move in and out of the DRAM; when the video decoding apparatus relies too heavily on the DRAM (or other inexpensive but power-hungry memories) and on high bandwidth, the power consumption ceiling of the video decoding system becomes a constraint, so that the video decoding apparatus cannot meet the decoding speed requirement or the video decoding system overheats. If the upper limit of power consumption is respected, the speed of reading and writing data from the DRAM is limited and cannot reach the speed that would be achievable if the power limit were ignored.
In the embodiment of the present application, a video code stream is obtained, where the video code stream may include one or more Groups of Pictures (GOPs), and one group of pictures includes multiple image frames. The embodiment of the present application takes a video code stream including one image group as an example. The obtained video code stream is an encoded video code stream; the image frames in it may not have been decoded yet, or some image frames may already be decoded while others are waiting to be decoded. It should be noted that an already decoded image frame may be used as a reference image frame when decoding other subsequent image frames.
102. One or more reference locations are determined from the video bitstream.
For example, in the embodiment of the present application, decoding an image frame to be decoded generally requires referring to a reference position. In one implementation, the reference position may include a reference image frame, a reference slice, or a reference region; that is, the image frame to be decoded needs to refer to an already decoded image frame, slice, or region. The image frame to be decoded is an image frame that needs to be decoded. It should be noted that a slice contains part or all of the data of an image frame; in other words, an image frame may be encoded as one or more slices, and a slice contains at least one block and at most the data of the entire image frame.
In different coding implementations, the number of slices into which the same image frame is divided is not necessarily the same. For example, slices are designed in H.264 mainly to prevent the spread of bit errors, because decoding operations are independent between different slices: the data referenced when decoding a slice cannot cross the boundary of that slice.
In the embodiment of the present application, the video code stream may be analyzed by software or hardware of the video decoding apparatus, and the one or more reference positions may be roughly determined from the video code stream. For example, one or more reference image frames, reference slices, or reference regions are determined from the video code stream, and the determined reference image frames, reference slices, or reference regions form a reference queue, so that they can be referenced when decoding the image frame to be decoded.
For example, in one embodiment, the reference region may be a region in an image frame that needs to be referenced by an image to be decoded. For another example, in another embodiment, the reference region may be a region in a slice, which needs to be referenced by the image to be decoded, and so on.
103. The reference times of the one or more reference positions are determined.
For example, when one or more reference positions are determined, the number of times of reference of the one or more reference positions may be determined.
For example, taking a reference image frame as an example, please refer to fig. 6, where fig. 6 is a scene schematic diagram of a reference relationship between image frames in an image group in a video stream according to an embodiment of the present application. Fig. 6 illustrates an example of an image group including 9 image frames. In other embodiments, the number of image frames included in an image group may be adjusted according to specific needs. The display order of the image frames may be the same as or different from the decoding order within an image group. The display order and decoding order of the image frames in the image group as shown in fig. 6 are different.
The direction of the arrows in fig. 6 indicates how many times each image frame is referred to by the other image frames in the image group. When a certain image frame is referred to one or more times by other image frames in the image group, that image frame may be used as a reference image frame. For example, the reference times of the I frame in fig. 6 are four, the reference times of the B frame in display order 1 are two, the reference times of the B frame in display order 3 are one, the reference times of the P frame in display order 4 are five, the reference times of the B frame in display order 6 are two, and so on. Fig. 6 represents only a rough frame-level analysis performed by the relevant hardware or software of the video decoding apparatus; such a rough analysis cannot reveal the reference relationship of each specific block within a reference image frame.
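To illustrate the frame-level counting described above, the following is a minimal sketch (not part of the original disclosure) that tallies reference times from a hypothetical list of reference edges, i.e. pairs of the form (referencing frame, referenced frame), assumed to have been extracted from the code stream headers; the structure and function names are illustrative only.

```cpp
#include <cstdio>
#include <map>
#include <vector>

// One reference edge obtained from the rough frame-level analysis:
// frame `from` (in decoding order) references frame `to`.
struct RefEdge { int from; int to; };

// Count how many times each frame of the group of pictures is referenced
// by the other frames, i.e. the "reference times" used later to decide
// which frames are worth keeping in low-power memory.
std::map<int, int> CountReferenceTimes(const std::vector<RefEdge>& edges) {
    std::map<int, int> times;           // frame id -> reference times
    for (const RefEdge& e : edges) {
        ++times[e.to];                  // every incoming arrow is one reference
    }
    return times;
}

int main() {
    // Hypothetical edges loosely following the arrows of a GOP like Fig. 6.
    std::vector<RefEdge> edges = {
        {1, 0}, {2, 0}, {4, 0}, {8, 0},          // I frame referenced four times
        {2, 1}, {3, 1},                          // a B frame referenced twice
        {5, 4}, {6, 4}, {7, 4}, {8, 4}, {3, 4},  // P frame referenced five times
    };
    for (const auto& [frame, n] : CountReferenceTimes(edges)) {
        std::printf("frame %d is referenced %d time(s)\n", frame, n);
    }
    return 0;
}
```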
104. According to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory is determined from the one or more reference positions and stored in the preset memory, where the power consumption generated by storing the reference position in the preset memory and reading it from the preset memory is less than or equal to the preset power consumption threshold.
For example, it can be seen from fig. 6 that the I frame in display order 0, the B frame in display order 2, the P frame in display order 4, the B frame in display order 6, and the B frame in display order 8 are each referred to multiple times by other image frames in the image group.
However, in video decoding the referenced portion moves in and out of the DRAM many times, i.e., it is read from the DRAM many times, and when one copy of data is read repeatedly, the energy consumed grows in proportion to the number of reads: reading it hundreds of times costs hundreds of times the energy. To improve the compression rate, current video standards commonly encode with multiple reference image frames, which means that some data are used repeatedly, usually because of high correlation in the time domain. In addition, as video code streams with high frame rates become more common, repetition in the time domain increases, the same reference image frame can be reused repeatedly during coding, and most video encoders generate such code streams. If the portion that is read repeatedly during decoding is stored in a storage medium with low power consumption, the energy consumed during video playback can be greatly reduced.
For example, in the embodiment of the present application, a reference position that needs to be stored in a preset memory may be determined from the one or more reference positions; for example, a reference image frame, reference stripe, or reference region that is referenced multiple times is stored in the preset memory so that its image data can be read when an object to be decoded (for example, an image frame to be decoded, a stripe to be decoded, or a region to be decoded) is subsequently decoded. In addition, the power consumption generated by storing the reference position in the preset memory and reading it from the preset memory is less than or equal to the preset power consumption threshold; reading and writing this data with a low-power preset memory reduces the power consumption of the video decoding apparatus.
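As a hedged sketch of the selection rule in step 104, and not a definitive implementation, the following assumes per-byte write and read energies for the preset low-power memory and keeps a candidate reference position there only if storing it once and reading it the expected number of times stays within the preset power consumption threshold; all constants and names are invented placeholders.

```cpp
#include <cstddef>
#include <vector>

// A candidate reference position (frame, stripe or region) found in the stream.
struct RefPosition {
    std::size_t sizeBytes;   // amount of image data at this position
    int refTimes;            // how many times it will be referenced
    bool keepInPresetMemory; // decision output
};

// Illustrative energy model (nJ per byte); real figures would come from
// measurements of the actual preset memory, as the description notes.
constexpr double kPresetWriteEnergyPerByte = 0.05;
constexpr double kPresetReadEnergyPerByte  = 0.05;

// Mark the positions whose store-once plus read-refTimes energy in the preset
// memory stays at or below the preset power consumption threshold.
void SelectForPresetMemory(std::vector<RefPosition>& positions,
                           double presetEnergyThreshold) {
    for (RefPosition& p : positions) {
        const double energy =
            p.sizeBytes * kPresetWriteEnergyPerByte +             // one store
            p.sizeBytes * kPresetReadEnergyPerByte * p.refTimes;  // repeated reads
        p.keepInPresetMemory = (p.refTimes > 1) && (energy <= presetEnergyThreshold);
    }
}
```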
105. The object to be decoded is decoded according to the reference position.
For example, a reference image frame, reference stripe, or reference region stored in the preset memory may be referred to multiple times by the object to be decoded, and may therefore be read from the preset memory multiple times during decoding. After the image data of the reference image frame, reference stripe, or reference region is read from the preset memory, the object to be decoded can refer to it; that is, the object to be decoded is decoded according to the read image data of the reference image frame, reference stripe, or reference region.
For example, in one embodiment, the object to be decoded may include a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded. The block to be decoded in the image frame to be decoded may be decoded according to the read image data of the reference image frame, the block to be decoded in the stripe to be decoded may be decoded according to the read image data of the reference stripe, or the block to be decoded in the area to be decoded may be decoded according to the read image data of the reference area.
It can be understood that, in the embodiment of the present application, the video decoding apparatus may obtain the video code stream and determine one or more reference positions from it. The reference times of the one or more reference positions are then determined, and according to the preset power consumption threshold and those reference times, the reference position that needs to be stored in the preset memory is determined from the one or more reference positions and stored in the preset memory, where the power consumption generated by storing the reference position in the preset memory and reading it from the preset memory is less than or equal to the preset power consumption threshold. The object to be decoded is then decoded according to the reference position. That is, the image data of the reference positions that need to be stored in the preset memory is kept in a preset memory with lower power consumption, thereby reducing the power consumption of the video decoding apparatus.
Referring to fig. 7, fig. 7 is a second flowchart illustrating a method for image processing in a video decoding apparatus according to an embodiment of the present disclosure. The method for processing the image in the video decoding device can be applied to the video decoding device. The flow of the method for image processing in the video decoding device may include:
201. A video code stream is acquired.
The specific implementation of step 201 can refer to the embodiment of step 101, and is not described herein again.
202. One or more reference image frames, reference stripes or reference regions are determined from the video code stream according to frame header information of the image frames in the video code stream or stripe header (Slice header) information of one or more stripes in the image frames.
For example, the reference relationship of each image frame may be determined according to frame header information of each image frame in the video code stream or slice header information of a slice in the image frame. The data of each image frame may be regarded as a Network Abstraction Layer (NAL) unit, the frame header information is used to identify the start of an image frame, and the frame header information may also be regarded as NAL unit header information, which image frame can be determined by the frame header information, so that the reference image frame can be determined. The slice header is used to store the overall information of the slice, such as the type of the current slice, and the slice header information can determine which slice is, and thus can determine the reference slice. The reference region may be a region in the image frame or a region in the strip.
203. The reference times of the one or more reference image frames, reference stripes, or reference regions are determined through preset parameters, where the preset parameters include any one or more of the following: a network abstraction layer parsing parameter, a slice header parsing parameter, a reference picture list modification parameter, and a reference image frame marking parameter.
For example, in one embodiment, the reference position may include a reference image frame, a reference stripe, or a reference region. After one or more reference image frames, reference stripes, or reference regions are determined from the video code stream, the number of times each of them is referred to by the object to be decoded needs to be determined further, so as to obtain the number of times each reference image frame, reference stripe, or reference region will be read.
It should be noted that, when determining the reference times of each reference image frame, reference stripe, or reference region, the reference times may be determined through preset parameters, where the preset parameters may include any one or more of the following: a network abstraction layer parsing parameter, a slice header parsing parameter, a reference picture list modification parameter, a reference image frame marking parameter, and the like. For example, the network abstraction layer parsing parameter may be the nal_unit() function, the slice header parsing parameter may be the slice_header() function, the reference picture list modification parameter may be the ref_pic_list_modification() function, and the reference image frame marking parameter may be the dec_ref_pic_marking() function.
For example, taking H.264 as an example, when performing a rough frame-level analysis, parsing the NAL unit header information or slice header information carried by the image frames makes it possible to recognize which reference image frames are referred to multiple times; for instance, this can be determined in advance by using information such as the nal_ref_idc variable in the nal_unit() function, the num_ref_idx_active_override_flag variable in the slice_header() function, the ref_pic_list_modification() function, and the dec_ref_pic_marking() function.
For example, the nal_unit() function separates the NAL units beginning with the start codes 00 00 00 01 and 00 00 01 from the H.264 stream and then fills in the length of each NAL unit. The nal_ref_idc variable represents the reference level, i.e., whether the content may be referenced by other image frames; the higher the reference level, the more important the reference image frame.
The num_ref_idx_active_override_flag variable indicates whether the number of actually available reference image frames for the current image frame needs to be overridden. The syntax elements num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1, which appear in the picture parameter set, specify the number of reference frames actually available in the current reference image frame queue. These syntax elements may be overridden at the slice header to give a particular image frame greater flexibility, and whether such an override applies to a slice is known from the num_ref_idx_active_override_flag variable.
The ref_pic_list_modification() function is the reference picture list modification function, which can be stored in the structure of the slice header and is defined as follows: the reference picture list RefPicList0 is modified when ref_pic_list_modification_flag_l0 is 1, and the reference picture list RefPicList1 is modified when ref_pic_list_modification_flag_l1 is 1. The dec_ref_pic_marking() function identifies the decoded reference image frames, and the marking operation is used to move reference image frames into and out of the reference image frame queue, specifying the marking status of the reference pictures.
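To make the pre-analysis step more concrete, here is a minimal sketch (assumed code, not taken from the patent) that extracts nal_ref_idc and nal_unit_type from an H.264 Annex-B buffer. It relies only on the standard one-byte NAL unit header layout (forbidden_zero_bit, nal_ref_idc, nal_unit_type) and on the 00 00 01 / 00 00 00 01 start codes; a nonzero nal_ref_idc is one of the hints the rough frame-level analysis can use.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Scan an H.264 Annex-B buffer and report nal_ref_idc for every NAL unit.
// nal_ref_idc > 0 means the content may be used as a reference, which is
// one of the hints the rough frame-level analysis relies on.
void ScanNalHeaders(const std::vector<uint8_t>& bs) {
    for (std::size_t i = 0; i + 3 < bs.size(); ++i) {
        // 3-byte or 4-byte start code (00 00 01 / 00 00 00 01).
        const bool sc3 = bs[i] == 0 && bs[i + 1] == 0 && bs[i + 2] == 1;
        const bool sc4 = i + 4 < bs.size() && bs[i] == 0 && bs[i + 1] == 0 &&
                         bs[i + 2] == 0 && bs[i + 3] == 1;
        if (!sc3 && !sc4) continue;
        const std::size_t hdr = i + (sc4 ? 4 : 3);   // first byte after start code
        if (hdr >= bs.size()) break;
        const uint8_t b = bs[hdr];
        const int nal_ref_idc   = (b >> 5) & 0x03;   // bits 6..5 of the header byte
        const int nal_unit_type = b & 0x1F;          // bits 4..0 of the header byte
        std::printf("NAL at %zu: type=%d nal_ref_idc=%d%s\n",
                    hdr, nal_unit_type, nal_ref_idc,
                    nal_ref_idc ? " (may be referenced)" : "");
        i = hdr;                                     // continue after the header
    }
}
```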
204. If the reference times are multiple, the one or more reference image frames, reference stripes, or reference regions whose reference times are multiple are stored in a system cache and also stored in a dynamic random access memory, according to the preset power consumption threshold.
For example, the preset memory may include a first memory and a second memory, where the power consumption of the first memory is greater than that of the second memory. The first memory may include a dynamic random access memory disposed outside the video decoding apparatus, and the second memory may include a system cache disposed outside the video decoding apparatus. After the number of times that the one or more reference image frames, reference stripes, or reference regions are referred to by the object to be decoded is determined, if the reference times are multiple, the multiply-referenced reference image frames, reference stripes, or reference regions may be stored in the system cache and also stored in the dynamic random access memory, to be read when the video decoding apparatus decodes.
It should be noted that the amount of data of the multiply-referenced reference image frames, reference stripes, or reference regions stored in the system cache can be adjusted according to the size of the preset power consumption threshold.
It can be understood that, when the multiply-referenced reference image frames, reference stripes, or reference regions are stored in both the system cache and the dynamic random access memory, they may be stored in the system cache first and then in the dynamic random access memory, stored in both at the same time, or stored in the dynamic random access memory first and then in the system cache.
It should be noted that the reference times of the reference image frame, the reference stripe, or the reference region may be times of being referred to by the same image frame to be decoded, the stripe to be decoded, or the region to be decoded, or times of being referred to by other image frames to be decoded, the stripes to be decoded, or the regions to be decoded in one image group. When the reference image frame, the reference slice, or the reference area, which is referred to a plurality of times, is stored in the system cache and stored in the first memory, the storage may be performed in units of frames, in units of slices, or in units of areas.
It should be noted that, in the embodiment of the present application, the power consumption of the dynamic random access memory is greater than the power consumption of the system cache, and the power consumption generated when the reference image frame, the reference stripe, or the reference region is stored in and read from the dynamic random access memory and the system cache is less than the preset power consumption threshold, so that the power consumption when reading and writing data can be reduced. The preset power consumption threshold may be considered as the power consumption generated when the reference image frame, the reference stripe, or the reference region is entirely stored in and read from the DRAM.
205. If the reference times are multiple, the one or more reference image frames, reference stripes, or reference regions whose reference times are multiple are stored in a system buffer memory according to the preset power consumption threshold.
For example, the first memory may include a dynamic random access memory disposed outside the video decoding apparatus, and the second memory may include a system buffer memory disposed outside the video decoding apparatus. After the number of times that the one or more reference image frames, reference stripes, or reference regions are referred to by the object to be decoded is determined, if the reference times are multiple, the multiply-referenced reference image frames, reference stripes, or reference regions may be stored in the system buffer memory, to be read when the video decoding apparatus decodes.
It should be noted that the amount of data of the multiply-referenced reference image frames, reference stripes, or reference regions stored in the system buffer memory can be adjusted according to the size of the preset power consumption threshold.
It should be noted that the reference times of the reference image frame, the reference stripe, or the reference region may be times of being referred to by the same image frame to be decoded, the stripe to be decoded, or the region to be decoded, or times of being referred to by other image frames to be decoded, the stripes to be decoded, or the regions to be decoded in one image group. When a reference image frame, a reference slice, or a reference area, which is referenced a plurality of times, is stored in the system buffer memory, the storage may be performed in units of frames, in units of slices, or in units of areas.
It should be noted that, in the embodiment of the present application, the power consumption of the dynamic random access memory is greater than the power consumption of the system buffer memory, and the power consumption generated when the reference image frame, the reference stripe, or the reference region is stored in and read from the dynamic random access memory and the system buffer memory is less than the preset power consumption threshold, so that the power consumption when reading and writing data can be reduced. The preset power consumption threshold may be considered as the power consumption generated when the reference image frame, the reference stripe, or the reference region is entirely stored in and read from the DRAM.
206. If the reference times are one, the one or more reference image frames, reference stripes, or reference regions whose reference times are one are stored in the dynamic random access memory according to the preset power consumption threshold.
For example, the first memory may include a dynamic random access memory disposed outside the video decoding apparatus. After the number of times that the one or more reference image frames, reference stripes, or reference regions are referred to by the object to be decoded is determined, if the reference times are one, the reference image frames, reference stripes, or reference regions that are referenced once are stored in the dynamic random access memory, to be read when the video decoding apparatus decodes.
It should be noted that the reference times of the reference image frame, the reference stripe, or the reference area may be times of being referred to by the same image frame to be decoded, stripe to be decoded, or area to be decoded, or times of being referred to by other image frames to be decoded, stripes to be decoded, or areas to be decoded in an image group. When the reference image frame, the reference slice, or the reference area, which is referred to once, is stored in the dynamic random access memory, the reference image frame, the reference slice, or the reference area may be stored in units of frames, slices, or areas.
It should be noted that, in the embodiments of the present application, a reference image frame, reference stripe, or reference region whose reference times are one is read once when it is referenced by the object to be decoded (for example, a reference image frame referenced once is read once when referenced by the image frame to be decoded, a reference stripe referenced once is read once when referenced by the stripe to be decoded, and a reference region referenced once is read once when referenced by the region to be decoded).
In the embodiment of the application, the power consumption generated when the reference image frame, reference stripe, or reference region is stored in and read from the dynamic random access memory is less than the preset power consumption threshold, so the power consumption when reading and writing data can be reduced. The preset power consumption threshold may be considered as the power consumption generated when the reference image frame, the reference stripe, or the reference region is entirely stored in and read from the DRAM.
For example, in one embodiment, the first memory may comprise a dynamic random access memory disposed outside the video decoding apparatus, i.e., the first memory may comprise a DRAM disposed outside the video decoding apparatus, and the second memory may comprise a system cache or system buffer memory disposed outside the video decoding apparatus, i.e., the second memory may comprise a Sys$ or SysBuf disposed outside the video decoding apparatus. Of course, the second memory may also be another low-power memory, and so on.
In the embodiment of the present application, the Sys$ or SysBuf is composed of a plurality of SRAMs, and the first memory may be a DRAM whose power consumption is greater than that of the Sys$ or SysBuf outside the video decoding apparatus. The power consumption of storing the reference image frame, reference stripe, or reference region in, and reading it from, the Sys$ and the DRAM is less than the preset power consumption threshold, or the power consumption of storing it in and reading it from the SysBuf and the DRAM is less than the preset power consumption threshold. Therefore, the power consumption when reading and writing data can be reduced, and the preset power consumption threshold may be considered as the power consumption generated when the reference image frame, reference stripe, or reference region is entirely stored in and read from the DRAM.
Referring to fig. 8, fig. 8 is a schematic diagram comparing the energy consumed by the SRAM and the DRAM when reading data, according to an embodiment of the present disclosure. The energy consumed for reading data from SRAM differs from that consumed for reading data from DRAM by roughly a factor of one hundred, i.e., the power consumption of reading data from SRAM is far less than that of reading data from DRAM. By storing a reference image frame, reference stripe, or reference region that is referenced multiple times in the Sys$ and the DRAM, or only in the SysBuf, and reading its image data from the Sys$ or SysBuf, the power consumption of reading data can be reduced. It follows that storing the reference image frames, reference stripes, or reference regions that are referenced multiple times by the object to be decoded in Sys$ and DRAM, or only in SysBuf, can greatly reduce the energy consumption required for the overall video playback.
During decoding, the amount of data moving in and out of high-power storage (such as the DRAM) while the video code stream is parsed is limited, so the parsing step does not become a power consumption bottleneck, whether it is NAL unit parsing and slice header parsing or entropy decoding, in which motion vectors and other symbols are interpreted from the video code stream. In the decoding process, the part that requires the highest data throughput is the data needed by the motion compensation step, which demands a large bandwidth from the DRAM. However, since the video code stream is analyzed in advance, the reference image frames that are referenced many times, or the parts of a reference image frame that are used many times, can be pre-stored in low-power storage (such as Sys$ or SysBuf composed of SRAM), thereby ensuring that the power consumption of reading data is kept near the expected value during video decoding and enabling the hardware or software of the video decoding apparatus to complete the decoding operation as quickly as possible.
For example, referring to fig. 9, fig. 9 is a schematic view of a scenario in which image data to be referenced multiple times is stored in a system cache or a system buffer memory according to an embodiment of the present application. As shown in fig. 9, storing the image data that is referred to multiple times by the object to be decoded in Sys$ and DRAM, or only in SysBuf, can greatly reduce the energy consumption required for the entire video playback.
For another example, as shown in fig. 6, after performing a rough frame-level analysis or estimation, the relevant hardware or software of the video decoding apparatus may determine that some reference image frames will be used multiple times, that is, that some reference image frames are referred to multiple times by image frames to be decoded. These reference image frames whose reference times are multiple may be stored in the power-saving Sys$ and also in the DRAM, or only in the power-saving SysBuf, so that the video decoding apparatus can maintain the expected low-power state as much as possible, prolonging the usable time of the video decoding system and preventing the system from overheating.
For example, in one embodiment, by storing an entire reference image frame of the video code stream, or the image data in a reference image frame that is referred to multiple times by the object to be decoded, in a low-power storage space (for example, in Sys$ and DRAM, or only in SysBuf), the power consumption of the entire video decoding system while the video decoding apparatus operates can be effectively controlled, further improving the user experience. Which image data is suitable for writing into low-power memory such as Sys$ or SysBuf can be determined by actual analysis performed by the hardware or software of the video decoding apparatus, or derived from known factors such as the application scene or the group-of-pictures structure.
For example, when only a rough judgment of reference image frames is made, the reference times of these reference image frames are generally used as priorities: the higher the reference times, the higher the priority of being stored in Sys$ or SysBuf. For example, the reference image frame with the largest reference times is stored in Sys$ or SysBuf first, and the remaining reference image frames are stored in Sys$ or SysBuf in descending order of reference times, so that the power consumption and energy consumption during decoding can be brought as close as possible to the expected target. However, since this decision method is rather rough, it cannot guarantee in 100% of cases that the desired power-saving target is reached or bettered.
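The priority rule above can be sketched as a simple greedy placement, shown below under assumed data structures (this is an illustrative sketch, not the patent's implementation): sort the candidate reference image frames by reference times and fill the Sys$/SysBuf capacity from the most-referenced frame downward.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct RefFrame {
    int id;
    int refTimes;            // how many times other frames reference it
    std::size_t sizeBytes;   // frame-buffer footprint
    bool inLowPowerMemory;   // placed in Sys$ / SysBuf?
};

// Greedily place the most-referenced frames into the low-power memory until
// its capacity is exhausted; everything else stays only in DRAM.
void PlaceByReferencePriority(std::vector<RefFrame>& frames,
                              std::size_t lowPowerCapacityBytes) {
    std::sort(frames.begin(), frames.end(),
              [](const RefFrame& a, const RefFrame& b) {
                  return a.refTimes > b.refTimes;   // higher reference times first
              });
    std::size_t used = 0;
    for (RefFrame& f : frames) {
        f.inLowPowerMemory =
            f.refTimes > 1 && used + f.sizeBytes <= lowPowerCapacityBytes;
        if (f.inLowPowerMemory) used += f.sizeBytes;
    }
}
```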
Referring to fig. 10, fig. 10 is a scene schematic diagram for roughly analyzing the reference relationships of a plurality of image frames in a video code stream according to an embodiment of the present application. In fig. 10, taking an H.264 video code stream as an example, a rough analysis of the code stream is performed; by analyzing the NAL unit header information or slice header information, it can be determined that some image frames are referred to by other image frames more times than others are. For example, for the code stream corresponding to the 7 image frames in fig. 10, a rough analysis yields the reference picture list and allows the amount of DRAM data traffic that can be saved to be calculated. Then, the image frames that are most likely to satisfy the power target are selected to be stored in Sys$ or SysBuf.
For example, by storing one or more reference image frames, reference slices, or reference regions that are referenced multiple times in the Sys $ and in the DRAM, the video decoding apparatus can read from the DRAM and the Sys $ respectively when decoding.
If the reference number is one, one or more reference image frames, reference stripes, or reference areas with one reference number may be stored in the DRAM for reading when the video decoding apparatus decodes the reference image frames, the reference stripes, or the reference areas.
For example, in fig. 10, the reference number of the image frame 0 is one, the reference number of the image frame 1 is three, the reference number of the image frame 2 is three, the reference number of the image frame 3 is one, the reference list includes the image frame 0, the image frame 1, the image frame 2, and the image frame 3, and since the reference numbers of the image frame 1 and the image frame 2 are three, the image frame 1 and the image frame 2 are stored in the Sys $ and also stored in the DRAM, or only the image frame 1 and the image frame 2 are stored in the SysBuf. Image frame 0 and image frame 3 are stored in DRAM.
For example, in one embodiment, the access power consumption models of the DRAM, the system bus, the Sys $ and the SysBuf are obtained through some simple measurements or experiments, and the details are not described herein. Assuming that power consumption models for data flowing into and out of the DRAM, the system bus, the Sys $ and the SysBuf are already available, it can be deduced how much power consumption is reduced for a given reduction in the DRAM data access amount, or, conversely, how much the DRAM data access amount must be reduced to reach a given power consumption target. According to the code stream header information of the image frames or the strips, it can be determined which image frames or strips need to be written into the Sys $ or SysBuf so that the expected power consumption reduction in decoding can be achieved. The coarse analysis approach may achieve the desired power reduction but not the optimal power reduction.
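As an illustration of how such a model can be used in both directions, the sketch below assumes placeholder per-byte access energies (chosen only to reflect the roughly 100:1 DRAM-to-SRAM gap mentioned elsewhere in this application, not measured values) and derives either the saving from redirecting a given amount of traffic, or the traffic that must be redirected to hit a target saving.

```python
# Placeholder per-byte access energies in picojoules; real values come from measurement.
ENERGY_PJ_PER_BYTE = {"dram": 100.0, "sys_cache": 1.0}

def saving_pj(bytes_redirected, reads_per_byte):
    """Energy saved when the given bytes are served from Sys$ instead of DRAM."""
    per_byte = ENERGY_PJ_PER_BYTE["dram"] - ENERGY_PJ_PER_BYTE["sys_cache"]
    return bytes_redirected * reads_per_byte * per_byte

def bytes_to_redirect(target_saving_pj, reads_per_byte):
    """Inverse of the model: how much DRAM traffic must move to Sys$ for a target saving."""
    per_byte = ENERGY_PJ_PER_BYTE["dram"] - ENERGY_PJ_PER_BYTE["sys_cache"]
    return target_saving_pj / (reads_per_byte * per_byte)
```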
For example, if reference image frames, reference stripes or reference regions with multiple reference times are stored in advance in the Sys $ and the DRAM outside the video decoding apparatus, or stored in advance in the SysBuf outside the video decoding apparatus, please refer to fig. 11 to 13 together. Fig. 11 is an architecture diagram of a video decoding system using a system cache according to an embodiment of the present application. Fig. 12 is a schematic diagram of another architecture of a video decoding system using a system cache according to an embodiment of the present application. Fig. 13 is a schematic architecture diagram of a video decoding system using a system buffer memory according to an embodiment of the present application. Stored in the Sys $ or SysBuf are reference image frames, reference strips, or reference regions that are referenced multiple times.
Taking fig. 11 as an example, Sys $ may read data from the DRAM through DramC, and the data that Sys $ reads from the DRAM through DramC may be read by the central processor and the video decoding apparatus. Fig. 11 is merely one configuration of the video decoding system when the system cache is used, and other configurations may be adopted, for example, the video decoding system may further include a display processor and the like. When the video decoding apparatus needs to decode, it can directly read the image data of the reference image frame, reference strip or reference region with multiple reference times that is stored in the Sys $; in addition, the image data of the reference image frame, reference strip or reference region with multiple reference times is also read from the DRAM through DramC and then read by the video decoding apparatus.
The access behavior of the image data can be predicted during video decoding, so that the corresponding reduction of power consumption according to requirements is realized, the storage mode of the image data is intelligently selected, and the power consumption of the video decoding device is further reduced. The position of image data storage can be changed according to the frame reference relation during decoding, so that the repeated reading times of the reference image frame stored in a low-power-consumption memory such as Sys $ and the like are properly increased, the power consumption is properly reduced, and the power consumption caused by data entering and exiting in the video decoding device can be ensured to be always maintained in a desired state. If the low power memory, such as Sys $ has high speed bandwidth at the same time, the bandwidth of the DRAM can be reduced even further.
It should be noted that the video decoding system in fig. 12 and fig. 13 is also only one of the architectures, and in a specific application, the modifications can be made according to actual requirements, such as adding a display processor, and the like.
207. Reading image data of a required reference image frame, reference strip or reference area from a preset memory; if image data of a reference image frame is read, decoding a block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of a reference strip is read, decoding the block to be decoded in the strip to be decoded according to the read image data of the reference strip; and if image data of a reference area is read, decoding the block to be decoded in the area to be decoded according to the read image data of the reference area.
For example, an image frame, strip or region may be divided into a plurality of non-overlapping blocks that form a rectangular array, where each block is a block of N × N pixels, such as a block of 4 × 4 pixels, a block of 32 × 32 pixels, a block of 128 × 128 pixels, and so on.
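A minimal sketch of such a partition into N × N blocks (the frame size used in the comment is only an example):

```python
def partition_into_blocks(frame_width, frame_height, n=32):
    """Split a frame into non-overlapping N x N blocks, returned as (x, y, width, height);
    blocks on the right and bottom edges are clipped to the frame boundary."""
    return [(x, y, min(n, frame_width - x), min(n, frame_height - y))
            for y in range(0, frame_height, n)
            for x in range(0, frame_width, n)]

# A 1920 x 1080 frame split into 32 x 32 blocks gives 60 x 34 = 2040 blocks,
# the bottom row being clipped to 32 x 24 pixels.
```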
When an image frame to be decoded, a stripe to be decoded or a block to be decoded in a region to be decoded is decoded, image data of a reference image frame, a reference stripe or a reference region to be referred to needs to be read from a preset memory. For example, if a block to be decoded needs to refer to image data of multiple reference image frames, reference stripes, or reference regions when decoding, the image data of those reference image frames, reference stripes, or reference regions may be read once from a first memory (e.g., DRAM) and multiple times from a second memory (e.g., Sys $).
Because SRAM is more expensive and DRAM is cheaper, the SRAM generally cannot be made very large when cost is taken into account, while the DRAM can be made larger. Therefore, in order to reduce the power consumption of reading data, the embodiment of the application can split the reads that would originally all go to the DRAM into several reads from the SRAM and several reads from the DRAM, which reduces the overall power consumption of reading data. It should be noted that the number of reads from the SRAM and the number of reads from the DRAM can be adjusted to meet different power consumption requirements.
For example, in one embodiment, when reading the image data of a reference image frame, reference stripe or reference region that is referred to multiple times, the image data may be read from Sys $ first, and when the number of reads is greater than or equal to a preset number threshold, reading switches to the DRAM for the image data that has not yet been read. When reading the same image data, the DRAM consumes more than 100 times as much energy as the SRAM. Therefore, by reading part of the image data of the reference image frame, reference stripe or reference region that is referred to multiple times from the Sys $ and the other part from the DRAM, the power consumption of reading the data can be reduced.
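The switch-over just described can be sketched as follows; `read_from_sys_cache`, `read_from_dram` and the threshold are hypothetical stand-ins for the real access paths and the preset number threshold of this embodiment.

```python
def read_multi_referenced_data(region, total_reads, sys_cache_read_threshold,
                               read_from_sys_cache, read_from_dram):
    """Serve the first reads of a multiply-referenced region from Sys$; once the preset
    number threshold is reached, fall back to DRAM for the remaining reads."""
    data = []
    for i in range(total_reads):
        if i < sys_cache_read_threshold:
            data.append(read_from_sys_cache(region))   # low-energy SRAM path
        else:
            data.append(read_from_dram(region))        # remaining reads go to DRAM
    return data
```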
For example, in one embodiment, the image data of the reference image frame, reference strip, or reference region that requires multiple references is read directly from the SysBuf. When reading the same image data, the DRAM consumes more than 100 times as much energy as the SRAM consumes. Therefore, by reading image data of a reference image frame, a reference band, or a reference region, which requires a plurality of reference times, from the SysBuf, power consumption for reading data can be reduced.
It should be noted that, if image data of a reference image frame is read, the block to be decoded in the image frame to be decoded is decoded according to the read image data of the reference image frame; if image data of a reference stripe is read, the block to be decoded in the stripe to be decoded is decoded according to the read image data of the reference stripe; and if image data of a reference area is read, the block to be decoded in the area to be decoded is decoded according to the read image data of the reference area.
Referring to fig. 14, fig. 14 is a schematic diagram of a power consumption curve when reading and writing data from the Sys $ or SysBuf provided in the embodiment of the present application. The video decoding apparatus replaces much of the DRAM power consumption with the power consumption of Sys $ or SysBuf, which can greatly reduce the power consumption.
208. And if the reference times of the image frame to be decoded, the stripe to be decoded or the area to be decoded are multiple times, storing the decoded block of the block to be decoded in a system cache and storing the decoded block in a dynamic random access memory.
For example, after a block to be decoded in an image frame to be decoded, a stripe to be decoded, or a region to be decoded has been decoded, if the decoded block will subsequently be referred to multiple times by other blocks to be decoded (e.g., blocks to be decoded in other image frames to be decoded, stripes to be decoded, or regions to be decoded), the decoded block is stored in the system cache and also stored in the dynamic random access memory.
209. And if the reference times of the image frame to be decoded, the stripe to be decoded or the area to be decoded are multiple times, storing the decoded block of the block to be decoded in a system buffer memory.
For example, after a block to be decoded in an image frame to be decoded, a slice to be decoded, or a region to be decoded has been decoded, if the decoded block will subsequently be referred to multiple times by other blocks to be decoded (for example, blocks to be decoded in other image frames to be decoded, slices to be decoded, or regions to be decoded), the decoded block is stored in the system buffer memory.
210. And if the reference times of the image frame to be decoded, the stripe to be decoded or the area to be decoded are one time, storing the block decoded by the block to be decoded in a dynamic random access memory.
For example, after a block to be decoded in an image frame to be decoded, a stripe to be decoded, or a region to be decoded has been decoded, if the decoded block will subsequently be referred to only once by another block to be decoded (for example, a block to be decoded in another image frame to be decoded, stripe to be decoded, or region to be decoded), the decoded block is stored in the dynamic random access memory.
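Steps 208 to 210 can be summarised in one write-back decision. The sketch below is an assumption-laden illustration: `sys_cache`, `sys_buffer` and `dram` are hypothetical handles to the three memories, and `use_system_buffer` selects between the Sys $-plus-DRAM scheme and the SysBuf-only scheme.

```python
def store_decoded_block(decoded_block, future_ref_count, use_system_buffer,
                        sys_cache, sys_buffer, dram):
    """Route a freshly decoded block according to how often it will be referenced later."""
    if future_ref_count > 1:
        if use_system_buffer:
            sys_buffer.write(decoded_block)      # step 209: SysBuf only
        else:
            sys_cache.write(decoded_block)       # step 208: system cache ...
            dram.write(decoded_block)            # ... plus a copy in DRAM
    else:
        dram.write(decoded_block)                # step 210: referenced once, DRAM only
```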
If there are still other blocks to be decoded in the image frame to be decoded, the stripe to be decoded or the area to be decoded, the other blocks to be decoded are decoded. If all the blocks to be decoded in the image frame, stripe or area to be decoded have been decoded, other image frames, stripes or areas are decoded, until all the image frames, stripes or areas to be decoded have been decoded.
It can be understood that the embodiments of the present application are based on the behavior of predicting data access (i.e., the behavior of repeatedly reading) during video decoding, so as to implement intelligent selection of data storage manner, so as to reduce the power consumption of the video decoding apparatus. The position of image data storage can be changed according to the frame reference relation during decoding, so that the repeated reading times of the reference image frame stored in a low-power-consumption memory such as Sys $ and the like are properly increased, the power consumption is properly reduced, and the power consumption caused by data entering and exiting in the video decoding device can be ensured to be always maintained in a desired state. If a low power memory, such as Sys $ has high speed bandwidth at the same time, the bandwidth of the DRAM can be reduced even further.
The embodiment of the application can ensure that the power consumption of the video decoding device is controllable, and can enable hardware or software of the video decoding device to finish decoding work as soon as possible, and fully utilizes the predictable behavior that the video decoding device can repeatedly read the reference image frame or the reference strip for many times to change the storage characteristic of the read data. The speed of reading data is not limited by power consumption, and thus the video decoding apparatus is not overheated. In addition, the SRAM in the Sys $ or SysBuf has low time delay during reading and writing, so that the processing frame rate can be improved, and the reaction time delay can be reduced. Due to the fact that power consumption can be greatly reduced, the service time of a battery in the video decoding device can be prolonged, and user experience is improved.
Referring to fig. 15, fig. 15 is a third flowchart illustrating a method for image processing in a video decoding apparatus according to an embodiment of the present disclosure. The method for processing the image in the video decoding device can be applied to the video decoding device. The flow of the method for image processing in the video decoding device may include:
301. and acquiring a video code stream.
The specific implementation of step 301 can refer to the embodiment of step 101, and is not described herein again.
302. One or more reference motion vectors are obtained from the video bitstream.
For example, by parsing the video code stream, one or more reference motion vectors may be obtained from the video code stream. Each reference motion vector corresponds to a reference block, and the reference block needs to be referred to when a block to be decoded is decoded. The relative displacement between the reference block and the block to be coded may be taken as the reference motion vector, and obtaining the corresponding reference block through the reference motion vector makes fine-grained analysis possible. The number of times each reference block is referred to by other blocks to be decoded within one group of pictures (image group) may be one or more. For example, when a reference block is referred to by only one block to be decoded, its reference count is one, and when the reference block is referred to by a plurality of blocks to be decoded, its reference count is multiple.
303. And acquiring one or more corresponding reference blocks from one or more image frames of the current video code stream according to the one or more reference motion vectors.
For example, after one or more reference motion vectors are obtained from the video bitstream, since each reference motion vector corresponds to one reference block, one or more corresponding reference blocks may be obtained from one or more image frames of the video bitstream according to the one or more reference motion vectors.
304. The number of references to one or more reference blocks is determined.
For example, after acquiring one or more corresponding reference blocks from one or more image frames of a video code stream according to one or more reference motion vectors, the number of times the one or more reference blocks are referred to, i.e., the number of times the one or more reference blocks are referred to by a block to be decoded or a sub-block of the block to be decoded, may be determined.
For example, referring to fig. 16, fig. 16 is a schematic view of a scene of a reference relationship between blocks in image frames in an image group in a video bitstream according to an embodiment of the present disclosure. Fig. 16 illustrates an example in which 9 image frames are included in one image group. In other embodiments, the number of image frames included in an image group may be adjusted according to specific needs. The display order of the image frames may be the same as or different from the decoding order within an image group. The display order and decoding order of the image frames in the image group as shown in fig. 16 are different.
From the arrow directions in fig. 16, the number of times each block is referred to by other blocks in the group of pictures can be determined; when a certain block is referred to one or more times by other blocks, the block can serve as a reference block. For example, the reference count of the reference block in the I frame in fig. 16 is four, the reference count of the reference block in the B frame at display order 2 is two, the reference count of the reference block in the B frame at display order 3 is one, the reference count of the reference block in the P frame at display order 4 is five, the reference count of the reference block in the B frame at display order 6 is two, and so on.
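One way to tally these per-block reference counts, assuming the reference relationships (the arrows in fig. 16) have already been resolved to (frame, block) pairs, is a plain counter:

```python
from collections import Counter

def count_block_references(resolved_references):
    """resolved_references: iterable of (ref_frame_id, ref_block_id) pairs, one per block
    or sub-block that uses that reference. Returns a reference count per (frame, block)."""
    return Counter(resolved_references)

# Mirroring fig. 16: the I-frame block would appear four times,
# the P-frame block at display order 4 five times, and so on.
```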
305. According to the preset power consumption threshold value and the reference times of the one or more reference blocks, one or more reference blocks needing to be stored in a preset memory are determined from the one or more reference blocks and stored in the preset memory, and the power consumption generated by storing the reference blocks in the preset memory and reading the reference blocks from the preset memory is smaller than or equal to the preset power consumption threshold value.
For example, in the embodiment of the present application, one or more reference blocks that need to be stored in the preset memory may be determined from one or more reference blocks, for example, the determined reference blocks that need to be stored in the preset memory may be reference blocks that have multiple reference times, that is, the reference blocks that are referred to by the block to be decoded or the sub-block of the block to be decoded multiple times are determined to be stored in the preset memory, so as to facilitate reading of image data of the reference blocks when decoding a subsequent block to be decoded or a sub-block of the block to be decoded. It should be noted that, the power consumption generated by storing the reference block in the preset memory and reading the reference block from the preset memory is less than or equal to the preset power consumption threshold, and the power consumption of the video decoding apparatus can be reduced by reading and writing data using the preset memory with low power consumption.
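A sketch of this selection under a power budget is shown below; the `access_power` callback stands in for the power consumption model discussed earlier, and the greedy order by reference count is an assumption of this sketch rather than a fixed rule of this application.

```python
def select_reference_blocks(ref_blocks, power_threshold, access_power, capacity_bytes):
    """Move the most-referenced blocks into the preset (low-power) memory until the
    modelled access power is at or below the preset power consumption threshold,
    or the low-power capacity is exhausted.

    ref_blocks: list of dicts {"id": ..., "ref_count": ..., "size": ...}
    access_power(in_dram, in_low_power) -> modelled power for this placement.
    """
    in_low_power, in_dram = [], list(ref_blocks)
    remaining = capacity_bytes
    for block in sorted(ref_blocks, key=lambda b: -b["ref_count"]):
        if access_power(in_dram, in_low_power) <= power_threshold:
            break                                    # preset threshold already met
        if block["ref_count"] > 1 and block["size"] <= remaining:
            in_dram.remove(block)
            in_low_power.append(block)
            remaining -= block["size"]
    return in_low_power, in_dram
```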
306. And decoding the block to be decoded or the sub-blocks of the block to be decoded according to the reference block.
For example, the reference block stored in the preset memory may be referred to by the block to be decoded or the sub-block of the block to be decoded a plurality of times, and thus the reference block stored in the preset memory may be read a plurality of times when decoding is performed. After the image data of the reference block is read from the preset memory, the block to be decoded or the sub-block of the block to be decoded can refer to the image data of the reference block, that is, the block to be decoded or the sub-block of the block to be decoded can be decoded according to the read image data of the reference block.
It is understood that, in the embodiment of the present application, the video decoding apparatus may obtain a video bitstream, and obtain one or more reference motion vectors according to the video bitstream. Then, acquiring one or more corresponding reference blocks from one or more image frames of the video code stream according to the one or more reference motion vectors; determining a reference number of times of one or more reference blocks; according to the preset power consumption threshold value and the reference times of the one or more reference blocks, one or more reference blocks needing to be stored in a preset memory are determined from the one or more reference blocks and stored in the preset memory, and the power consumption for storing the reference blocks in the preset memory and reading the reference blocks from the preset memory is smaller than or equal to the preset power consumption threshold value. And then, decoding the block to be decoded or the sub-blocks of the block to be decoded according to the reference block. In other words, in the embodiment of the present application, the purpose of reducing the power consumption of the video decoding apparatus is achieved by storing the determined reference image frame or the determined image data of the reference stripe, which needs to be stored in the preset memory, in the preset memory with lower power consumption. Therefore, the embodiment of the application can reduce the power consumption of the video decoding device.
Referring to fig. 17, fig. 17 is a fourth flowchart illustrating a method for image processing in a video decoding apparatus according to an embodiment of the present disclosure. The method for processing the image in the video decoding device can be applied to the video decoding device. The flow of the method for image processing in the video decoding device may include:
401. and acquiring a video code stream.
The specific implementation of step 401 can refer to the embodiment of step 101, and is not described herein again.
402. The video code stream is entropy decoded to obtain one or more motion vector differences (MVDs).
For example, after the video code stream is obtained, entropy decoding may be performed on the video code stream, for example, frame header information of an image frame, header information of a NAL unit, or a slice header is decoded, and after the entropy decoding, one or more motion vector differences may be obtained, and at the same time, a quantized residual may also be obtained. Wherein, the residual refers to a difference value between the block to be coded and one or more blocks with the minimum coding cost.
For example, in an embodiment, the entropy decoding 402 of the video code stream to obtain one or more motion vector difference values may include:
and entropy decoding the video code stream to obtain one or more motion vector difference values and a quantized first residual error.
For example, by entropy decoding the video code stream, one or more motion vector difference values may be obtained, and at the same time, a quantized first residual may be obtained. The quantized first residual refers to the first residual obtained by performing forward transform and quantization on the residual during encoding, where the residual may be the difference obtained by subtracting, pixel by pixel, the two-dimensional pixels of the searched (matched) block from the two-dimensional pixels of the block to be encoded.
403. One or more reference motion vectors are obtained based on the one or more motion vector difference values and the corresponding motion vector predictor.
For example, after entropy decoding is performed on a video code stream to obtain one or more motion vector difference values, one or more reference motion vectors may be obtained according to the one or more motion vector difference values and corresponding motion vector prediction values, for example, a sum obtained by adding the motion vector difference values and the motion vector prediction values is used as a reference motion vector.
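In other words, the reference motion vector is recovered component by component (a minimal sketch; a real codec additionally applies the rounding and clipping defined by the standard):

```python
def reconstruct_motion_vector(mvd, mvp):
    """Reference MV = motion vector difference + motion vector predictor,
    applied independently to the horizontal and vertical components."""
    return (mvd[0] + mvp[0], mvd[1] + mvp[1])

# Example: mvd = (3, -1) and mvp = (12, 4) give a reference motion vector of (15, 3).
```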
404. One or more reference blocks are obtained from one or more image frames of the video code stream according to one or more reference motion vectors.
For example, each reference motion vector corresponds to a reference block, so that one or more corresponding reference blocks can be determined from one or more image frames of the video code stream according to the one or more reference motion vectors, and thus the corresponding reference block can be obtained, where the reference block is a decoded block.
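A sketch of locating such a reference block in a decoded reference frame (integer-pixel motion only; sub-pixel interpolation and the exact clipping rules of a real codec are omitted, and the list-of-rows frame representation is an assumption of this sketch):

```python
def fetch_reference_block(ref_frame, mv, x, y, w, h):
    """Return the w x h reference block at the decoded block's position (x, y)
    displaced by the reference motion vector, clipped to the frame boundary.
    ref_frame is a 2-D list of pixel rows."""
    frame_h, frame_w = len(ref_frame), len(ref_frame[0])
    rx = min(max(x + mv[0], 0), frame_w - w)
    ry = min(max(y + mv[1], 0), frame_h - h)
    return [row[rx:rx + w] for row in ref_frame[ry:ry + h]]
```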
405. The number of references to one or more reference blocks is determined.
For example, after acquiring a reference block in an image frame, the reference times of one or more reference blocks may be determined, where the reference times refer to the times that the reference block is referred to by a block to be decoded or a sub-block in the block to be decoded.
For example, after performing the fine block-level analysis or estimation of fig. 16 through the related hardware or software of the video decoding apparatus, it can be determined that some regions of some image frames will be referred to many times by regions of other image frames during decoding, or that some blocks of some image frames will be referred to many times by blocks of other image frames during decoding. Such regions or blocks are suitable for being stored both in a power-saving low-power memory such as Sys $ and in the DRAM, or directly stored only in a low-power memory such as SysBuf, so as to keep the data throughput at the expected low-power value during video decoding, increase the service time of the video decoding system, and prevent the video decoding system from overheating.
For example, if the video code stream is further analyzed, the motion vector differences obtained by entropy decoding during decoding are taken out to restore the reference motion vectors, which achieves fine reference motion vector analysis. It can then be further analyzed which areas of which reference image frames are used by several adjacent image frames, and the number of times an area of a certain image frame is used as a reference area can be distinguished more accurately; these counts serve as the basis for priority ordering in a low-energy-consumption memory. For example, referring to fig. 18, fig. 18 is a schematic view of a scene for finely analyzing reference relationships of blocks in a plurality of image frames in a video bitstream according to an embodiment of the present application. In fig. 18, taking the h.264 video code stream as an example, the h.264 video code stream is subjected to fine analysis, and through the analysis of the slice body, it can be determined that some blocks are referred to by other blocks more often than others. For example, for the code stream corresponding to the 5 image frames in fig. 18, the video code stream corresponding to those 5 image frames is finely analyzed, and it can be determined, through reference motion vector analysis, how many times each portion (e.g., each macroblock, which may also be referred to as a block) of each image frame will be referred to by other blocks.
As shown in fig. 18, the reference times of the block 0, the block 2, and the block C in the image frame 0 are all one time, the reference time of the block 8 in the image frame 0 is two times, the reference time of the block 6 in the image frame 1 is two times, the reference time of the block 4 in the image frame 1 is three times, the reference time of the block 9 in the image frame 1 is four times, the reference times of the block 5, the block a, and the block D in the image frame 1 are all one time, and the reference times of the block 7, the block 3, and the block B in the image frame 2 are all one time.
406. If the reference times are multiple times, one or more reference blocks with the reference times being multiple times are stored in a system cache and stored in a dynamic random access memory according to a preset power consumption threshold.
For example, the preset memory may include a first memory and a second memory, and power consumption of the first memory is greater than power consumption of the second memory. The first memory may include a dynamic random access memory disposed external to the video decoding apparatus, and the second memory may include a system cache disposed external to the video decoding apparatus. After the reference times of the one or more reference blocks are determined, that is, the times that the one or more reference blocks are referred to by the block to be decoded or the sub-block in the block to be decoded are determined, if the reference times are multiple times, the one or more reference blocks with the multiple reference times may be stored in the system cache and stored in the dynamic random access memory.
For example, when it is known how much power consumption must be reduced for the video decoding system to keep operating, it can be deduced how much the bandwidth of the DRAM should be reduced. The required reduction in DRAM bandwidth can be pre-computed for each image frame. By pre-calculating which regions (reference blocks) in the image frame need to be stored in Sys $, the data throughput limit of the DRAM can be met or undershot as much as possible during the subsequent motion compensation, the limit being derived from how much power consumption or energy consumption needs to be reduced.
It should be noted that, in the embodiment of the present application, the power consumption of the dynamic random access memory is greater than the power consumption of the system cache. After one or more reference blocks that are referenced multiple times are fetched, i.e., one or more reference blocks that are referenced multiple times by the block to be decoded or a sub-block in the block to be decoded are fetched, they may be stored in a system cache and stored in a dynamic random access memory.
As shown in fig. 18, since the reference number of times of the block 8 in the image frame 0 is two, the reference number of times of the block 6 in the image frame 1 is two, the reference number of times of the block 4 in the image frame 1 is three, and the reference number of times of the block 9 in the image frame 1 is four, the block 8 in the image frame 0, the block 6 in the image frame 1, the block 4 in the image frame 1, and the block 9 in the image frame 1 are stored in the system cache and stored in the dynamic random access memory. It should be noted that, the power consumption generated by storing and reading one or more reference blocks with multiple reference times from the dynamic random access memory and the system cache is less than or equal to the preset power consumption threshold, so that the power consumption in reading data can be reduced.
407. And if the reference times are multiple times, storing one or more reference blocks with multiple reference times in a system buffer memory according to a preset power consumption threshold.
For example, the preset memory may include a second memory including a system buffer memory provided outside the video decoding apparatus. After the reference times of the one or more reference blocks are determined, that is, the times that the one or more reference blocks are referred to by the block to be decoded or the sub-block in the block to be decoded are determined, if the reference times are multiple, the one or more reference blocks which are referred to multiple times are stored in a system buffer memory according to a preset power consumption threshold.
For example, when it is known how much power consumption must be reduced for the video decoding system to keep operating, it can be deduced how much the bandwidth of the DRAM should be reduced. The required reduction in DRAM bandwidth can be pre-computed for each image frame. By pre-calculating which regions (reference blocks) in the image frame need to be stored in the SysBuf, the data throughput limit of the first memory can be met or undershot as much as possible during the subsequent motion compensation, the limit being derived from how much power consumption or energy consumption needs to be reduced.
408. And if the reference times are once, storing one or more reference blocks with the reference times of once in the dynamic random access memory according to a preset power consumption threshold value.
For example, the preset memory may include a first memory, and the first memory may include a dynamic random access memory provided outside the video decoding apparatus. After the reference times of the one or more reference blocks are determined, that is, the times that the one or more reference blocks are referred to by the to-be-decoded block or the sub-blocks in the to-be-decoded block are determined, if the reference times are one, the one or more reference blocks with the reference times being one are stored in a dynamic random access memory according to a preset power consumption threshold value so as to be convenient for reading when a video decoding device decodes. By storing the reference block with the reference frequency of multiple times in the low-power-consumption memory and storing the reference block with the reference frequency of one time in the dynamic random access memory, the power consumption generated when data is read can be reduced on the whole.
For example, the first memory may include a DRAM provided outside the video decoding apparatus, and the second memory may include a system cache or a system buffer memory provided outside the video decoding apparatus, i.e., the second memory may include a Sys $ or SysBuf provided outside the video decoding apparatus. Of course, the second memory may also be another low power consumption memory, etc. The power consumption of the DRAM is greater than the power consumption of Sys $ or SysBuf. Assuming that power consumption models for data flowing into and out of the DRAM, the system bus, the Sys $ and the SysBuf are already available, it can be deduced how much power consumption is reduced for a given reduction in the DRAM data access amount, or, conversely, how much the DRAM data access amount must be reduced to reach a given power consumption target. According to the code stream header information of the image frame or the stripe, it can be determined which blocks need to be written into the Sys $ or the SysBuf so that the expected power consumption reduction in decoding can be achieved.
The reference motion vector information decoded from the detailed information of the code stream is used to determine which blocks are suitable for being written into Sys $ or SysBuf. Generally, the more times a reference block is referenced by a block to be decoded or a sub-block within a block to be decoded, the more suitable it is to be written into the Sys $ or SysBuf, and the greater the decoding power reduction that can be achieved with a smaller Sys $ or SysBuf footprint. The fine analysis method only seeks to achieve the expected power consumption/energy consumption reduction, and does not pursue the optimal power consumption/energy consumption reduction. As long as the reduction in data traffic obtained by redirecting the reconstructed areas/regions to the Sys $ or SysBuf is calculated, it can be deduced how much power/energy consumption is saved.
Referring to fig. 8, the energy of reading SRAM and the energy of reading DRAM differ by a factor of about 100, i.e., the energy of reading SRAM is much smaller than the energy of reading DRAM. By storing a reference block with multiple reference times both in the Sys $ and in the DRAM, or storing such a reference block in the SysBuf (the Sys $ or the SysBuf is built from a plurality of SRAMs), the overall power consumption of reading the data can be reduced when the image data of the reference block is read from the Sys $ and the DRAM, or from the SysBuf.
409. Reading the image data of the required reference block from a preset memory, and decoding the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block.
For example, when a block to be decoded or a sub-block in the block to be decoded needs to be decoded, image data of a reference block needs to be referred to, and at this time the image data of the reference block needs to be read. A block may include a plurality of sub-blocks arranged in a rectangular array. For example, when reading the image data of a reference block: if the reference block is referred to once by the block to be decoded or a sub-block in the block to be decoded, the image data of the reference block is read directly from the first memory; if the reference block is referred to multiple times, the image data of the reference block may be read once from the first memory and the remaining reads may be served from Sys $; or, if the reference block is referred to multiple times, the image data of the reference block may be read from the SysBuf.
It is understood that, when the second memory is Sys $, the number of reads from Sys $ may be greater than, less than, or equal to the number of reads from the DRAM; how many reads are served from the DRAM and from the Sys $ respectively is set according to the specific scenario, and the present application is not limited thereto. Therefore, by reading part of the image data of the reference block from the Sys $ and another part from the DRAM, the power consumption of reading the data can be reduced. As can be seen from fig. 14, the video decoding apparatus replaces much of the DRAM power consumption with the power consumption of Sys $ or SysBuf, greatly reducing the power consumption.
For example, in an embodiment, the decoding of the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block in 409 may include:
carrying out inverse quantization and inverse transformation on the first residual error to obtain a second residual error;
obtaining a predicted value of the block to be decoded or a sub-block in the block to be decoded according to the reference motion vector and a reference block;
and acquiring the block to be decoded or the sub-block decoded by the sub-block in the block to be decoded according to the second residual and the predicted value of the block to be decoded or the sub-block in the block to be decoded.
For example, in an embodiment, the decoding the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block in 409 may further include:
and acquiring video stream decoding data according to the block decoded by the block to be decoded or the sub-block decoded by the sub-block in the block to be decoded.
For example, please refer to fig. 19, which is a schematic view of a scene decoded by the video decoding apparatus according to the embodiment of the present disclosure. The video code stream is described by taking an h.264 video code stream as an example: after entropy decoding is performed on the video code stream, one or more motion vector difference values and a quantized first residual are obtained. A reference motion vector can be obtained based on a motion vector difference and the corresponding motion vector predictor, so that the reference block to be motion compensated can be known more precisely. The entropy decoding can be realized by an independent hardware design, and can also be realized in software. The parsing of the current video code stream and the image buffering may be implemented in software by a driver or the Open Media Acceleration (OpenMAX) framework. After inverse quantization and inverse transformation are performed on the first residual, a second residual can be obtained. According to the reference motion vector (the relative displacement between the reference block and the block to be decoded or the sub-block in the block to be decoded) and the reference block, the prediction value of the block to be decoded or the sub-block in the block to be decoded can be obtained. It should be noted that the predicted value of the block to be decoded or the sub-block in the block to be decoded may be obtained in an intra prediction manner or a motion compensation manner. The inverse quantization and inverse transform, intra/inter mode selection, intra prediction, motion compensation, and deblocking filtering in the decoding process may be implemented by an Application Specific Integrated Circuit (ASIC).
After the predicted value of the block to be decoded or the sub-block in the block to be decoded is obtained, the second residual is added to the predicted value to obtain the decoded block or decoded sub-block (the actual value), and deblocking filtering is performed on the decoded block or sub-block to obtain smooth video stream decoding data.
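The reconstruction path just described can be sketched as follows. The inverse quantization is simplified here to a single scale factor and the inverse transform is passed in as a callback, because the real integer transforms are defined by the specific standard; both simplifications are assumptions of this sketch. Deblocking filtering would then be applied across the reconstructed block boundaries.

```python
def reconstruct_block(quantized_residual, qstep, prediction, inverse_transform):
    """Decoded block = clip(prediction + inverse_transform(dequantized residual))."""
    dequantized = [[c * qstep for c in row] for row in quantized_residual]
    second_residual = inverse_transform(dequantized)          # the "second residual"
    return [[max(0, min(255, p + r)) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, second_residual)]
```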
410. If the reference times of the block to be decoded or the sub-blocks in the block to be decoded are multiple times, the decoded block or the decoded sub-blocks are stored in a system cache and stored in a dynamic random access memory.

For example, after the block to be decoded or the sub-blocks in the block to be decoded are decoded, if the decoded block will subsequently be referred to multiple times by other blocks to be decoded, or the decoded sub-blocks will subsequently be referred to multiple times by sub-blocks in other blocks to be decoded, the decoded block or the decoded sub-blocks are stored in the system cache and stored in the dynamic random access memory, to be used as reference blocks when other blocks to be decoded or their sub-blocks are decoded.

411. And if the reference times of the block to be decoded or the sub-blocks in the block to be decoded are multiple times, storing the decoded block or the decoded sub-blocks in a system buffer memory.

For example, after the block to be decoded or the sub-blocks in the block to be decoded are decoded, if the decoded block will subsequently be referred to multiple times by other blocks to be decoded, or the decoded sub-blocks will subsequently be referred to multiple times by sub-blocks in other blocks to be decoded, the decoded block or the decoded sub-blocks are stored in the system buffer memory, to be used as reference blocks when other blocks to be decoded or their sub-blocks are decoded.

412. And if the reference times of the block to be decoded or the sub-blocks in the block to be decoded are one time, storing the decoded block or the decoded sub-blocks in a dynamic random access memory.

For example, after the block to be decoded or the sub-blocks in the block to be decoded are decoded, if the decoded block will subsequently be referred to only once by another block to be decoded, or the decoded sub-blocks will subsequently be referred to only once by sub-blocks in other blocks to be decoded, the decoded block or the decoded sub-blocks are stored in the dynamic random access memory, to be used as reference blocks when other blocks to be decoded or their sub-blocks are decoded.
If the image frame to be decoded, the stripe to be decoded or the region to be decoded still has other blocks to be decoded or sub-blocks of other blocks to be decoded, the other blocks to be decoded or sub-blocks of other blocks to be decoded are decoded. If all the blocks to be decoded or the sub-blocks of the blocks to be decoded in the image frame, the strip to be decoded or the area to be decoded are decoded, other image frames, strips or areas are decoded until all the image frames, strips or areas to be decoded are decoded.
It should be noted that the flow of the method for image processing in the video decoding apparatus in fig. 7 and the flow of the method for image processing in the video decoding apparatus in fig. 17 may be combined in one system without interfering with each other; that is, decoding of a video stream may switch from one flow to the other midway. A reasonable switching point is when a new image frame or slice starts to be decoded.
It can be understood that the embodiments of the present application are based on the fact that the data access behavior (i.e. the behavior of repeated reading) can be predicted during video decoding, so as to implement intelligent selection of the data storage manner, so as to reduce the power consumption of the video decoding apparatus. The position of image data storage can be changed according to the frame reference relation during decoding, so that the repeated reading times of the reference image frame stored in a low-power-consumption memory such as Sys $ or SysBuf are properly increased, the power consumption is properly reduced, and the power consumption caused by data entering and exiting in the video decoding device can be ensured to be always maintained in a desired state. If a low power memory, such as Sys $ has high speed bandwidth at the same time, the bandwidth of the DRAM can be reduced even further.
The embodiment of the application can ensure that the power consumption of the video decoding apparatus is controllable and that the hardware or software of the video decoding apparatus can finish the decoding work as soon as possible. By fully utilizing the predictable behavior that the video decoding apparatus repeatedly reads the reference block many times, the storage characteristics of the read data can be changed, and because the power spent on accessing data is saved, data access no longer becomes a power consumption bottleneck, so the running speed of the video decoding apparatus can be maintained while the power consumption is reduced. The speed of reading data is not limited by power consumption, and thus the video decoding apparatus does not overheat. In addition, the SRAM in the Sys $ or SysBuf has low latency during reading and writing, so the processing frame rate can be increased and the response latency can be reduced. Since the power consumption can be greatly reduced, the service time of the battery in the video decoding apparatus can be prolonged, improving the user experience.
It can be understood that, according to the embodiments of the present application, a target position or attribute of data reading may be selected according to a long-time playing requirement of a playing device and a larger power consumption caused by a predictable behavior. For example, data needing to be read repeatedly is read from Sys $ and DRAM, or is read from SysBuf, but not all data are read from DRAM, and since the same data are read, the power consumption of SRAM is far smaller than that of DRAM, so that the power consumption of reading data can be greatly reduced.
The embodiment of the present application takes video decoding as an example to describe in detail how to reduce the power consumption of reading data. In other embodiments, the method can be applied to all modules and applications that require high bandwidth but have predictable data access behavior, such as video encoding devices, frame rate up-conversion devices, and the like. The behavior of these modules and applications can generally be predicted, such as the number of repeated reads, and based on this behavior, corresponding storage characteristics can be allocated in advance, that is, data that will be read repeatedly is stored in a low-power-consumption memory. For example, the energy consumption corresponding to different levels of memory is selected according to the access time requirements of the image data of all frames or some frames, and for different energy consumption targets the number of reads from Sys $ and from DRAM can be reasonably allocated, or the number of reads from SysBuf can be reasonably allocated.
For example, the video encoding apparatus may also determine its data access behavior by analyzing the video stream in advance, and the frame rate enhancing apparatus may know, through simple analysis, which regions are used many times during processing, and so on. The method can also be applied to fixed Artificial Intelligence (AI) networks, where the repeatedly read part is the feature map and the access behavior of the AI network is predictable.
Referring to fig. 20, fig. 20 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure. The image processing apparatus 500 may include: the device comprises an acquisition module 501, a first determination module 502, a second determination module 503, a third determination module 504 and a decoding module 505.
An obtaining module 501, configured to obtain a video code stream;
a first determining module 502, configured to determine one or more reference positions from the video code stream;
a second determining module 503, configured to determine reference times of the one or more reference positions;
a third determining module 504, configured to determine, according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions, and store the reference position in the preset memory, where the power consumption resulting from storing the reference position in the preset memory and reading the reference position from the preset memory is less than or equal to the preset power consumption threshold; and
and a decoding module 505, configured to decode the object to be decoded according to the reference position.
In one embodiment, the reference position comprises a reference image frame, a reference strip or a reference region, and the first determining module 502 may be configured to:
and determining one or more reference image frames, reference strips or reference areas from the video code stream according to frame header information of the image frames in the video code stream or strip header information of one or more strips in the image frames.
In one embodiment, the reference position includes a reference image frame, a reference strip or a reference area, and the second determining module 503 may be configured to:
determining a number of references to the one or more reference image frames, reference slices or reference regions through preset parameters, the preset parameters including any one or more of: a network abstraction layer parsing parameter, a slice header parsing parameter, a reference image list modification parameter, and a reference image frame marking parameter.
In one embodiment, the preset memory includes a first memory and a second memory, and the power consumption of the first memory is greater than the power consumption of the second memory.
In one embodiment, the first memory comprises a dynamic random access memory disposed outside of the video decoding apparatus, the second memory comprises a system cache disposed outside of the video decoding apparatus, the reference location comprises a reference image frame, a reference slice, or a reference region, and the third determining module 504 may be configured to:
if the reference times are multiple times, storing one or more reference image frames, reference stripes or reference areas with the reference times being multiple times in the system cache according to the preset power consumption threshold value, and storing the reference image frames, the reference stripes or the reference areas in the dynamic random access memory.
In one embodiment, the second memory comprises a system buffer memory disposed outside the video decoding device, the reference location comprises a reference image frame, a reference slice, or a reference region, and the third determining module 504 may be configured to:
and if the reference times are multiple times, storing one or more reference image frames, reference strips or reference areas with the reference times being multiple times in the system buffer memory according to the preset power consumption threshold.
In one embodiment, the first memory comprises a dynamic random access memory disposed outside of the video decoding apparatus, the reference location comprises a reference image frame, a reference slice, or a reference region, and the third determining module 504 may be configured to:
and if the reference times are one time, storing one or more reference image frames, reference strips or reference areas with the reference times being one time in the dynamic random access memory according to the preset power consumption threshold.
In one implementation, the object to be decoded comprises a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded, and the decoding module 505 may be configured to:
reading image data of a required reference image frame, reference strip or reference area from the preset memory; if image data of a reference image frame is read, decoding a block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of a reference strip is read, decoding the block to be decoded in the strip to be decoded according to the read image data of the reference strip; and if image data of a reference area is read, decoding the block to be decoded in the area to be decoded according to the read image data of the reference area;
if the reference times of the image frame to be decoded, the stripe to be decoded or the area to be decoded are multiple times, storing the block decoded by the block to be decoded in the system cache and storing the block in the dynamic random access memory.
In one implementation, the object to be decoded comprises a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded, the decoding module 505 may be configured to:
reading image data of a required reference image frame, reference strip or reference area from the preset memory; if image data of a reference image frame is read, decoding a block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of a reference strip is read, decoding the block to be decoded in the strip to be decoded according to the read image data of the reference strip; and if image data of a reference area is read, decoding the block to be decoded in the area to be decoded according to the read image data of the reference area;
and if the reference times of the image frame to be decoded, the stripe to be decoded or the area to be decoded are multiple times, storing the decoded block of the block to be decoded in the system buffer memory.
In one implementation, the object to be decoded comprises a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded, the decoding module 505 may be configured to:
reading image data of a required reference image frame, reference strip or reference area from the preset memory; if image data of a reference image frame is read, decoding a block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of a reference strip is read, decoding the block to be decoded in the strip to be decoded according to the read image data of the reference strip; and if image data of a reference area is read, decoding the block to be decoded in the area to be decoded according to the read image data of the reference area;
and if the reference times of the image frame to be decoded, the stripe to be decoded or the area to be decoded are one time, storing the block decoded by the block to be decoded in the dynamic random access memory.
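The three decoding-module configurations above differ only in where the decoded block is written back. As a rough C++ sketch of that write-back rule (the type names and the helper below are illustrative assumptions, not part of the disclosed apparatus), the decision could look as follows:

```cpp
#include <cstdint>
#include <vector>

// Illustrative types; the disclosed apparatus is a hardware decoder, not this code.
struct DecodedBlock {
    std::vector<uint8_t> pixels;
    int refCount;  // how many later objects are expected to reference this block
};

enum class MemoryTarget { DramOnly, SystemCacheAndDram, SystemBufferOnly };

// Write-back rule sketched from the three configurations above:
// - referenced multiple times -> system cache (plus DRAM) or system buffer
// - referenced once           -> DRAM only
MemoryTarget chooseWriteBack(const DecodedBlock& block, bool hasSystemCache) {
    if (block.refCount > 1) {
        return hasSystemCache ? MemoryTarget::SystemCacheAndDram
                              : MemoryTarget::SystemBufferOnly;
    }
    return MemoryTarget::DramOnly;
}
```

The rule favours the lower-power second memory only for data that will be read back more than once, which is what keeps the additional storage energy below the preset threshold in the embodiments above.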
Referring to fig. 21, fig. 21 is another schematic structural diagram of an image processing apparatus according to an embodiment of the present disclosure. The image processing apparatus 600 may include: a first obtaining module 601, a second obtaining module 602, a third obtaining module 603, a first determining module 604, a second determining module 605, and a decoding module 606.
A first obtaining module 601, configured to obtain a video code stream;
a second obtaining module 602, configured to obtain one or more reference motion vectors according to the video code stream;
a third obtaining module 603, configured to obtain one or more corresponding reference blocks from one or more image frames of the video bitstream according to the one or more reference motion vectors;
a first determining module 604 for determining a reference number of the one or more reference blocks;
a second determining module 605, configured to determine, from the one or more reference blocks according to a preset power consumption threshold and the reference times of the one or more reference blocks, one or more reference blocks that need to be stored in a preset memory, and to store the one or more reference blocks in the preset memory, such that the power consumption generated by storing the reference blocks in the preset memory and reading the reference blocks from the preset memory is less than or equal to the preset power consumption threshold; and
a decoding module 606, configured to decode the block to be decoded or the sub-block of the block to be decoded according to the reference block.
In one embodiment, the second obtaining module 602 may be configured to:
entropy decoding the video code stream to obtain one or more motion vector difference values;
and acquiring the one or more reference motion vectors according to the one or more motion vector difference values and the corresponding motion vector predicted values.
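As a concrete illustration of this step, each reference motion vector can be reconstructed by adding an entropy-decoded motion vector difference to its corresponding predictor. The short C++ sketch below assumes a simple integer motion-vector representation; the struct and function names are placeholders rather than the apparatus's actual interfaces.

```cpp
#include <cstddef>
#include <vector>

struct MotionVector { int x; int y; };

// mv = mvp + mvd, applied to each entropy-decoded motion vector difference.
std::vector<MotionVector> reconstructMotionVectors(
        const std::vector<MotionVector>& mvds,    // decoded differences
        const std::vector<MotionVector>& mvps) {  // corresponding predictors
    std::vector<MotionVector> mvs;
    const std::size_t n = mvds.size() < mvps.size() ? mvds.size() : mvps.size();
    mvs.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        mvs.push_back({mvps[i].x + mvds[i].x, mvps[i].y + mvds[i].y});
    }
    return mvs;
}
```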
In one embodiment, the preset memory includes a first memory and a second memory, and the power consumption of the first memory is greater than the power consumption of the second memory.
In one embodiment, the first memory comprises a dynamic random access memory disposed outside the video decoding apparatus, the second memory comprises a system cache disposed outside the video decoding apparatus, and the second determining module 605 may be configured to:
if the reference times are multiple times, storing the one or more reference blocks that are referenced multiple times in the system cache and in the dynamic random access memory according to the preset power consumption threshold.
In one embodiment, the second memory comprises a system buffer memory disposed outside the video decoding apparatus, and the second determining module 605 may be configured to:
and if the reference times are multiple times, storing the one or more reference blocks that are referenced multiple times in the system buffer memory according to the preset power consumption threshold.
In one embodiment, the first memory comprises a dynamic random access memory disposed outside the video decoding apparatus, and the second determining module 605 may be configured to:
and if the reference times are one time, storing the one or more reference blocks that are referenced only once in the dynamic random access memory according to the preset power consumption threshold.
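Taken together, these placement embodiments amount to choosing, per reference block, the memory whose combined write-and-read energy stays within the preset power consumption threshold. The following C++ sketch shows one way such a decision might be expressed; the per-access energy constants are hypothetical and only encode the assumption that the first memory (DRAM) consumes more power than the second memory (system cache or system buffer).

```cpp
#include <cstddef>

enum class Placement { Dram, SystemCache, SystemBuffer };

// Hypothetical per-byte energy costs (arbitrary units). The only property
// taken from the description is that DRAM accesses cost more than accesses
// to the system cache or system buffer.
constexpr double kDramCostPerByte   = 4.0;
constexpr double kCacheCostPerByte  = 1.0;
constexpr double kBufferCostPerByte = 1.0;

// Estimated energy of writing the block once and reading it refCount times.
double estimateEnergy(Placement p, std::size_t bytes, int refCount) {
    const double perByte = (p == Placement::Dram)        ? kDramCostPerByte
                         : (p == Placement::SystemCache) ? kCacheCostPerByte
                                                         : kBufferCostPerByte;
    return perByte * static_cast<double>(bytes) * (1 + refCount);
}

// Placement rule sketched from the embodiments: blocks referenced multiple
// times go to the low-power second memory when that keeps the estimated
// energy under the preset threshold; blocks referenced once stay in DRAM.
Placement choosePlacement(std::size_t bytes, int refCount, bool hasSystemCache,
                          double powerThreshold) {
    if (refCount > 1) {
        const Placement second = hasSystemCache ? Placement::SystemCache
                                                : Placement::SystemBuffer;
        if (estimateEnergy(second, bytes, refCount) <= powerThreshold) {
            return second;
        }
    }
    return Placement::Dram;
}
```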
In one embodiment, the decoding module 606 may be configured to:
reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block;
if the reference times of the block to be decoded or of the sub-blocks in the block to be decoded are multiple times, storing the decoded block or the decoded sub-blocks in the system cache and in the dynamic random access memory.
In one embodiment, the decoding module 606 may be configured to:
reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block;
if the reference times of the block to be decoded or of the sub-blocks in the block to be decoded are multiple times, storing the decoded block or the decoded sub-blocks in the system buffer memory.
In one embodiment, the decoding module 606 may be configured to:
reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block;
if the reference times of the block to be decoded or of the sub-blocks in the block to be decoded are one time, storing the decoded block or the decoded sub-blocks in the dynamic random access memory.
In an embodiment, the second obtaining module 602 may be configured to:
entropy decoding the video code stream to obtain one or more motion vector difference values and a quantized first residual;
the decoding module 606 may be configured to:
performing inverse quantization and inverse transformation on the first residual to obtain a second residual;
obtaining a predicted value of the block to be decoded or of a sub-block in the block to be decoded according to the reference motion vector and the reference block;
and obtaining the decoded block or the decoded sub-block according to the second residual and the predicted value of the block to be decoded or of the sub-block in the block to be decoded.
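A minimal C++ sketch of this reconstruction step is given below: the quantized residual is inverse-quantized (a real decoder would also apply the codec-specific inverse transform, which is omitted here), and the result is added to the prediction obtained from the reference block, with clipping to the 8-bit sample range. The flat buffer layout and the scalar dequantization are simplifying assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

using Residual = std::vector<int>;      // one residual value per sample
using Pixels   = std::vector<uint8_t>;  // one 8-bit sample per position

// Inverse quantization: scale each quantized coefficient back by the
// quantization step. (The codec's inverse transform is omitted here.)
Residual inverseQuantize(const Residual& quantized, int qstep) {
    Residual out(quantized.size());
    for (std::size_t i = 0; i < quantized.size(); ++i) {
        out[i] = quantized[i] * qstep;
    }
    return out;
}

// Reconstruction: predicted sample (from the reference block) plus residual,
// clipped to the valid 8-bit range.
Pixels reconstructBlock(const Pixels& prediction, const Residual& residual) {
    Pixels out(prediction.size());
    for (std::size_t i = 0; i < prediction.size() && i < residual.size(); ++i) {
        const int value = static_cast<int>(prediction[i]) + residual[i];
        out[i] = static_cast<uint8_t>(std::clamp(value, 0, 255));
    }
    return out;
}
```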
In one embodiment, the decoding module 606 may be configured to:
and obtaining the decoded data of the video stream according to the decoded block of the block to be decoded or the decoded sub-block of the sub-block in the block to be decoded.
In one embodiment, the predicted value of the block to be decoded or the sub-block in the block to be decoded is obtained by an intra prediction mode or a motion compensation mode.
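For illustration only, the choice between the two prediction paths could be sketched as below; both prediction functions are placeholders standing in for the codec's actual intra prediction and motion compensation.

```cpp
#include <cstdint>
#include <vector>

using Pixels = std::vector<uint8_t>;

enum class PredictionMode { Intra, MotionCompensation };

// Placeholder prediction paths; a real decoder derives these from the
// reconstructed neighbouring samples and the motion-compensated reference.
Pixels intraPredict(const Pixels& neighbourSamples) { return neighbourSamples; }
Pixels motionCompensate(const Pixels& referenceBlock) { return referenceBlock; }

// The predicted value of the block (or sub-block) comes from either path.
Pixels predictBlock(PredictionMode mode, const Pixels& neighbourSamples,
                    const Pixels& referenceBlock) {
    return mode == PredictionMode::Intra ? intraPredict(neighbourSamples)
                                         : motionCompensate(referenceBlock);
}
```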
An embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, which, when executed on a computer, causes the computer to execute the flow in the method for image processing in a video decoding apparatus as provided in the present embodiment.
An embodiment of the present application further provides an electronic device, which includes a memory, a processor, and a video decoding apparatus, where the processor is configured to execute a flow in the method for processing an image in the video decoding apparatus, which is provided in this embodiment, by calling a computer program stored in the memory.
For example, the electronic device may be a mobile terminal such as a tablet computer or a smart phone. Referring to fig. 22, fig. 22 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
The electronic device 700 may comprise a video decoding apparatus 701, a memory 702, a processor 703, and the like. Those skilled in the art will appreciate that the electronic device structure shown in fig. 22 does not constitute a limitation of the electronic device, which may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
The video decoding apparatus 701 may be used to decode the encoded video image to restore the original video image.
The memory 702 may be used to store applications and data. The memory 702 stores applications containing executable code. The application programs may constitute various functional modules. The processor 703 executes various functional applications and data processing by running an application program stored in the memory 702.
The processor 703 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the application programs stored in the memory 702 and calling the data stored in the memory 702, thereby monitoring the electronic device as a whole.
In this embodiment, the processor 703 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 702 according to the following instructions, and the processor 703 runs the application programs stored in the memory 702, so as to execute:
acquiring a video code stream;
determining one or more reference positions from the video code stream;
determining a reference number of times of the one or more reference positions;
determining, from the one or more reference positions according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory, and storing the reference position in the preset memory, such that the power consumption generated by storing the reference position in the preset memory and reading it from the preset memory is less than or equal to the preset power consumption threshold; and
decoding the object to be decoded according to the reference position; or performing:
acquiring a video code stream;
acquiring one or more reference motion vectors according to the video code stream;
acquiring one or more corresponding reference blocks from one or more image frames of the video code stream according to the one or more reference motion vectors;
determining a reference number of times of the one or more reference blocks;
determining, from the one or more reference blocks according to a preset power consumption threshold and the reference times of the one or more reference blocks, one or more reference blocks that need to be stored in a preset memory, and storing the one or more reference blocks in the preset memory, such that the power consumption generated by storing the reference blocks in the preset memory and reading them from the preset memory is less than or equal to the preset power consumption threshold; and
and decoding the block to be decoded or the sub-blocks of the block to be decoded according to the reference block.
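For orientation, the second flow listed above can be summarized as the C++ pseudo-driver below. Every type and function in it is a named stub standing in for one of the listed steps, not an API of the actual device; only the ordering of the steps is taken from the description.

```cpp
#include <cstdint>
#include <vector>

// Placeholder data types; the real decoder operates on codec-defined
// syntax structures rather than raw byte vectors.
struct MotionVector   { int x; int y; };
struct ReferenceBlock { std::vector<uint8_t> pixels; int refCount = 0; };
struct DecodedBlock   { std::vector<uint8_t> pixels; };

// Stub steps standing in for the operations listed above.
std::vector<MotionVector> getReferenceMotionVectors(const std::vector<uint8_t>&) { return {}; }
std::vector<ReferenceBlock> getReferenceBlocks(const std::vector<MotionVector>&) { return {}; }
void placeInPresetMemory(const ReferenceBlock&, double /*powerThreshold*/) {}
DecodedBlock decodeWithReference(const ReferenceBlock&) { return {}; }

// Overall flow: bitstream -> reference motion vectors -> reference blocks ->
// power-aware placement in the preset memory -> block/sub-block decoding.
std::vector<DecodedBlock> decodeFlow(const std::vector<uint8_t>& bitstream,
                                     double powerThreshold) {
    std::vector<DecodedBlock> decoded;
    const auto motionVectors   = getReferenceMotionVectors(bitstream);
    const auto referenceBlocks = getReferenceBlocks(motionVectors);
    for (const auto& reference : referenceBlocks) {
        placeInPresetMemory(reference, powerThreshold);  // threshold-constrained storage
        decoded.push_back(decodeWithReference(reference));
    }
    return decoded;
}
```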
Referring to fig. 23, the electronic device 700 may include a video decoding apparatus 701, a memory 702, a processor 703, a battery 704, an input unit 705, an output unit 706, and the like.
The video decoding apparatus 701 may be used to decode the encoded video image to restore the original video image.
The memory 702 may be used to store applications and data. The memory 702 stores applications containing executable code. The application programs may constitute various functional modules. The processor 703 executes various functional applications and data processing by running an application program stored in the memory 702.
The processor 703 is the control center of the electronic device. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or executing the application programs stored in the memory 702 and calling the data stored in the memory 702, thereby monitoring the electronic device as a whole.
The battery 704 may be used to provide power support for various components of the electronic device, thereby ensuring proper operation of the various components.
The input unit 705 may be used to receive an encoded input video stream of video images, for example, may be used to receive a video stream that requires video decoding.
The output unit 706 may be used to output the decoded video stream.
In this embodiment, the processor 703 in the electronic device loads the executable code corresponding to the processes of one or more application programs into the memory 702 according to the following instructions, and the processor 703 runs the application programs stored in the memory 702, so as to execute:
acquiring a video code stream;
determining one or more reference positions from the video code stream;
determining a reference number of times of the one or more reference locations;
determining, from the one or more reference positions according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory, and storing the reference position in the preset memory, such that the power consumption generated by storing the reference position in the preset memory and reading it from the preset memory is less than or equal to the preset power consumption threshold; and
decoding the object to be decoded according to the reference position; or performing:
acquiring a video code stream;
acquiring one or more reference motion vectors according to the video code stream;
acquiring one or more corresponding reference blocks from one or more image frames of the video code stream according to the one or more reference motion vectors;
determining a number of references of the one or more reference blocks;
determining, from the one or more reference blocks according to a preset power consumption threshold and the reference times of the one or more reference blocks, one or more reference blocks that need to be stored in a preset memory, and storing the one or more reference blocks in the preset memory, such that the power consumption generated by storing the reference blocks in the preset memory and reading them from the preset memory is less than or equal to the preset power consumption threshold; and
and decoding the block to be decoded or the sub-blocks of the block to be decoded according to the reference block.
In one embodiment, the reference position includes a reference image frame, a reference slice, or a reference region, and when determining the one or more reference positions from the video code stream, the processor 703 may further perform: determining the one or more reference image frames, reference slices or reference regions from the video code stream according to frame header information of an image frame in the video code stream or slice header information of one or more slices in the image frame.
In one embodiment, the reference position includes a reference image frame, a reference slice, or a reference region, and when determining the reference times of the one or more reference positions, the processor 703 may further perform: determining the reference times of the one or more reference image frames, reference slices or reference regions according to preset parameters, where the preset parameters include any one or more of the following: a network abstraction layer parsing parameter, a slice header parsing parameter, a reference picture list modification parameter, and a reference image frame marking parameter.
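One plausible way to realize this counting, assuming the slice headers have already been parsed (and any reference picture list modification has been applied) so that each slice exposes the pictures it will reference, is to tally how often each picture appears across those lists. The C++ sketch below uses picture order counts as keys; the data layout is an assumption for illustration.

```cpp
#include <map>
#include <vector>

// After the slice headers are parsed and any reference picture list
// modification is applied, each slice exposes the pictures it will
// reference, identified here by picture order count (POC).
using ReferencePictureList = std::vector<int>;

// Tally how many times each reference picture is referenced across the
// slices that are about to be decoded.
std::map<int, int> countReferences(const std::vector<ReferencePictureList>& lists) {
    std::map<int, int> referenceCount;  // POC -> number of references
    for (const ReferencePictureList& list : lists) {
        for (int poc : list) {
            ++referenceCount[poc];
        }
    }
    return referenceCount;
}
```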
In one embodiment, the preset memory includes a first memory and a second memory, and the power consumption of the first memory is greater than the power consumption of the second memory.
In one embodiment, the first memory includes a dynamic random access memory disposed outside the video decoding apparatus, the second memory includes a system cache disposed outside the video decoding apparatus, and the reference position includes a reference image frame, a reference slice, or a reference region; when determining, from the one or more reference positions according to the preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory and storing the reference position in the preset memory, the processor 703 may further perform: if the reference times are multiple times, storing the one or more reference image frames, reference slices or reference regions that are referenced multiple times in the system cache and in the dynamic random access memory according to the preset power consumption threshold.
In one embodiment, the second memory comprises a system buffer memory disposed external to the video decoding device, the reference location comprises a reference image frame, a reference slice, or a reference region; when the processor 703 executes the determining, according to the preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions, and stores the reference position in the preset memory, the following may be further executed: and if the reference times are multiple times, storing one or more reference image frames, reference strips or reference areas with the reference times being multiple times in the system buffer memory according to the preset power consumption threshold.
In one embodiment, the first memory comprises a dynamic random access memory disposed external to the video decoding device, the reference location comprises a reference image frame, a reference slice, or a reference region; when the processor 703 executes the determining, according to the preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions, and stores the reference position in the preset memory, the following may be further executed: and if the reference times are once, storing one or more reference image frames, reference strips or reference areas with the reference times being once in the dynamic random access memory according to the preset power consumption threshold.
In one embodiment, the object to be decoded includes a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded; when decoding the object to be decoded according to the reference position, the processor 703 may further perform: reading image data of a required reference image frame, reference slice or reference region from the preset memory; if image data of the reference image frame is read, decoding the block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of the reference slice is read, decoding the block to be decoded in the slice to be decoded according to the read image data of the reference slice; and if image data of the reference region is read, decoding the block to be decoded in the region to be decoded according to the read image data of the reference region; and, if the reference times of the image frame to be decoded, the slice to be decoded or the region to be decoded are multiple times, storing the decoded block obtained from the block to be decoded in the system cache and in the dynamic random access memory.
In one embodiment, the object to be decoded includes a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded; when decoding the object to be decoded according to the reference position, the processor 703 may further perform: reading image data of a required reference image frame, reference slice or reference region from the preset memory; if image data of the reference image frame is read, decoding the block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of the reference slice is read, decoding the block to be decoded in the slice to be decoded according to the read image data of the reference slice; and if image data of the reference region is read, decoding the block to be decoded in the region to be decoded according to the read image data of the reference region; and, if the reference times of the image frame to be decoded, the slice to be decoded or the region to be decoded are multiple times, storing the decoded block obtained from the block to be decoded in the system buffer memory.
In one embodiment, the object to be decoded includes a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded; when decoding the object to be decoded according to the reference position, the processor 703 may further perform: reading image data of a required reference image frame, reference slice or reference region from the preset memory; if image data of the reference image frame is read, decoding the block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of the reference slice is read, decoding the block to be decoded in the slice to be decoded according to the read image data of the reference slice; and if image data of the reference region is read, decoding the block to be decoded in the region to be decoded according to the read image data of the reference region; and, if the reference times of the image frame to be decoded, the slice to be decoded or the region to be decoded are one time, storing the decoded block obtained from the block to be decoded in the dynamic random access memory.
In an embodiment, when the processor 703 executes the obtaining of the one or more reference motion vectors according to the video bitstream, it may further execute: entropy decoding the video code stream to obtain one or more motion vector difference values; and acquiring the one or more reference motion vectors according to the one or more motion vector difference values and the corresponding motion vector predicted values.
In one embodiment, the preset memory includes a first memory and a second memory, and the power consumption of the first memory is greater than the power consumption of the second memory.
In one embodiment, the first memory includes a dynamic random access memory disposed outside the video decoding apparatus, the second memory includes a system cache disposed outside the video decoding apparatus, and the processor 703, when determining, from the one or more reference blocks, one or more reference blocks that need to be stored in the preset memory according to the preset power consumption threshold and the reference times of the one or more reference blocks, may further perform: if the reference times are multiple times, storing one or more reference blocks with the multiple reference times in the system cache and in the dynamic random access memory according to the preset power consumption threshold.
In an embodiment, the second memory includes a system buffer memory disposed outside the video decoding apparatus, and when determining, from the one or more reference blocks according to the preset power consumption threshold and the reference times of the one or more reference blocks, the one or more reference blocks that need to be stored in the preset memory, the processor 703 may further perform: if the reference times are multiple times, storing the one or more reference blocks that are referenced multiple times in the system buffer memory according to the preset power consumption threshold.
In an embodiment, the first memory includes a dynamic random access memory disposed outside the video decoding apparatus, and the processor 703, when determining, from the one or more reference blocks, one or more reference blocks that need to be stored in the preset memory according to the preset power consumption threshold and the reference times of the one or more reference blocks, may further perform: and if the reference times are once, storing one or more reference blocks with the reference times of once in the dynamic random access memory according to the preset power consumption threshold.
In one embodiment, when the processor 703 performs the decoding of the block to be decoded or the sub-blocks of the block to be decoded according to the reference block, the following may be further performed: reading image data of a required reference block from the preset memory, and decoding the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block; and, if the reference times of the block to be decoded or of the sub-blocks in the block to be decoded are multiple times, storing the decoded block or the decoded sub-blocks in the system cache and in the dynamic random access memory.
In one embodiment, when the processor 703 performs the decoding of the block to be decoded or the sub-blocks in the block to be decoded according to the reference block, the following may be further performed: reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block; and if the reference times of the block to be decoded or the sub-blocks in the block to be decoded are multiple times, storing the block decoded by the block to be decoded or the sub-blocks decoded by the sub-blocks in the block to be decoded in the system buffer memory.
In one embodiment, when the processor 703 performs the decoding of the block to be decoded or the sub-blocks in the block to be decoded according to the reference block, the following may be further performed: reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block; and if the reference times of the block to be decoded or the sub-blocks in the block to be decoded is one time, storing the block decoded by the block to be decoded or the sub-blocks decoded by the sub-blocks in the block to be decoded in the dynamic random access memory.
In an embodiment, when the processor 703 performs the entropy decoding on the video code stream to obtain one or more motion vector difference values, it may further perform: and entropy decoding the video code stream to obtain one or more motion vector difference values and a quantized first residual error.
When the processor 703 executes the decoding of the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block, the processor may further execute: carrying out inverse quantization and inverse transformation on the first residual error to obtain a second residual error; obtaining a predicted value of the block to be decoded or a sub-block in the block to be decoded according to the reference motion vector and the reference block; and acquiring the block to be decoded or the sub-block decoded by the sub-block in the block to be decoded according to the second residual and the predicted value of the block to be decoded or the sub-block in the block to be decoded.
In one embodiment, when the processor 703 executes the decoding of the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block, the following may also be executed: and acquiring video stream decoding data according to the decoded block of the block to be decoded or the decoded subblock of the subblock in the block to be decoded.
In one embodiment, the predicted value of the block to be decoded or the sub-block in the block to be decoded is obtained by an intra prediction mode or a motion compensation mode.
Referring to fig. 24 and fig. 25, fig. 24 is a schematic structural diagram of an image processing system according to an embodiment of the present application, and fig. 25 is another schematic structural diagram of an image processing system according to an embodiment of the present application. The image processing system 800 includes a video decoding apparatus 801, a first memory 802 and a second memory 803, where the power consumption of the first memory 802 is greater than the power consumption of the second memory 803; for example, the first memory 802 may be a DRAM and the second memory 803 may be a Sys$ or a SysBuf. Reference positions whose reference times are one time (or, alternatively, reference positions whose reference times are one or more times) are stored in the first memory 802, and reference positions whose reference times are multiple times are stored in the second memory 803. It should be noted that the reference times are the number of times a reference position is referred to by an object to be decoded.
In one embodiment, the reference position may include a reference image frame, a reference slice, a reference region, or a reference block. For example, the first memory 802 may store reference image frames, reference slices, reference regions or reference blocks whose reference times are one time, or may store those whose reference times are one or more times, while the second memory 803 may store reference image frames, reference slices, reference regions or reference blocks whose reference times are multiple times.
When decoding, the video decoding apparatus 801 reads reference positions whose reference times are one time from the first memory 802, reads reference positions whose reference times are multiple times from the second memory 803, and decodes the object to be decoded according to these reference positions. The object to be decoded may include a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, a block to be decoded in a region to be decoded, or a sub-block of the block to be decoded.
For example, when decoding, the video decoding apparatus 801 may read a reference image frame, a reference slice, a reference region or a reference block whose reference times are one time from the first memory 802, read a reference image frame, a reference slice, a reference region or a reference block whose reference times are multiple times from the second memory 803, and then decode a block to be decoded in an image frame to be decoded according to the reference image frame, decode a block to be decoded in a slice to be decoded according to the reference slice, decode a block to be decoded in a region to be decoded according to the reference region, or decode a block to be decoded or a sub-block of the block to be decoded according to the reference block.
For example, the second memory 803 may be a Sys$. When the reference times of a reference image frame, reference slice, reference region or reference block to be referenced are one time, its image data is read directly from the first memory 802; when its reference times are multiple times, the image data may be read from the first memory 802 once and the remaining reads may be served by the Sys$.
It is understood that the number of reads from the Sys$ may be greater than, less than, or equal to the number of reads from the first memory 802; the respective numbers of reads from the first memory 802 and from the Sys$ are set according to the specific scenario, which is not specifically limited in the embodiments of the present application.
For another example, the second memory 803 may be a SysBuf. When the reference times of a reference image frame, reference slice, reference region or reference block to be referenced are one time, its image data is read directly from the first memory 802; when its reference times are multiple times, the image data may be read from the SysBuf.
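The two read policies just described (a first read from DRAM with subsequent reads served by the Sys$, versus multi-reference data read directly from the SysBuf) can be sketched in C++ as follows. The cache model is deliberately simplified and all identifiers are illustrative.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

using BlockId = int;
using Pixels  = std::vector<uint8_t>;

struct Dram { std::unordered_map<BlockId, Pixels> data; };

// Simplified second memory: either a system cache (Sys$) that is filled on
// the first miss, or a system buffer (SysBuf) already holding the
// multi-reference data.
struct SecondMemory {
    bool isSysCache = true;  // false means SysBuf
    std::unordered_map<BlockId, Pixels> data;
};

// Read path sketched from the description:
// - referenced once: always read from DRAM (the first memory);
// - referenced multiple times with Sys$: first read from DRAM, later reads
//   hit the cache; with SysBuf: read directly from the buffer.
Pixels readReference(BlockId id, int refCount, Dram& dram, SecondMemory& second) {
    if (refCount <= 1) {
        return dram.data.at(id);
    }
    if (second.isSysCache) {
        auto hit = second.data.find(id);
        if (hit != second.data.end()) {
            return hit->second;            // subsequent reads served by Sys$
        }
        Pixels pixels = dram.data.at(id);  // first read comes from DRAM
        second.data[id] = pixels;          // fill the cache
        return pixels;
    }
    return second.data.at(id);             // SysBuf already holds the data
}
```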
In the above embodiments, the descriptions of the embodiments have respective emphasis, and parts that are not described in detail in a certain embodiment may refer to the above detailed description of the method for processing an image in a video decoding apparatus, and are not described herein again.
The image processing apparatus provided in the embodiment of the present application and the method for performing image processing in the video decoding apparatus in the foregoing embodiments belong to the same concept, and any method provided in the method for performing image processing in the video decoding apparatus may be run on the image processing apparatus, and a specific implementation process thereof is described in detail in the method for performing image processing in the video decoding apparatus, and is not described again here.
It should be noted that, with respect to the method for performing image processing in a video decoding apparatus according to the embodiments of the present application, those skilled in the art will understand that all or part of the process of implementing the method can be completed by controlling the relevant hardware through a computer program. The computer program can be stored in a computer-readable storage medium, such as a memory, and executed by at least one processor, and the execution process can include the process of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), a random access memory (RAM), or the like.
In the image processing apparatus according to the embodiments of the present application, each functional module may be integrated into one processing chip, each module may exist alone physically, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as a stand-alone product, it may also be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disk.
The foregoing has described in detail a method, an apparatus, a storage medium, an electronic device, and a system for processing an image in a video decoding apparatus according to embodiments of the present application. Specific examples are applied herein to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only intended to help understand the method and its core ideas. Meanwhile, those skilled in the art may make changes to the specific implementations and the application scope according to the ideas of the present application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (28)

1. A method of image processing in a video decoding device, the method comprising:
acquiring a video code stream;
determining one or more reference positions from the video code stream;
determining a reference number of times of the one or more reference positions;
determining, from the one or more reference positions according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory, and storing the reference position in the preset memory, wherein the power consumption generated by storing the reference position in the preset memory and reading the reference position from the preset memory is less than or equal to the preset power consumption threshold; and
and decoding the object to be decoded according to the reference position.
2. The method of claim 1, wherein the reference location comprises a reference image frame, a reference slice, or a reference region, and wherein determining one or more reference locations from the video bitstream comprises:
and determining one or more reference image frames, reference strips or reference areas from the video code stream according to frame header information of the image frames in the video code stream or strip header information of one or more strips in the image frames.
3. The method of claim 1, wherein the reference position comprises a reference image frame, a reference slice or a reference region, and wherein determining the reference times of the one or more reference positions comprises:
determining a number of references to the one or more reference image frames, reference slices or reference regions according to preset parameters, the preset parameters including any one or more of the following: a network abstraction layer parsing parameter, a slice header parsing parameter, a reference picture list modification parameter, and a reference image frame marking parameter.
4. The method of claim 1, wherein the predetermined memory comprises a first memory and a second memory, and wherein a power consumption of the first memory is greater than a power consumption of the second memory.
5. The method of claim 4, wherein the first memory comprises a dynamic random access memory disposed outside of the video decoding apparatus, the second memory comprises a system cache disposed outside of the video decoding apparatus, and the reference location comprises a reference image frame, a reference slice, or a reference region;
the determining, according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions and storing the reference position in the preset memory includes:
and if the reference times are multiple times, storing the one or more reference image frames, reference slices or reference regions whose reference times are multiple times in the system cache and in the dynamic random access memory according to the preset power consumption threshold.
6. The method of claim 4, wherein the second memory comprises a system buffer memory disposed outside the video decoding apparatus, and the reference location comprises a reference image frame, a reference slice, or a reference region;
the determining, according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions and storing the reference position in the preset memory includes:
and if the reference times are multiple times, storing one or more reference image frames, reference strips or reference areas with the reference times being multiple times in the system buffer memory according to the preset power consumption threshold.
7. The method of claim 4, wherein the first memory comprises a dynamic random access memory disposed outside the video decoding apparatus, and the reference position comprises a reference image frame, a reference slice, or a reference region;
the determining, according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory from the one or more reference positions and storing the reference position in the preset memory includes:
and if the reference times are one time, storing one or more reference image frames, reference strips or reference areas with the reference times being one time in the dynamic random access memory according to the preset power consumption threshold.
8. The method of claim 5, wherein the object to be decoded comprises a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded, the object to be decoded is decoded according to the reference position, comprising:
reading image data of a required reference image frame, reference slice or reference region from the preset memory; if image data of the reference image frame is read, decoding the block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of the reference slice is read, decoding the block to be decoded in the slice to be decoded according to the read image data of the reference slice; and if image data of the reference region is read, decoding the block to be decoded in the region to be decoded according to the read image data of the reference region;
if the reference times of the image frame to be decoded, the slice to be decoded or the region to be decoded are multiple times, storing the decoded block obtained from the block to be decoded in the system cache and in the dynamic random access memory.
9. The method of claim 6, wherein the object to be decoded comprises a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded, the object to be decoded is decoded according to the reference position, comprising:
reading image data of a required reference image frame, reference slice or reference region from the preset memory; if image data of the reference image frame is read, decoding the block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of the reference slice is read, decoding the block to be decoded in the slice to be decoded according to the read image data of the reference slice; and if image data of the reference region is read, decoding the block to be decoded in the region to be decoded according to the read image data of the reference region;
and if the reference times of the image frame to be decoded, the slice to be decoded or the region to be decoded are multiple times, storing the decoded block obtained from the block to be decoded in the system buffer memory.
10. The method of claim 7, wherein the object to be decoded comprises a block to be decoded in an image frame to be decoded, a block to be decoded in a slice to be decoded, or a block to be decoded in a region to be decoded, the object to be decoded is decoded according to the reference position, comprising:
reading image data of a required reference image frame, reference slice or reference region from the preset memory; if image data of the reference image frame is read, decoding the block to be decoded in the image frame to be decoded according to the read image data of the reference image frame; if image data of the reference slice is read, decoding the block to be decoded in the slice to be decoded according to the read image data of the reference slice; and if image data of the reference region is read, decoding the block to be decoded in the region to be decoded according to the read image data of the reference region;
and if the reference times of the image frame to be decoded, the slice to be decoded or the region to be decoded are one time, storing the decoded block obtained from the block to be decoded in the dynamic random access memory.
11. A method of image processing in a video decoding device, the method comprising:
acquiring a video code stream;
acquiring one or more reference motion vectors according to the video code stream;
acquiring one or more corresponding reference blocks from one or more image frames of the video code stream according to the one or more reference motion vectors;
determining a reference number of times of the one or more reference blocks;
determining, from the one or more reference blocks according to a preset power consumption threshold and the reference times of the one or more reference blocks, one or more reference blocks that need to be stored in a preset memory, and storing the one or more reference blocks in the preset memory, wherein the power consumption generated by storing the reference blocks in the preset memory and reading the reference blocks from the preset memory is less than or equal to the preset power consumption threshold; and
and decoding the block to be decoded or the sub-blocks in the block to be decoded according to the reference block.
12. The method of claim 11, wherein the obtaining one or more reference motion vectors from the video bitstream comprises:
entropy decoding the video code stream to obtain one or more motion vector difference values;
and acquiring the one or more reference motion vectors according to the one or more motion vector difference values and the corresponding motion vector predicted values.
13. The method of claim 12, wherein the predetermined memory comprises a first memory and a second memory, and wherein a power consumption of the first memory is greater than a power consumption of the second memory.
14. The method of claim 13, wherein the first memory comprises a Dynamic Random Access Memory (DRAM) disposed outside the video decoding apparatus, the second memory comprises a system cache disposed outside the video decoding apparatus, and the determining the one or more reference blocks to be stored in the predetermined memory from the one or more reference blocks according to the predetermined power consumption threshold and the reference times of the one or more reference blocks comprises:
if the reference times are multiple times, storing one or more reference blocks with the multiple reference times in the system cache and in the dynamic random access memory according to the preset power consumption threshold.
15. The method of claim 13, wherein the second memory comprises a system buffer memory disposed outside the video decoding apparatus, and wherein the determining, from the one or more reference blocks, the one or more reference blocks that need to be stored in the preset memory according to the preset power consumption threshold and the reference times of the one or more reference blocks comprises:
and if the reference times are multiple times, storing one or more reference blocks whose reference times are multiple times in the system buffer memory according to the preset power consumption threshold.
16. The method of claim 13, wherein the first memory comprises a Dynamic Random Access Memory (DRAM) disposed outside the video decoding apparatus, and the determining the one or more reference blocks to be stored in the predetermined memory from the one or more reference blocks according to the predetermined power consumption threshold and the reference times of the one or more reference blocks comprises:
and if the reference times are once, storing one or more reference blocks with the reference times of once in the dynamic random access memory according to the preset power consumption threshold.
17. The method of claim 14, wherein decoding the block to be decoded or the sub-blocks in the block to be decoded according to the reference block comprises:
reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block;
if the reference times of the block to be decoded or the sub-blocks in the block to be decoded are multiple times, storing the decoded block or the decoded sub-blocks in the system cache and in the dynamic random access memory.
18. The method of claim 15, wherein said decoding a block to be decoded or sub-blocks within the block to be decoded according to the reference block comprises:
reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block;
if the reference times of the block to be decoded or the sub-blocks in the block to be decoded are multiple times, storing the decoded block or the decoded sub-blocks in the system buffer memory.
19. The method of claim 16, wherein the decoding a block to be decoded or sub-blocks within the block to be decoded according to the reference block comprises:
reading image data of a required reference block from the preset memory, and decoding the block to be decoded or sub-blocks in the block to be decoded according to the read image data of the reference block;
if the reference times of the block to be decoded or the sub-blocks in the block to be decoded are one time, storing the decoded block or the decoded sub-blocks in the dynamic random access memory.
20. The method of any of claims 17 to 19, wherein entropy decoding the video bitstream to obtain one or more motion vector difference values comprises:
entropy decoding the video code stream to obtain one or more motion vector difference values and a quantized first residual error;
the decoding the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block includes:
carrying out inverse quantization and inverse transformation on the first residual error to obtain a second residual error;
obtaining a predicted value of the block to be decoded or a sub-block in the block to be decoded according to the reference motion vector and a reference block;
and acquiring the block to be decoded or the sub-block decoded by the sub-block in the block to be decoded according to the second residual and the predicted value of the block to be decoded or the sub-block in the block to be decoded.
21. The method of claim 20, wherein the decoding the block to be decoded or the sub-blocks in the block to be decoded according to the read image data of the reference block, further comprises:
and acquiring video stream decoding data according to the decoded block of the block to be decoded or the decoded subblock of the subblock in the block to be decoded.
22. The method of claim 20, wherein the prediction value of the block to be decoded or the sub-block in the block to be decoded is obtained by an intra prediction method or a motion compensation method.
23. An apparatus for performing image processing in a video decoding apparatus, the apparatus comprising:
the acquisition module is used for acquiring a video code stream;
the first determining module is used for determining one or more reference positions from the video code stream;
a second determination module for determining a reference number of times of the one or more reference positions;
a third determining module, configured to determine, from the one or more reference positions according to a preset power consumption threshold and the reference times of the one or more reference positions, a reference position that needs to be stored in a preset memory, and to store the reference position in the preset memory, wherein the power consumption generated by storing the reference position in the preset memory and reading the reference position from the preset memory is less than or equal to the preset power consumption threshold; and
and the decoding module is used for decoding the object to be decoded according to the reference position.
24. An apparatus for performing image processing in a video decoding apparatus, the apparatus comprising:
the first acquisition module is used for acquiring a video code stream;
the second acquisition module is used for acquiring one or more reference motion vectors according to the video code stream;
a third obtaining module, configured to obtain one or more corresponding reference blocks from one or more image frames of the video bitstream according to the one or more reference motion vectors;
a first determining module for determining a reference number of times of the one or more reference blocks;
a second determining module, configured to determine, from the one or more reference blocks according to a preset power consumption threshold and the reference times of the one or more reference blocks, one or more reference blocks that need to be stored in a preset memory, and to store the one or more reference blocks in the preset memory, wherein the power consumption generated by storing the reference blocks in the preset memory and reading the reference blocks from the preset memory is less than or equal to the preset power consumption threshold; and
and the decoding module is used for decoding the block to be decoded or the sub-blocks of the block to be decoded according to the reference block.
25. A computer-readable storage medium, on which a computer program is stored, which, when executed on a computer, causes the computer to carry out the method according to any one of claims 1 to 10 or 11 to 22.
26. An electronic device comprising a memory, a processor and a video decoding apparatus, wherein the processor executes the method of any one of claims 1 to 10 or 11 to 22 by calling a computer program stored in the memory.
27. An image processing system, characterized by comprising a video decoding apparatus, a first memory and a second memory, wherein the power consumption of the first memory is greater than the power consumption of the second memory; the first memory stores a reference position whose reference times are one time, or stores a reference position whose reference times are one or more times, and the second memory stores a reference position whose reference times are multiple times; when decoding, the video decoding apparatus reads the reference position whose reference times are one time from the first memory, reads the reference position whose reference times are multiple times from the second memory, and decodes an object to be decoded according to the reference positions.
28. The image processing system of claim 27, wherein the reference location comprises a reference image frame, a reference slice, a reference region, or a reference block, and wherein the object to be decoded comprises a block to be decoded in the image frame to be decoded, a block to be decoded in the slice to be decoded, a block to be decoded in the region to be decoded, or a sub-block of the block to be decoded.
CN202110357601.8A 2021-04-01 2021-04-01 Method, device and system for processing image in video decoding device Pending CN115190302A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110357601.8A CN115190302A (en) 2021-04-01 2021-04-01 Method, device and system for processing image in video decoding device
PCT/CN2022/076367 WO2022206199A1 (en) 2021-04-01 2022-02-15 Method and apparatus for performing image processing in video decoding apparatus, and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110357601.8A CN115190302A (en) 2021-04-01 2021-04-01 Method, device and system for processing image in video decoding device

Publications (1)

Publication Number Publication Date
CN115190302A true CN115190302A (en) 2022-10-14

Family

ID=83455566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110357601.8A Pending CN115190302A (en) 2021-04-01 2021-04-01 Method, device and system for processing image in video decoding device

Country Status (2)

Country Link
CN (1) CN115190302A (en)
WO (1) WO2022206199A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013001796A1 (en) * 2011-06-30 2013-01-03 Panasonic Corporation Methods and apparatuses for encoding video using adaptive memory management scheme for reference pictures
WO2014059613A1 (en) * 2012-10-17 2014-04-24 华为技术有限公司 Method for reducing consumption of memory system and memory controller
KR101610725B1 (en) * 2014-09-23 2016-04-08 삼성전자주식회사 Method and apparatus for video stream encoding to control reference image data according to reference frequency, method and apparatus for video stream decoding to control reference image data according to reference frequency
CN105849707B (en) * 2014-11-28 2019-12-17 华为技术有限公司 Power consumption control method, device and equipment for multi-level cache

Also Published As

Publication number Publication date
WO2022206199A1 (en) 2022-10-06


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination