CN108804508B - Method and system for storing input image - Google Patents

Method and system for storing input image

Info

Publication number
CN108804508B
Authority
CN
China
Prior art keywords
memory
size
input image
bytes
storing
Prior art date
Legal status
Active
Application number
CN201810344898.2A
Other languages
Chinese (zh)
Other versions
CN108804508A (en
Inventor
赵屏
杨志文
王智鸣
Current Assignee
MediaTek Inc
Original Assignee
MediaTek Inc
Priority date
Filing date
Publication date
Priority claimed from US 15/786,908 (published as US20180107616A1)
Application filed by MediaTek Inc filed Critical MediaTek Inc
Publication of CN108804508A
Application granted
Publication of CN108804508B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/423 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation characterised by memory arrangements

Abstract

The invention provides a method and a system for storing an input image. The invention describes allocating one or more frame buffers in a memory. The invention also describes segmenting the input image into a plurality of access units corresponding to a plurality of subsets of the input image and assigning a primary portion and a secondary portion to each of the plurality of access units in the frame buffer, wherein at least one secondary portion is located out of sequence in the frame buffer after its respective primary portion. The invention also describes compressing the access units into compressed access units, storing each compressed access unit into a respective primary portion, and if the size of the compressed access unit exceeds the size of the primary portion, storing the remainder of the compressed access unit into a respective secondary portion. The invention enables compressed access units stored in memory to be accessed efficiently.

Description

Method and system for storing input image
Priority declaration
This application claims priority from U.S. provisional patent application No. 62/489,588, entitled "Memory Access Efficiency Optimization for Frame Buffer Compression," filed on April 25, 2017, and U.S. patent application No. 15/786,908, entitled "Distributed Access Unit for Frame Buffer Compression," filed on October 18, 2017, both of which are incorporated herein by reference in their entirety.
Technical Field
The disclosed embodiments of the present invention relate to storage technology, and more particularly, to a method and system for storing an input image.
Background
The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.
An electronic device, such as a computer system, may include one or more memories. In one example, an electronic device includes a component, such as a central processing unit (CPU), that is located on a different integrated circuit chip than the memory and accesses the memory through a memory controller. Accesses to the memory by the CPU can create heavy data traffic between the CPU and the memory.
Disclosure of Invention
The present invention provides a method and system for storing an input image to efficiently access compressed access units stored in a memory.
Aspects of the present invention provide a method of storing an input image in a memory. The method can comprise the following steps: allocating one or more frame buffers in a memory; dividing the input image into a plurality of access units corresponding to a plurality of subsets of the input image and assigning a main portion and an auxiliary portion to each access unit of the plurality of access units in the frame buffer, wherein at least one auxiliary portion is out of sequence in the frame buffer after its respective main portion; compressing the plurality of access units into a plurality of compressed access units; and storing each compressed access unit in a respective primary portion and, if the size of the compressed access unit exceeds the size of the primary portion, storing the remainder of the compressed access unit in a respective secondary portion.
Aspects of the present invention also provide a system for storing an input image. The system includes a memory, a memory allocation device, and a memory controller. The memory has one or more frame buffers; memory allocation means for receiving an input image, allocating a frame buffer in the memory for storing the input image, dividing the input image into a plurality of access units corresponding to a plurality of subsets of the input image, and allocating a main portion and a sub portion to each access unit in the frame buffer, wherein at least one sub portion is non-sequentially located after its respective main portion in the frame buffer; and a memory controller for storing each compressed access unit into a respective primary portion in response to a plurality of instructions of the memory allocation means and storing the remainder of the compressed access unit into a respective secondary portion if the size of the compressed access unit exceeds the size of the primary portion.
An optional aspect of the invention may provide a non-transitory computer readable medium having computer readable instructions stored thereon which, when executed by a processing circuit, cause the processing circuit to perform a method comprising: allocating one or more frame buffers in a memory;
dividing the input image into a plurality of access units corresponding to a plurality of subsets of the input image and assigning a main portion and an auxiliary portion to each access unit of the plurality of access units in the frame buffer, wherein at least one auxiliary portion is out of sequence in the frame buffer after its respective main portion; compressing the plurality of access units into a plurality of compressed access units; each compressed access unit is stored in a respective primary portion and if the size of the compressed access unit exceeds the size of the primary portion, the remainder of the compressed access unit is stored in a respective secondary portion.
The present invention enables compressed access units to be efficiently accessed by allocating a primary portion and a secondary portion in a memory for each access unit, storing the compressed access units to the primary portion after the access units are compressed, and also storing the remaining portions of the compressed access units to the secondary portion in the case where the size of the primary portion is smaller than the size of the compressed access units.
Drawings
Various embodiments of the present invention, provided as examples, will be described in detail with reference to the following drawings, wherein like reference numerals represent like elements, and wherein:
FIG. 1 is an exemplary block diagram of a memory system according to an embodiment of the invention;
FIG. 2 is an exemplary data structure according to an embodiment of the present invention;
FIG. 3 shows three exemplary superblocks in three frame buffers according to an embodiment of the present invention;
FIG. 4 shows three exemplary superblocks in three frame buffers according to an embodiment of the present invention;
FIG. 5 shows two exemplary superblocks in two frame buffers according to an embodiment of the present invention;
FIG. 6 shows an alternative frame buffer example according to an embodiment of the present invention;
FIG. 7 is a flow chart depicting an exemplary process according to an embodiment of the present invention.
Detailed Description
FIG. 1 illustrates an example block diagram of a memory system 100 in accordance with an embodiment of this disclosure. As shown, memory system 100 may include a memory allocation device 110, a memory controller 120, and a memory 130. The memory 130 may include a frame buffer 131. The memory system 100 is used to divide an input image into one or more access units and to store each compressed access unit into a main portion and a sub-portion allocated in the frame buffer 131 for the respective access unit.
Memory system 100 may be any suitable system for storing data. In one embodiment, the memory system 100 is an electronic device, such as a desktop computer, a tablet computer, a smart phone, a wearable device, a smart TV, a video camera, a camcorder, a media player, and so forth. In an example embodiment, the memory system 100 may also include other components that access data stored in the memory 130. For example, the other components may include a CPU 141, a graphics processing unit (GPU) 142, a multimedia engine 143, a display circuit 144, an image processor 145, a video codec 146, and so forth.
In one embodiment, the memory 130 may have a sequence of memory blocks separated by memory boundaries based on page size or channel partitioning, for example, at every 32 bytes, 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1 kbyte, 2 kbytes, or 4 kbytes. Accessing a certain amount of data stored in a single memory block between two adjacent boundaries is more efficient than accessing the same amount of data stored separately in two memory blocks across a memory boundary. Thus, when the starting address of the data is aligned with a memory boundary, the data in memory 130 may be efficiently accessed by another component of the memory system 100. Memory boundaries are formed at addresses that are multiples of the memory block size. In one embodiment, the memory block size may accommodate an amount of data that can be quickly transferred between the memory 130 and other components of the memory system 100 using a sequence of burst read/write commands and a single or only a few precharge and activate commands. The memory block size may be selected based on characteristics of the memory 130 and of the other components of the memory system 100 that access the memory 130, such as the page size and channel partitioning of the memory 130, as well as the architecture and operating modes of the memory 130 and of those components.
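The alignment rule above can be stated compactly in code. The following is a minimal C sketch (the helper names are ours, not part of the described system): a memory boundary sits at every multiple of the memory block size, and a start address is aligned when it is such a multiple.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal sketch of the alignment rule described above: a memory boundary
 * sits at every multiple of the memory block size. */
static bool is_boundary_aligned(uint64_t addr, uint64_t block_size)
{
    return (addr % block_size) == 0;   /* true when addr starts a memory block */
}

static uint64_t align_up(uint64_t addr, uint64_t block_size)
{
    /* round addr up to the next memory boundary */
    return ((addr + block_size - 1) / block_size) * block_size;
}
```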
The memory allocation device 110 is arranged to receive an input image and to divide it into one or more access units. The memory allocation device 110 is further arranged to allocate a portion of the memory 130, for example the frame buffer 131, to the input image, and to allocate two memory portions, a main portion and a sub-portion, in the frame buffer 131 for each access unit. In one example, the start address of the frame buffer 131 may be aligned to a memory boundary, e.g., at 0 bytes. The memory allocation device 110 is arranged to compress each access unit and store each compressed access unit to its respective main portion and, if the size of the compressed access unit exceeds the size of the main portion, to store the remaining portion of the compressed access unit to its respective sub-portion. In one embodiment, the memory allocation device 110 may be integrated into any component that accesses data stored in the memory 130, such as one or more components of the memory system 100, including the CPU 141, GPU 142, multimedia engine 143, display circuitry 144, image processor 145, video codec 146, and the like.
In one embodiment, the main portion may have a starting address aligned with a memory boundary and a size that is one or more times the size of the memory block. Thus, data stored in the main portion can be efficiently accessed. Alternatively, when the size of the main portion is smaller than the memory block size, each main portion may be located within a respective memory block, and the starting address of one or more main portions may be aligned with one or more memory boundaries.
The size of the sub-portion may be a fraction of the memory block size. Thus, two or more secondary portions may be combined together and stored independently of their respective primary portions. When the size of the compressed access unit is smaller than or equal to the size of the main part, the compressed access unit can be stored completely inside the main part without using the sub part. In this way, accessing the compressed access unit can be done efficiently, since the main part can be accessed efficiently.
In another embodiment, at least one of the sub-portions is not located immediately after its respective main portion in the frame buffer; this includes the case where a main portion has a larger address than its respective sub-portion, i.e., the two appear in reverse order.
Memory controller 120 is used to manage memory accesses from memory allocation device 110 to memory 130. The memory controller 120 may be configured to receive a request from the memory allocation device 110 to store the compressed access units in respective primary and secondary portions of a frame buffer 131 of the memory 130. Based on these requests, memory controller 120 may send commands to memory 130 with instructions to store the compressed access units to the respective primary and secondary portions of frame buffer 131. Memory controller 120 may also be used to schedule and cache such requests, etc.
Memory 130 may be any suitable device for storing data. In one embodiment, memory 130 includes dynamic random access memory (DRAM) type memory modules, such as double data rate synchronous DRAM (DDR SDRAM), DDR2 SDRAM, DDR3 SDRAM, DDR4 SDRAM, low-power DDR SDRAM (LPDDR SDRAM), and the like.
In one embodiment, the memory system 100 may be a system-on-chip (SOC) in which all components are located on a single monolithic integrated circuit (IC) chip. In addition, other components such as the CPU 141, GPU 142, multimedia engine 143, display circuitry 144, image processor 145, and video codec 146 may also be included on the same single IC chip. Alternatively, components in memory system 100 may be distributed across several ICs. For example, memory allocation device 110, memory controller 120, memory 130, and other components of memory system 100 may be located on multiple IC chips. Additionally, memory allocation device 110 may be integrated into any component that accesses data stored in memory 130, such as one or more components of memory system 100, including CPU 141, GPU 142, multimedia engine 143, display circuitry 144, image processor 145, video codec 146, and so forth.
During operation, an input image may be received by the memory allocation device 110. The memory allocation device 110 may divide the input image into one or more access units. In addition, the memory allocation device 110 may allocate a portion of the memory 130, such as the frame buffer 131, to the input image. Two memory portions, a main portion and a sub portion, are allocated to each access unit in the frame buffer 131. Under the instruction of the memory allocation device 110, the memory controller 120 may store the compressed access units to their respective primary portions, as well as the secondary portions as the case may be. The main portion may have a starting address aligned with a memory boundary and a size that is one or more times the size of the memory block. The size of the sub-portion may be a fraction of the memory block size. Thus, two or more secondary portions may be combined together and stored independently of their respective primary portions. When the size of the compressed access unit is smaller than or equal to the size of the main part, the compressed access unit can be stored completely inside the main part without using the sub part. In this way, accessing the compressed access unit can be done efficiently.
Fig. 2 is an exemplary data structure 200 showing an input image 210 divided into access units, a frame buffer 231A, and a frame buffer 231B, according to an embodiment of the present invention. As shown, the input image 210 may be partitioned into an N × M array of access units. Inside the array, the size of an access unit depends on the number of pixels in the access unit and the pixel bit depth. The pixel bit depth is the number of bits used to specify the color of a pixel, e.g., 10 bits or 12 bits, which corresponds to 1024 or 4096 color levels, respectively. In one example, the number of pixels in an access unit may depend on the compression method used by the memory allocation device 110, e.g., on the compression unit size with which the compression method operates. For example, the size of a compression unit may be 4 × 4 pixels, 8 × 8 pixels, 16 × 4 pixels, 16 × 8 pixels, 16 × 16 pixels, and the like. An access unit may contain one or more compression units.
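As a rough illustration of how the access unit size follows from the pixel count and the pixel bit depth, consider the sketch below (a hypothetical helper, not from the patent); for instance, a 16 × 8-pixel access unit at 10 bits per pixel works out to 160 bytes, matching the 160-byte access units used in several of the layouts described later.

```c
#include <stdint.h>

/* Hypothetical helper: size in bytes of an uncompressed access unit, given its
 * pixel dimensions and the pixel bit depth. */
static uint32_t access_unit_size_bytes(uint32_t width_px, uint32_t height_px,
                                       uint32_t bit_depth)
{
    uint32_t bits = width_px * height_px * bit_depth;
    return (bits + 7) / 8;   /* round up to whole bytes */
}

/* e.g. a 16 x 8-pixel access unit at 10 bits per pixel occupies
 * (16*8*10 + 7)/8 = 160 bytes; at 12 bits per pixel it occupies 192 bytes. */
```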
The frame buffer 231A shows an exemplary frame buffer structure for storing an input image. A frame buffer is a region of memory having accessible locations for storing data. Accessible locations within the memory may be grouped into memory blocks having a memory block size. As described above, the memory block size may be selected based on characteristics of the memory 130 and of the other components of the memory system 100 that access the memory 130, such as the page size and channel partitioning of the memory 130, as well as the architecture and operating modes of the memory 130 and of those components. In one example, the memory block size may be selected to be 32 bytes, 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1 kbyte, 2 kbytes, 4 kbytes, and the like. For example, the memory 130 may be a DDR3 SDRAM device from which data is retrieved, and the memory block size is the amount of data that can be retrieved from the memory 130 in a single read cycle. Specifically, the memory block size may be 32 bytes when the data bus width and burst length are 64 bits (i.e., 8 bytes) and 4, respectively. In another example, the memory block size may be determined by the cache line size of the CPU 141 or GPU 142 accessing the memory, e.g., 64 bytes or 128 bytes. In the example of FIG. 2, the memory block size is 64 bytes, and thus the memory boundaries 250(1)-250(n) are located at 0 bytes, 64 bytes, 128 bytes, 192 bytes, and so on, in the frame buffer 231A and the frame buffer 231B.
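For the DDR example above, the memory block size follows directly from the data bus width and the burst length; a minimal sketch of that arithmetic (illustrative only, not part of the described system):

```c
#include <stdint.h>

/* Memory block size for the DDR example above: the data bus width in bits,
 * converted to bytes, times the burst length. */
static uint32_t block_size_bytes(uint32_t bus_width_bits, uint32_t burst_length)
{
    return (bus_width_bits / 8) * burst_length;   /* 64-bit bus, burst 4 -> 32 bytes */
}
```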
A primary portion and a secondary portion may be assigned to each access unit. In one embodiment, the size of the main portion may depend on compressibility of the input image, compression method, memory block size, etc. Furthermore, in one embodiment, the sum of the size of the primary portion and the size of the secondary portion may be equal to the size of the access unit. Likewise, the ratio of the size of the main portion to the size of the sub portion may depend on the compressibility of the input image, the compression method, the memory block size, etc. For example, when an access unit can be compressed to a smaller size, a smaller primary portion is sufficient to store the compressed access unit, and the respective secondary portions can be left empty, making the ratio of the size of the primary portion to the size of the secondary portions smaller. For example, the ratio of the size of the main portion to the size of the secondary portion may be 2, 4, 8, etc.
In addition, the main portion may have a start address aligned with a memory boundary and a size that is a multiple of the memory block size, so that data stored in the main portion may be efficiently accessed. In the example of FIG. 2, the main portions 221(1)-221(3) may be selected to have the memory block size of 64 bytes, and the starting addresses of the main portions 221(1)-221(3) are aligned with the memory boundaries 250(1)-250(3), respectively. The size of the respective sub-portions 241(1)-241(3) may be selected to be less than 64 bytes, for example 32 bytes.
The frame buffer 231B shows an example in which compressed access units 261-263 having various sizes are stored. The compressed access units 261-263 may be stored in their respective main portions 221(1)-221(3) and sub-portions 241(1)-241(3). In the example of FIG. 2, the size of the compressed access unit 261 is less than the memory block size of 64 bytes. Thus, the compressed access unit 261 may be stored in the main portion 221(1) while the sub-portion 241(1) remains empty. The size of the compressed access unit 262 is equal to the memory block size of 64 bytes. Thus, the compressed access unit 262 may be stored in the main portion 221(2) while the sub-portion 241(2) remains empty. However, the size of the compressed access unit 263 is larger than the memory block size of 64 bytes. Thus, a first portion of the compressed access unit 263 fills the main portion 221(3), and the second or remaining portion of the compressed access unit 263 is stored in the sub-portion 241(3).
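The three cases of frame buffer 231B can be reproduced with a small worked example; the compressed sizes used here (48, 64, and 90 bytes) are hypothetical stand-ins for access units 261, 262, and 263.

```c
#include <stdio.h>

/* Worked illustration of the three cases in frame buffer 231B: a compressed
 * access unit smaller than, equal to, or larger than its 64-byte main portion.
 * The compressed sizes 48, 64 and 90 bytes are hypothetical. */
int main(void)
{
    const unsigned main_size = 64;
    const unsigned sizes[3] = { 48, 64, 90 };

    for (int i = 0; i < 3; i++) {
        unsigned in_main = sizes[i] <= main_size ? sizes[i] : main_size;
        unsigned in_sub  = sizes[i] - in_main;   /* 0 unless the unit spills over */
        printf("compressed access unit %d: %u bytes in main portion, %u bytes in sub-portion\n",
               261 + i, in_main, in_sub);
    }
    return 0;
}
```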
In various embodiments, the size of the access unit, the size of the primary portion, and the size of the secondary portion may be selected and held constant for the input image. Alternatively, a plurality of input images, such as sequential frames of a video, may be stored by the memory system 100. The size of the access unit, the size of the main section and the size of the secondary section may be selected for each individual input image, and thus may dynamically vary from one input image to another.
The main part and the sub part may be provided in the frame buffer 131 according to various layouts. Fig. 3-5 show example layouts that include a repeating pattern of primary and secondary portions in a periodic manner, with the smallest repeating unit being a super-block. Thus, the primary and secondary portions of the frame buffer 131 may be arranged, for example, by sequentially placing super blocks adjacent to each other. In one embodiment, the size of the super block may be a multiple of the size of the memory block.
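One way to picture such a superblock layout is as a table of per-unit offsets repeated every superblock. The sketch below is an assumption about how addresses could be computed, not the patented implementation; the structure and function names are ours.

```c
#include <stdint.h>

/* Assumed address computation for a frame buffer laid out as a sequence of
 * identical superblocks, each holding au_per_sb access units. */
typedef struct {
    uint64_t sb_size;       /* superblock size in bytes (multiple of the block size) */
    uint32_t au_per_sb;     /* access units per superblock (up to 8 in this sketch)  */
    uint32_t main_off[8];   /* offset of each main portion inside a superblock       */
    uint32_t sub_off[8];    /* offset of each sub-portion inside a superblock        */
} superblock_layout;

static uint64_t main_addr(const superblock_layout *l, uint64_t fb_base, uint32_t au)
{
    return fb_base + (uint64_t)(au / l->au_per_sb) * l->sb_size + l->main_off[au % l->au_per_sb];
}

static uint64_t sub_addr(const superblock_layout *l, uint64_t fb_base, uint32_t au)
{
    return fb_base + (uint64_t)(au / l->au_per_sb) * l->sb_size + l->sub_off[au % l->au_per_sb];
}
```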
FIG. 3 is a block diagram of three exemplary superblocks 341A-341C in three frame buffers 331A-331C, in accordance with an embodiment of the present invention. Superblocks 341A-341C share the same characteristics, with all of the sub-portions in the superblock being combined into a sub-portion group, which is located in the middle of the respective superblock. The start address of the main part is aligned with the memory boundaries. The size of the set of sub-portions is one or more times the size of the memory block, and a first sub-portion of the set of sub-portions is aligned with a memory boundary.
Referring to the super block 341A, the size of the access unit is set to 160 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the sub-portion are set to 128 bytes and 32 bytes, respectively. As shown, the superblock model has a sub-portion group including four sub-portions (i.e., S0-S3) inserted between the first main portion group (i.e., M0-M1) and the second main portion group (i.e., M2-M3). The size of super block 341A is 5 times the memory block size (i.e., 640 bytes). The memory boundaries of super block 341A are located at 0 bytes, 128 bytes, 256 bytes, 384 bytes, 512 bytes, and 640 bytes; the main portions M0-M3 are aligned with the memory boundaries at 0 bytes, 128 bytes, 384 bytes, and 512 bytes, respectively. The sub-portion group has a size of 128 bytes, and the first sub-portion S0 is aligned with the memory boundary located at 256 bytes.
Referring to the super block 341B, the size of the access unit is set to 192 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the sub-portion are set to 128 bytes and 64 bytes, respectively. As shown, the superblock model has a sub-portion group including four sub-portions (i.e., S0-S3) inserted between the first main portion group (i.e., M0-M1) and the second main portion group (i.e., M2-M3). The size of super block 341B is 6 times the memory block size (i.e., 768 bytes). The memory boundaries of superblock 341B are located at 0 bytes, 128 bytes, 256 bytes, 384 bytes, 512 bytes, 640 bytes, and 768 bytes; the main portions M0-M3 are aligned with the memory boundaries at 0 bytes, 128 bytes, 512 bytes, and 640 bytes, respectively. The sub-portion group has a size of 256 bytes, and the first sub-portion S0 is aligned with the memory boundary located at 256 bytes.
Referring to the super block 341C, the size of the access unit is set to 384 bytes, the memory block size is set to 256 bytes, and the sizes of the main portion and the sub-portion are set to 256 bytes and 128 bytes, respectively. As shown, the superblock model has a sub-portion group including two sub-portions (i.e., S0-S1) inserted between the first main portion M0 and the second main portion M1. The size of super block 341C is 3 times the memory block size (i.e., 768 bytes). The memory boundaries of super block 341C are located at 0 bytes, 256 bytes, 512 bytes, and 768 bytes; the main portions M0 and M1 are aligned with the memory boundaries located at 0 bytes and 512 bytes, respectively. The sub-portion group has a size of 256 bytes, and the first sub-portion S0 is aligned with the memory boundary located at 256 bytes.
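Using the 341A layout described above, the portion offsets and their alignment can be checked directly; the offsets of S1-S3 assume the 32-byte sub-portions are packed contiguously after S0, which is our reading of the figure. The same arithmetic carries over to the other superblock variants.

```c
#include <assert.h>
#include <stdint.h>

/* Portion offsets inside the 640-byte superblock 341A (128-byte memory blocks,
 * 128-byte main portions, 32-byte sub-portions): M0 M1 | S0 S1 S2 S3 | M2 M3. */
static const uint32_t main_off_341A[4] = { 0, 128, 384, 512 };
static const uint32_t sub_off_341A[4]  = { 256, 288, 320, 352 };

int main(void)
{
    for (int i = 0; i < 4; i++)
        assert(main_off_341A[i] % 128 == 0);   /* every main portion starts on a boundary */
    assert(sub_off_341A[0] % 128 == 0);        /* the sub-portion group starts on a boundary */
    return 0;
}
```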
FIG. 4 is a block diagram of three exemplary superblocks 441A-441C in three frame buffers 431A-431C, according to an embodiment of the present invention. Superblocks 441A-441C share the same characteristic that all of the sub-portions in the superblock are combined into a sub-portion group that follows the main portion group containing the main portions. The start address of each main portion is aligned with a memory boundary. The size of the sub-portion group is one or more times the memory block size, and the first sub-portion of the group is aligned with a memory boundary.
Referring to the super block 441A, the size of the access unit is set to 192 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the sub-portion are set to 128 bytes and 64 bytes, respectively. As shown, the superblock model 441A has a sub-portion group including two sub-portions (i.e., S0-S1) following the main portion group (i.e., M0-M1). The size of the super block 441A is 3 times the memory block size (i.e., 384 bytes). The memory boundaries of superblock 441A are located at 0 bytes, 128 bytes, 256 bytes, and 384 bytes; the main portions M0-M1 are aligned with the memory boundaries at 0 bytes and 128 bytes, respectively. The sub-portion group has a size of 128 bytes, and the first sub-portion S0 is aligned with the memory boundary located at 256 bytes.
Referring to the super block 441B, the size of the access unit is set to 160 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the sub-portion are set to 128 bytes and 32 bytes, respectively. As shown, the superblock model has a sub-portion group including four sub-portions (i.e., S0-S3) following the main portion group (i.e., M0-M3). The size of the super block 441B is 5 times the memory block size (i.e., 640 bytes). The memory boundaries of superblock 441B are located at 0 bytes, 128 bytes, 256 bytes, 384 bytes, 512 bytes, and 640 bytes; the main portions M0-M3 are aligned with the memory boundaries at 0 bytes, 128 bytes, 256 bytes, and 384 bytes, respectively. The sub-portion group has a size of 128 bytes, and the first sub-portion S0 is aligned with the memory boundary located at 512 bytes.
Referring to the super block 441C, the size of the access unit is set to 320 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the sub-portion are set to 256 bytes and 64 bytes, respectively. As shown, the superblock model has a sub-portion group including two sub-portions (i.e., S0-S1) following the main portion group (i.e., M0-M1). The size of the super block 441C is 5 times the memory block size (i.e., 640 bytes). The memory boundaries of superblock 441C are located at 0 bytes, 128 bytes, 256 bytes, 384 bytes, 512 bytes, and 640 bytes; the main portions M0 and M1 are aligned with the memory boundaries at 0 bytes and 256 bytes, respectively. The sub-portion group has a size of 128 bytes, and the first sub-portion S0 is aligned with the memory boundary located at 512 bytes.
FIG. 5 is a block diagram of two exemplary superblocks 541A-541B in two frame buffers 531A-531B, according to an embodiment of the present invention. The superblocks 541A-541B share the same characteristic that the main portion size (i.e., 128 bytes) is smaller than the memory block size (i.e., 256 bytes). In addition, some compressed access units may need to be stored in both the main portion and the sub-portion, while other compressed access units can be stored in the main portion alone. In order to allow efficient access to the compressed access units stored in the main and sub-portions, as many sub-portions as possible are contained in the same memory block as their corresponding main portions, and preferably immediately follow the respective main portions. For example, in super block 541A, within their respective memory blocks, the main portion M0 is followed by its sub-portion S0, and the main portion M3 is followed by its sub-portion S3.
Referring to the super block 541A, the size of the access unit is set to 192 bytes, the memory block size is set to 256 bytes, and the sizes of the main portion and the sub-portion are set to 128 bytes and 64 bytes, respectively. As shown, the superblock model has a sub-portion S0 following its respective main portion M0 and a sub-portion S3 following its respective main portion M3. The super block 541A has a size 3 times the memory block size (i.e., 768 bytes). The memory boundaries of the super block 541A are located at 0 bytes, 256 bytes, 512 bytes, and 768 bytes; three main portions, M0, M1, and M3, are aligned with the memory boundaries at 0 bytes, 256 bytes, and 512 bytes, respectively. The main portion M2 is not aligned with a memory boundary, but M2 is located within a single memory block, between 256 bytes and 512 bytes.
Referring to the super block 541B, the size of the access unit is set to 192 bytes, the memory block size is set to 256 bytes, and the sizes of the main portion and the sub-portion are set to 128 bytes and 64 bytes, respectively. As shown, the superblock model has a sub-portion S0 following its main portion M0 and a sub-portion S1 following its main portion M1. The super block 541B has a size 3 times the memory block size (i.e., 768 bytes). The memory boundaries of the super block 541B are located at 0 bytes, 256 bytes, 512 bytes, and 768 bytes; three main portions, M0, M1, and M2, are aligned with the memory boundaries at 0 bytes, 256 bytes, and 512 bytes, respectively. The main portion M3 is not aligned with a memory boundary, but M3 is located within a single memory block, between 512 bytes and 768 bytes.
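A small check (our own, using the 541A offsets stated above) confirms the property this layout is built around: a 128-byte main portion followed immediately by its 64-byte sub-portion still fits inside one 256-byte memory block, so a compressed access unit that spills into its sub-portion can be fetched with a single block access.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 256u

/* true when [start, start+size) lies entirely within one memory block */
static int same_block(uint32_t start, uint32_t size)
{
    return (start / BLOCK_SIZE) == ((start + size - 1) / BLOCK_SIZE);
}

int main(void)
{
    assert(same_block(0,   128 + 64));   /* M0 at 0   with S0 at 128 */
    assert(same_block(512, 128 + 64));   /* M3 at 512 with S3 at 640 */
    return 0;
}
```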
Fig. 6 shows an alternative frame buffer example according to an embodiment of the invention. The main and sub-portions of frame buffer 631A and frame buffer 631B may be arranged in two groups: a main portion group including all main portions sequentially placed adjacent to each other, and a sub-portion group including all sub-portions sequentially placed adjacent to each other. In one embodiment, the size of the main portion is one or more times the memory block size, and the start address of each main portion may be aligned with a memory boundary. The main portion group may be placed adjacent to the sub-portion group or may be separated from it.
As shown in the frame buffer 631A, the size of the access unit is set to 80 bytes, the memory block size is set to 64 bytes, and the sizes of the main portion and the sub-portion are set to 64 bytes and 16 bytes, respectively. The main portion group includes all main portions. As shown, the main portions are placed adjacent to each other and have starting addresses aligned with contiguous memory boundaries at 0 bytes, 64 bytes, 128 bytes, 192 bytes, 256 bytes, and so on. The sub-portion group includes all sub-portions. The first sub-portion S0 may have a starting address that is aligned with a memory boundary, e.g., the memory boundary located at 512 bytes.
As shown in the frame buffer 631B, the size of the access unit is set to 160 bytes, the memory block size is set to 128 bytes, and the sizes of the main portion and the sub-portion are set to 128 bytes and 32 bytes, respectively. The main portion group includes all main portions. As shown, the main portions are placed adjacent to each other and have starting addresses aligned with contiguous memory boundaries at 0 bytes, 128 bytes, 256 bytes, 384 bytes, 512 bytes, and so on. The sub-portion group includes all sub-portions. The first sub-portion S0 may have a starting address that is aligned with a memory boundary, e.g., the memory boundary located at 4096 bytes.
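For the FIG. 6 style layout, address computation reduces to two simple formulas; the sketch below assumes the frame buffer 631A numbers and a sub-portion group starting at 512 bytes, as in the example above (the constant for the group base and the helper names are ours).

```c
#include <stdint.h>

/* Assumed offset computation for the FIG. 6 layout (frame buffer 631A numbers):
 * all 64-byte main portions packed first, then all 16-byte sub-portions as one
 * group starting at a memory boundary. */
#define MAIN_SIZE       64u
#define SUB_SIZE        16u
#define SUB_GROUP_BASE 512u

static uint32_t main_offset(uint32_t au) { return au * MAIN_SIZE; }
static uint32_t sub_offset(uint32_t au)  { return SUB_GROUP_BASE + au * SUB_SIZE; }
```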
In the superblock and frame buffers shown in fig. 3-6, at least one of the secondary portions is not sequentially located after its respective primary portion.
In one embodiment, the starting addresses of the superblock and frame buffers may be aligned with the memory boundaries of memory 130, e.g., at 0 bytes as shown in FIGS. 3-6.
Although exemplary superblock and frame buffers are shown in fig. 3-6, it should be understood that variations such as variations of the superblock model, the location of the superblock in the frame memory, etc., are possible in order to satisfy different memory usage scenarios.
During operation, when the primary and secondary portions are placed in the frame buffer 131 according to a layout, such as the layouts shown in fig. 3-6, the compressed access units may be stored in the respective primary portions. For example, the compressed access units may be stored in the respective primary portions and, if desired, in the respective secondary portions. When the size of the compressed access unit is equal to or smaller than the size of the respective primary portion, the compressed access unit may be completely stored in the respective primary portion, and the corresponding secondary portion may remain empty.
Fig. 7 shows a flow chart depicting an exemplary process 700 according to an embodiment of the invention. In one example, flow 700 is performed by memory system 100 in FIG. 1. The flow starts at step S701 and continues to step S710.
In step S710, the input image is segmented into one or more access units, for example, an N × M array of access units as shown in fig. 2. In one example, the memory allocation device 110 is used to segment an input image into access units of an array. The input image may be a video frame, a photographic image, a graphical image, an animated image, etc. For example, the video frame may be a reference frame used by the video codec 146. The flow then proceeds to step S720.
In step S720, a frame buffer is allocated in the memory. In one example, the memory allocation device 110 is used to allocate a frame buffer 131 in the memory 130. The size of the frame buffer is equal to or larger than the size of the input image. In one embodiment, the start address of the frame buffer may be aligned with a memory boundary, such as a memory boundary at 0 bytes.
In step S730, two memory portions, a main portion and an auxiliary portion, are allocated to each access unit in the frame buffer. In one example, the memory allocation means is for allocating a primary portion and a secondary portion to each access unit in the frame buffer 131. In one embodiment, the sum of the size of the primary portion and the size of the secondary portion may be equal to the size of the access unit, e.g., may be the size of an uncompressed access unit. In one embodiment, the size of the main portion may depend on the compressibility of the input image, the compression method, the memory block size, and the like. Further, in one embodiment, the ratio of the size of the main portion to the size of the sub portion may depend on the compressibility of the input image, the compression method, and the like. For example, when an access unit can be compressed to a smaller size, a smaller primary portion is sufficient for storing the compressed access unit, and the respective secondary portion can be left empty, such that the ratio of the size of the primary portion to the size of the secondary portion is smaller.
Further, in one embodiment, the main portion may have a starting address aligned with a memory boundary and a size that is one or more times a size of the memory block, such that data stored in the main portion may be efficiently accessed.
Alternatively, when the size of the main portions is smaller than the memory block size, each main portion may be located within a respective memory block, and one or more main portions may have a starting address that is aligned with one or more memory boundaries.
In one embodiment, the size of the sub-portion may be a fraction of the memory block size. Thus, two or more secondary portions may be combined together as one or more secondary portion groups and stored separately from their respective primary portions. Further, in one embodiment, the first sub-portion in each respective sub-portion group may have a starting address that is aligned with a memory boundary.
In another embodiment, at least one of the sub-portions is not sequentially located after its respective main portion in the frame buffer.
The main portion and the sub portion may be arranged in a frame buffer, such as the frame buffer 131, according to different layouts. In one embodiment, the layout may include a repetition model of a super-block, where the super-block is the smallest repeating unit in the frame buffer. Thus, the primary and secondary portions of the frame buffer may be arranged, for example, by sequentially placing super blocks adjacent to each other. The size of the super block may be set to a multiple of the memory block size.
In one embodiment, the starting address of the main portion in the super block is aligned with the memory boundary. The sub-portions in a super-block may be combined into one or more sub-portion groups that are a multiple of the size of the memory block. The first sub-part in each sub-part group may be aligned with a memory boundary. Some exemplary superblocks with the above characteristics are shown in fig. 3 and 4.
In another embodiment, a super block may have one or more main portions that are smaller in size than the memory block size. Some exemplary superblocks of this kind are shown in FIG. 5. For example, as many sub-portions as possible immediately follow their respective main portions (e.g., in superblock 541A of FIG. 5, S0 follows M0 and S3 follows M3). As another example, each main portion is located entirely within a single memory block.
In one embodiment, the layout does not include a repeating superblock pattern. Instead, the main portions and the sub-portions in the frame buffer may be arranged into a main portion group and a sub-portion group, for example as shown in FIG. 6. The main portion group includes the main portions, whose starting addresses are aligned with contiguous memory boundaries. The sub-portion group includes the sub-portions placed adjacent to each other. The first sub-portion of the sub-portion group may be aligned with a memory boundary.
In step S740, the access units may be compressed into compressed access units to reduce the bandwidth requirements for data transfer between the memory and another device accessing the memory. For example, in the memory system 100, the memory 130 may be located on a different chip than the memory allocation device 110, and the memory allocation device 110 is used to compress the access units to reduce the bandwidth requirements for data transfers between the memory 130 and the memory allocation device 110. Both lossless and lossy compression methods may be used to compress the access units. Lossless compression methods preserve the quality of the original data, while lossy compression methods can achieve higher compression ratios. The compression method may be a general-purpose compression method, an image compression method, a video compression method, or the like. For example, the compression method may include run-length encoding, dictionary-based algorithms, Huffman coding, deflation, chroma subsampling, discrete cosine transform, and the like.
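As an illustration of the simplest family named above, the following is a minimal byte-wise run-length encoder; it is not the codec used by the described system, only an example of a compression step whose output size is then compared against the main portion size.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal byte-wise run-length encoder. Output format: (run length, byte
 * value) pairs, so the output buffer must hold up to 2*n bytes. Returns the
 * compressed size in bytes. */
static size_t rle_encode(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; ) {
        uint8_t v = in[i];
        size_t run = 1;
        while (i + run < n && in[i + run] == v && run < 255)
            run++;
        out[o++] = (uint8_t)run;   /* run length (1..255) */
        out[o++] = v;              /* repeated byte value */
        i += run;
    }
    return o;   /* this size is what step S750 compares against the main portion size */
}
```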
In step S750, the size of the compressed access unit is compared with the size of the main portion. In one example, the memory allocation means 110 is arranged to compare the size of the compressed access unit with the size of the main portion. If the size of the compressed access unit is larger than the size of the main section, the flow continues to step S770. Otherwise, the flow proceeds to step S760.
In step S760, the compressed access unit may be stored entirely in the respective main portion because the size of the compressed access unit is smaller than or equal to the size of the main portion. In one example, memory controller 120 is used to store compressed access units to their respective main portions in response to instructions by memory allocation device 110.
When the size of the compressed access unit is larger than the size of the main portion, the flow proceeds to step S770. In step S770, the first portion of the compressed access unit may be stored to the respective main portion. The first portion of the compressed access unit may have the same size as the main portion and fills the respective main portion. In one example, the memory controller 120 is used to store the first portion of the compressed access unit to the respective main portion in response to an instruction of the memory allocation device 110.
In step S780, the second portion or the remaining portion of the compressed access unit may be stored to the respective sub-portion. Thus, when the size of the compressed access unit is larger than the size of the main portion, the compressed access unit may be stored separately to the respective main portion and sub portion. In one example, the memory controller 120 is configured to store the remaining portions of the compressed access units to the respective sub-portions in response to instructions by the memory allocation device 110.
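Steps S740-S780 for a single access unit can be summarized in one routine; compress_au() and write_mem() below are assumed stand-ins for the codec and the memory-controller path, and the structure and function names are ours rather than part of the described system.

```c
#include <stdint.h>

/* Sketch of steps S740-S780 for one access unit. The slot addresses come from
 * whichever layout (FIGS. 3-6) is in use. */
typedef struct {
    uint64_t main_addr, sub_addr;   /* where the two portions live in memory */
    uint32_t main_size, sub_size;   /* their sizes in bytes                  */
} au_slots;

extern uint32_t compress_au(const uint8_t *au, uint32_t au_size, uint8_t *out); /* assumed */
extern void write_mem(uint64_t addr, const uint8_t *data, uint32_t size);       /* assumed */

static void store_access_unit(const au_slots *s, const uint8_t *au,
                              uint32_t au_size, uint8_t *scratch)
{
    uint32_t csize = compress_au(au, au_size, scratch);    /* step S740 */
    if (csize <= s->main_size) {                           /* steps S750, S760 */
        write_mem(s->main_addr, scratch, csize);           /* fits entirely in the main portion */
    } else {                                               /* steps S770, S780 */
        write_mem(s->main_addr, scratch, s->main_size);    /* fill the main portion */
        write_mem(s->sub_addr, scratch + s->main_size,
                  csize - s->main_size);                   /* remainder to the sub-portion */
    }
}
```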
Steps S740-S780 may be repeatedly performed for all access units before the flow proceeds to step S799 where the flow ends. In one example, the memory allocation device 110 and the memory controller 120 are configured to repeatedly perform steps S740-S780 for all access units in the input image.
In various embodiments, the size of the access unit, the size of the primary portion, and the size of the secondary portion may be selected and held constant for the input image. Alternatively, a plurality of input images, such as sequential frames of a video, may be stored by the memory. The size of the access unit, the size of the main section and the size of the secondary section may be selected for each individual input image, and thus may dynamically vary from one input image to another.
In various examples, the memory allocation device 110 or the functionality of the memory allocation device 110 may be implemented in hardware, software, or a combination thereof. In one example, the memory allocation device 110 is implemented in hardware, e.g., processing circuitry, which may comprise one or more of discrete components, integrated circuits, application-specific integrated circuits (ASICs), and the like. In another example, the functionality of the memory allocation device 110 may be implemented in software or firmware including instructions stored on a computer-readable non-transitory storage medium, which, when executed by the processing circuitry, cause the processing circuitry to perform the respective functions.
While various aspects of the present invention have been described in conjunction with specific embodiments presented as examples, alternatives, modifications, and variations of the examples may be made. Accordingly, the embodiments described herein are for illustrative purposes and are not intended to be limiting. Changes may be made without departing from the scope of the claims.

Claims (20)

1. A method of storing an input image, wherein the input image is stored in a memory, comprising:
allocating one or more frame buffers in the memory;
dividing the input image into a plurality of access units corresponding to a plurality of subsets of the input image and assigning a primary portion and a secondary portion to each of the plurality of access units in the frame buffer, wherein at least one of the secondary portions is out of sequence in the frame buffer after its respective primary portion;
compressing the plurality of access units into a plurality of compressed access units; and
storing each compressed access unit into a respective primary portion and, if the size of the compressed access unit exceeds the size of the primary portion, storing the remainder of the compressed access unit into a respective secondary portion.
2. The method of storing an input image of claim 1, wherein the memory has a series of memory blocks separated by a plurality of memory boundaries located at a plurality of addresses that are multiples of a memory block size, and wherein the memory block size is determined based on characteristics of the memory and characteristics of a plurality of devices accessing the memory.
3. A method of storing an input image as claimed in claim 2, characterized in that the size of each main portion is one or more times the size of the memory block, and the start address of each main portion is aligned with a memory boundary respectively.
4. A method of storing an input image as claimed in claim 2, characterized in that the size of each main portion is a fraction of the size of the memory block, and each main portion is located inside a respective memory block.
5. The method of storing an input image as in claim 2, wherein each sub-portion is a fraction of the size of the memory block, and a plurality of sub-portions are combined into one or more sub-portion groups.
6. The method of storing an input image as claimed in claim 5, wherein the size of each of said one or more sub-portion groups is one or more times the size of said memory block, and the starting address of the first sub-portion of each of said sub-portion groups is aligned with a memory boundary.
7. The method of storing an input image as claimed in claim 2,
wherein the plurality of main portions and the plurality of sub-portions are arranged in a preset pattern to form a super block having a size that is one or more times the size of the memory block; and the plurality of main portions and the plurality of sub-portions in the frame buffer are arranged by sequentially placing a plurality of super blocks adjacent to each other.
8. The method of storing an input image as claimed in claim 2, wherein the memory block size is selected to be 32 bytes, 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1 kbyte, 2 kbytes or 4 kbytes.
9. The method of storing an input image as claimed in claim 1, wherein the input image is a still image or a video frame.
10. A system for storing an input image, comprising:
a memory having one or more frame buffers;
memory allocation means for receiving said input image, allocating a frame buffer in said memory for storing said input image, dividing said input image into a plurality of access units corresponding to a plurality of subsets of said input image, and allocating a main portion and a sub-portion to each access unit in said frame buffer, wherein at least one of said sub-portions is non-sequentially located after its respective main portion in said frame buffer; and
a memory controller for storing each compressed access unit into a respective primary portion in response to a plurality of instructions of the memory allocation means and storing a remainder of the compressed access unit into a respective secondary portion if the size of the compressed access unit exceeds the size of the primary portion.
11. The system for storing an input image according to claim 10, wherein the memory has a series of memory blocks separated by a plurality of memory demarcations located at a plurality of addresses that are multiples of a memory block size, and wherein the memory block size is determined based on characteristics of the memory and characteristics of a plurality of devices accessing the memory.
12. The system for storing an input image according to claim 11, wherein said memory allocation means is operable to select the size of each main portion to be one or more times the size of said memory block and to align the start address of each main portion with a memory boundary.
13. A system for storing an input image as in claim 11, wherein said memory allocation means is adapted to select the size of each main portion as a fraction of the size of said memory block and to place each main portion inside a respective memory block.
14. The system for storing an input image according to claim 11, wherein said memory allocation means is operable to select the size of each sub-portion as a fraction of the size of said memory block and to combine a plurality of sub-portions into one or more sub-portion groups.
15. The system for storing an input image according to claim 14, wherein said memory allocation means is operable to select one or more sub-group sizes as one or more times the size of said memory block and to align the starting address of the first sub-part of each of said sub-groups with a memory boundary.
16. The system for storing an input image according to claim 11, wherein said memory allocation means is adapted to arrange the plurality of main parts and sub-parts in a predetermined pattern to form a super block having a size one or more times a size of said memory block, and to place a plurality of super blocks adjacent to each other in said frame buffer.
17. The system for storing an input image according to claim 11, wherein the memory allocation means is for determining the memory block size to be 32 bytes, 64 bytes, 128 bytes, 256 bytes, 512 bytes, 1 kbytes, 2 kbytes or 4 kbytes.
18. A system for storing an input image as in claim 10, wherein said memory is located on a different integrated circuit chip than said memory allocation means.
19. The system for storing an input image as in claim 10, wherein said memory allocation means is integrated in a video codec.
20. A non-transitory computer readable medium having computer readable instructions stored thereon that, when executed by a processing circuit, cause the processing circuit to perform a method comprising:
allocating one or more frame buffers in a memory;
dividing an input image into a plurality of access units corresponding to a plurality of subsets of the input image and assigning a primary portion and a secondary portion to each of the plurality of access units in the frame buffer, wherein at least one of the secondary portions is non-sequentially located after its respective primary portion in the frame buffer;
compressing the plurality of access units into a plurality of compressed access units; and
storing each compressed access unit into a respective primary portion and, if the size of the compressed access unit exceeds the size of the primary portion, storing the remainder of the compressed access unit into a respective secondary portion.
CN201810344898.2A 2017-04-25 2018-04-17 Method and system for storing input image Active CN108804508B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201762489588P 2017-04-25 2017-04-25
US62/489,588 2017-04-25
US15/786,908 US20180107616A1 (en) 2016-10-18 2017-10-18 Method and device for storing an image into a memory
US15/786,908 2017-10-18

Publications (2)

Publication Number Publication Date
CN108804508A CN108804508A (en) 2018-11-13
CN108804508B true CN108804508B (en) 2022-06-07

Family

ID=64094364

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810344898.2A Active CN108804508B (en) 2017-04-25 2018-04-17 Method and system for storing input image
CN201810373709.4A Expired - Fee Related CN108833922B (en) 2017-04-25 2018-04-24 Method for accessing frame buffer, method and device for processing access unit

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201810373709.4A Expired - Fee Related CN108833922B (en) 2017-04-25 2018-04-24 Method for accessing frame buffer, method and device for processing access unit

Country Status (2)

Country Link
CN (2) CN108804508B (en)
TW (2) TW201839714A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4772956A (en) * 1987-06-02 1988-09-20 Eastman Kodak Company Dual block still video compander processor
CN101499097A (en) * 2009-03-16 2009-08-05 浙江工商大学 Hash table based data stream frequent pattern internal memory compression and storage method
GB2457262A (en) * 2008-02-08 2009-08-12 Linear Algebra Technologies Compression / decompression of data blocks, applicable to video reference frames
CN102740074A (en) * 2012-06-05 2012-10-17 沙基昌 Video data compressing/decompressing method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7106909B2 (en) * 2001-12-25 2006-09-12 Canon Kabushiki Kaisha Method and apparatus for encoding image data in accordance with a target data size
CN101212680B (en) * 2006-12-30 2011-03-23 扬智科技股份有限公司 Image data storage access method and system
US20140219361A1 (en) * 2013-02-01 2014-08-07 Samplify Systems, Inc. Image data encoding for access by raster and by macroblock
CN105472442B (en) * 2015-12-01 2018-10-23 上海交通大学 Compressibility is cached outside a kind of piece for ultra high-definition frame rate up-conversion


Also Published As

Publication number Publication date
CN108833922B (en) 2020-12-18
CN108804508A (en) 2018-11-13
CN108833922A (en) 2018-11-16
TW201839714A (en) 2018-11-01
TW201840177A (en) 2018-11-01

Similar Documents

Publication Publication Date Title
US11023152B2 (en) Methods and apparatus for storing data in memory in data processing systems
US20140086309A1 (en) Method and device for encoding and decoding an image
CN105431831B (en) Data access method and the data access device for utilizing same procedure
US8918589B2 (en) Memory controller, memory system, semiconductor integrated circuit, and memory control method
US20080055325A1 (en) Methods and systems for tiling video or still image data
KR20180054797A (en) Efficient display processing by pre-fetching
US10134107B2 (en) Method and apparatus for arranging pixels of picture in storage units each having storage size not divisible by pixel size
KR101773396B1 (en) Graphic Processing Apparatus and Method for Decompressing to Data
US20200128264A1 (en) Image processing
US20180107616A1 (en) Method and device for storing an image into a memory
CN114428595A (en) Image processing method, image processing device, computer equipment and storage medium
DE102011100936A1 (en) Techniques for storing and retrieving pixel data
CN108804508B (en) Method and system for storing input image
US10249269B2 (en) System on chip devices and operating methods thereof
KR20220166198A (en) Methods of and apparatus for storing data in memory in graphics processing systems
CN116523729A (en) Graphics processing device, graphics rendering pipeline distribution method and related devices
US20220210454A1 (en) Video decoding and display system and memory accessing method thereof
US20140185928A1 (en) Hardware-supported huffman coding of images
US11907855B2 (en) Data transfers in neural processing
CN116700943A (en) Video playing system and method and electronic equipment
EP3185127A1 (en) Semiconductor device, data processing system, and semiconductor device control method
US11086534B2 (en) Memory data distribution based on communication channel utilization
CN111882482A (en) Method, device and equipment for reading and writing graph block data and storage medium
GB2603895A (en) Data transfers in neural processing
CN111459879A (en) Data processing method and system on chip

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant