CN106575263A - Cache line compaction of compressed data segments - Google Patents
- Publication number
- CN106575263A (application CN201580041874.4A)
- Authority
- CN
- China
- Prior art keywords
- data
- data segments
- computing device
- address
- identified
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0875—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0877—Cache access modes
- G06F12/0886—Variable-length word access
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1021—Hit rate improvement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/40—Specific encoding of data in memory or cache
- G06F2212/401—Compressed data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/45—Caching of specific data in cache memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/60—Details of cache memory
- G06F2212/608—Details relating to cache mapping
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Methods, devices, and non-transitory process-readable storage media for compacting data within cache lines of a cache. An aspect method may include identifying, by a processor of the computing device, a base address (e.g., a physical or virtual cache address) for a first data segment, identifying a data size (e.g., based on a compression ratio) for the first data segment, obtaining a base offset based on the identified data size and the base address of the first data segment, and calculating an offset address by offsetting the base address with the obtained base offset, wherein the calculated offset address is associated with a second data segment. In some aspects, the method may include identifying a parity value for the first data segment based on the base address and obtaining the base offset by performing a lookup on a stored table using the identified data size and identified parity value.
Description
Background
Lossless compression techniques can compress data segments of a configurable size into smaller sizes. For example, a lossless compression algorithm may use different compression ratios (e.g., 4:1, 4:2, 4:3, 4:4) to compress a 256-byte data segment into compressed data segments of various sizes (e.g., 64 bytes, 128 bytes, 192 bytes, 256 bytes). Conventional techniques may process a data segment (or data block) to reduce it to a fraction of its original size, which helps reduce the bandwidth and resources consumed during data reads and writes between memory units.
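The ratio-to-size arithmetic above can be written out as a minimal sketch. It assumes the 256-byte source segment size used in the text and reads a ratio "4:n" as "four 64-byte units compressed to n units"; the function name is a hypothetical chosen for illustration.

```python
SOURCE_SIZE = 256  # bytes; the uncompressed segment size used in the text


def compressed_size(n: int, source_size: int = SOURCE_SIZE) -> int:
    """Return the size in bytes of a segment compressed at a 4:n ratio."""
    if n not in (1, 2, 3, 4):
        raise ValueError("only the ratios 4:1, 4:2, 4:3 and 4:4 are described")
    return source_size * n // 4


# Map each ratio named in the text to its compressed size.
sizes = {f"4:{n}": compressed_size(n) for n in (1, 2, 3, 4)}
print(sizes)
```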
Such compressed data may be stored in cache lines of a cache (e.g., an L3 cache). Typically, the compressed form of a data segment (or compressed data segment) is stored in the cache line corresponding to the physical address of the uncompressed form of the data segment (or source data segment). Each cache line may have a set size (e.g., 128 bytes) or capacity that must be filled by any load operation, regardless of the size of the data required. For example, when a 64-byte or 192-byte segment is loaded into 128-byte (B) cache lines, the computing device may be required to read and store in a cache line an additional, potentially unneeded 64 bytes in order to completely fill the 128 bytes of the cache line. Therefore, when the compressed data loaded into a cache line (or multiple cache lines) is not the same size as the source data segment (or the cache lines), there may be "holes" in the physical address space that contain no useful data. This undesirable loading of unneeded, unusable data is referred to as "overfetching" and may cause suboptimal resource and bandwidth usage between memory units (e.g., double data rate (DDR) random access memory (RAM)) and caches. For example, holes can use up part of a cache line, which both increases the requested memory bandwidth and wastes space in the cache, causing uneven resource usage in the computing device. Because compression schemes (e.g., the Burrows-Wheeler transform) may produce many compressed data segments that are smaller than a cache line, significant overfetching may occur, leading to substantial inefficiency and increased workload that can diminish the benefits of using compression techniques. Further, compressed block sizes may be smaller than the optimal DDR minimum access length (MAL), causing worse performance.
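The overfetch described in this paragraph is simple arithmetic, sketched below under the assumption of 128-byte cache lines and a segment that starts on a cache line boundary. The function name is hypothetical.

```python
LINE_SIZE = 128  # bytes per cache line, as in the example in the text


def overfetch_bytes(segment_size: int, line_size: int = LINE_SIZE) -> int:
    """Bytes loaded beyond the segment when whole cache lines must be filled.

    Assumes the segment starts at the beginning of a cache line.
    """
    lines_needed = -(-segment_size // line_size)  # ceiling division
    return lines_needed * line_size - segment_size


for size in (64, 128, 192, 256):
    print(size, overfetch_bytes(size))
```

Under these assumptions only the 64-byte and 192-byte sizes leave a 64-byte hole, which matches the two cases the paragraph singles out.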
Summary
Various aspects provide methods, devices, and non-transitory process-readable storage media for compacting data within cache lines of a cache of a computing device. An aspect method may include: identifying a base address for a first data segment; identifying a data size for the first data segment; obtaining a base offset based on the identified data size and the base address of the first data segment; and calculating an offset address by offsetting the base address with the obtained base offset, wherein the calculated offset address may be associated with a second data segment.
In some aspects, obtaining the base offset based on the identified data size and the base address of the first data segment may be performed by one of the processor of the computing device executing software or a dedicated circuit coupled to the processor of the computing device, and calculating the offset address by offsetting the base address with the obtained base offset may be performed by one of the processor of the computing device executing software or the dedicated circuit coupled to the processor of the computing device.
In some aspects, the base address may be a physical address or a virtual address. In some aspects, the identified data size may be identified based on a compression ratio associated with the first data segment. In some aspects, the compression ratio may be one of a 4:1 compression ratio, a 4:2 compression ratio, a 4:3 compression ratio, or a 4:4 compression ratio.
In some aspects, obtaining the base offset based on the identified compression ratio and the base address of the first data segment may include: identifying a parity value for the first data segment based on the base address, and obtaining the base offset using the identified compression ratio and the identified parity value.
In some aspects, obtaining the base offset using the identified compression ratio and the identified parity value may include obtaining the base offset by performing a lookup on a stored table using the identified compression ratio and the identified parity value. In some aspects, the parity value may indicate whether a bit in the base address of the first data segment is odd or even. In some aspects, the parity value may be based on two bits in the base address of the first data segment.
In some aspects, obtaining the base offset based on the identified data size and the base address of the first data segment may include obtaining a first base offset and a second base offset, as well as a first data size and a second data size, for the first data segment, and calculating the offset address by offsetting the base address with the obtained base offset may include calculating a first offset address for the first data size and a second offset address for the second data size.
In some aspects, the method may further include storing the first data segment at the calculated offset address, which may include: reading the first data segment at the base address as uncompressed data; compressing the first data segment at the identified data size; and storing the compressed first data segment at the calculated offset address.
In some aspects, calculating the offset address by offsetting the base address with the obtained base offset may be accomplished after the first data segment has been compressed. In some aspects, the method may further include reading the first data segment at the calculated offset address.
In some aspects, the method may further include determining whether the second data segment has a correct compression ratio to be contiguous with the first data segment, and reading the first data segment at the calculated offset address may include prefetching the second data segment with the first data segment in response to determining that the second data segment has the correct compression ratio. In some aspects, the method may further include decompressing the read first data segment.
Various aspects may include a computing device configured with processor-executable instructions to perform operations of the methods described above. Various aspects may include a computing device having means for performing functions of the operations of the methods described above. Various aspects may include a non-transitory processor-readable medium having stored thereon processor-executable instructions configured to cause a processor of a computing device to perform operations of the methods described above.
Brief Description of the Drawings
The accompanying drawings, which are incorporated herein and constitute part of this specification, illustrate exemplary aspects of the invention and, together with the general description given above and the detailed description given below, serve to explain the features of the invention.
FIG. 1 is a memory structure diagram of exemplary placements of compressed data segments in cache lines, as known in the art.
FIG. 2 is a memory structure diagram illustrating offset placements of data segments in cache lines according to an aspect compaction technique.
FIG. 3 is a data structure diagram illustrating a data table, suitable for use in some aspects, that includes base offset values corresponding to compression ratios and base address parity values.
FIG. 4A is exemplary pseudocode of a function for returning a new address (e.g., a physical address), suitable for use in various aspects.
FIG. 4B is exemplary pseudocode for invoking the exemplary function of FIG. 4A, suitable for use in various aspects.
FIG. 4C is exemplary pseudocode illustrating exemplary values used during an implementation of the exemplary function of FIG. 4A, suitable for use in various aspects.
FIG. 5 is a memory structure diagram illustrating offset placements of data segments in cache lines according to another aspect compaction technique.
FIG. 6 is a data structure diagram illustrating a data table, suitable for use in various aspects, that includes base offset values corresponding to compression ratios and to base address parity values based on two bits of an address.
FIG. 7A is another exemplary pseudocode of a function for returning a new address (e.g., a physical address), suitable for use in various aspects.
FIG. 7B is another exemplary pseudocode for invoking the exemplary function of FIG. 7A, suitable for use in various aspects.
FIG. 7C is another exemplary pseudocode illustrating exemplary values used during an implementation of the exemplary function of FIG. 7A, suitable for use in various aspects.
FIGS. 8A-8C are process flow diagrams illustrating aspect methods for a computing device to compact data within cache lines of a cache.
FIG. 9 is a process flow diagram illustrating an aspect method for a computing device to compress and compact data within cache lines of a cache.
FIG. 10 is a component block diagram of a computing device suitable for use in various aspects.
Detailed Description
The various aspects will be described in detail with reference to the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. References made to particular examples and implementations are for illustrative purposes and are not intended to limit the scope of the invention or the claims.
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any implementation described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other implementations.
The various aspects include methods that may be implemented in computing devices to improve the compactness of compressed data segments in cache memory by compacting data segments within a cache so that the data segments overlap on cache lines. By overlapping data on cache lines, small compressed data segments may share cache lines, enabling more efficient memory accesses and less frequent unneeded overfetching.
The term "computing device" is used herein to refer to any one or all of cellular telephones, smartphones, web-pads, tablet computers, Internet-enabled cellular telephones, WiFi-enabled electronic devices, personal digital assistants (PDAs), laptop computers, desktop computers (or personal computers), servers, and similar electronic devices equipped with at least a processor. Computing devices may utilize various architectures to enable their processors to execute software instructions, and may include one or more memory units, such as random access memory units (e.g., DDR RAM) and cache units (e.g., L2 cache, L3 cache).
The term "source data segment" is used herein to refer to an uncompressed data segment that may be stored in a memory or cache unit of a computing device. In various aspects, a source data segment may be 256 bytes in size. Further, a source data segment may be associated with an exclusive range of the physical address space that may be stored in a cache, the exclusive range being the same size as the source data segment (e.g., the size of an uncompressed tile or data block). For example, the exclusive range of a source data segment may be 256 bytes. Source data segments may be stored in more than one cache line, and thus their exclusive ranges may extend to cover (or partially cover) multiple cache lines. The term "base address" is used herein to refer to a physical address or virtual address that represents the start of a source data segment's exclusive range in the cache. The term "offset address" is used herein to refer to a physical address or virtual address that is an offset of a base address (physical or virtual).
The term "compressed data segment" is used herein to refer to a data segment that has been reduced in size by a computing device executing a conventional data compression algorithm. Compressed data segments may be smaller in size (e.g., in bytes) than their respective source data segments. For example, a compressed data segment may be 64 bytes whereas its corresponding source data segment may be 256 bytes.
Compression techniques are often used to improve the performance of computing devices by reducing the amount of data transferred between memory units and cache units. However, regardless of their smaller sizes, compressed data segments may still be associated with or aligned to particular addresses (e.g., physical addresses) of a cache, often resulting in suboptimal use of cache lines. For example, when using 128-byte cache lines and compressing 256-byte data segments into compressed data segments of 64 bytes (i.e., using a 4:1 compression ratio) or 192 bytes (i.e., using a 4:3 compression ratio), 64 bytes of a cache line may be wasted due to overfetching. When the minimum compression size of compressed data segments (e.g., 64 bytes) is smaller than the cache line size (e.g., 128 bytes), data loaded into the cache via conventional techniques includes wasted data, which increases bandwidth use and reduces the benefit of compression. For example, compressed data segments of 64-byte and 192-byte sizes may comprise a significant portion of cache transactions for some workloads (e.g., user interface (UI) workloads), and thus may lead to inefficient use of cache space and unnecessary overfetching. In many cases, such as scenarios involving compression of UI workloads, conventional techniques for placing compressed data in cache lines may be especially inefficient when handling data segments compressed with 4:1 or 4:3 compression ratios.
To further illustrate the potential inefficiency of conventional techniques, FIG. 1 shows a diagram 101 illustrating problematic placements of compressed data segments in cache lines of a computing device implementing such conventional compression techniques known in the art. An exemplary cache may include cache lines 102a-102d that are consecutively aligned and that each have a fixed size, such as 128 bytes. A first row 110 of the diagram 101 shows a first source data segment 112 (or tile or data block) and a second source data segment 114 that may be stored in the cache. The first source data segment 112 may correspond to the first cache line 102a and the second cache line 102b (i.e., the first source data segment 112 may be stored in both cache lines 102a-102b). Similarly, the second source data segment 114 may correspond to the third cache line 102c and the fourth cache line 102d (i.e., the second source data segment 114 may be stored in both cache lines 102c-102d).
A computing device configured to execute various conventional compression algorithms may compress the source data segments 112, 114 using different compression ratios, such as a 4:1 compression ratio (e.g., compressing a 256-byte data segment into a 64-byte data segment), a 4:2 compression ratio (e.g., compressing a 256-byte data segment into a 128-byte data segment), a 4:3 compression ratio (e.g., compressing a 256-byte data segment into a 192-byte data segment), and a 4:4 compression ratio (e.g., compressing a 256-byte data segment into a 256-byte data segment, or no compression).
For example, a second row 120 of the diagram 101 shows a first compression of the source data segments 112, 114. In particular, the first source data segment 112 may be compressed by the computing device into a first compressed data segment 122 that is 192 bytes in size (i.e., using a 4:3 compression ratio). The first compressed data segment 122 may remain aligned with the first source data segment 112 (i.e., the start of the first compressed data segment 122 may be stored in the first cache line 102a with no offset). In the second row 120, for the second source data segment 114, the computing device may use a 4:4 compression ratio (or no compression), and thus the second source data segment 114 may remain 256 bytes in size. Source data segments may be compressed selectively, such as when the compression algorithm used by the computing device is not capable of compressing a particular data segment or when the computing device is otherwise configured not to compress a data segment.
The first compressed data segment 122 may still correspond to (or be aligned with) the first cache line 102a and the second cache line 102b; however, due to its smaller size (i.e., less than the original 256 bytes), the first compressed data segment 122 may utilize only a portion (or half) of the second cache line 102b. In other words, the 192-byte first compressed data segment 122 may cause the 128-byte first cache line 102a to be fully used and the 128-byte second cache line 102b to be only half used. Because the second cache line 102b is only half used, a first 64-byte overfetch data segment 123 of unused data (labeled "OF" in FIG. 1) may also be present in the second cache line 102b. For example, the first overfetch data segment 123 may be loaded into the second cache line 102b when the first compressed data segment 122 is read from DDR or other memory.
As another example, a third row 130 of the diagram 101 shows a second compression of the source data segments 112, 114. In particular, the first source data segment 112 may be compressed by the computing device into a second compressed data segment 132 that is 64 bytes in size (i.e., using a 4:1 compression ratio). The second compressed data segment 132 may remain aligned with the first source data segment 112 (i.e., the start of the second compressed data segment 132 may be stored in the first cache line 102a with no offset). The second compressed data segment 132 may not utilize the second cache line 102b, and thus the second cache line 102b may be filled with two 64-byte unused data segments 139 (e.g., invalid, null, or otherwise unused data segments). Alternatively, the second cache line 102b may be filled with one 128-byte invalid data segment. Because the 64-byte second compressed data segment 132 fills only half of the 128-byte first cache line 102a, a second 64-byte overfetch data segment 133 of overfetched/unused data may also be present in the first cache line 102a.
Similar to the second compressed data segment 132 in the third row 130, the second source data segment 114 may be compressed by the computing device into a third compressed data segment 134 that is 64 bytes in size (i.e., using a 4:1 compression ratio). The third compressed data segment 134 may remain aligned with the second source data segment 114 (i.e., the start of the third compressed data segment 134 may be stored in the third cache line 102c with no offset). The third compressed data segment 134 may not utilize the fourth cache line 102d, and thus the fourth cache line 102d may be filled with two 64-byte unused data segments 139 (or alternatively one 128-byte unused data segment). However, because the 64-byte third compressed data segment 134 fills only half of the 128-byte third cache line 102c, a third 64-byte overfetch data segment 135 of overfetched/unused data may also be present in the third cache line 102c.
As another example, a fourth row 140 of the diagram 101 shows a third compression of the source data segments 112, 114. In particular, the first source data segment 112 may be compressed by the computing device into a fourth compressed data segment 142 that is 128 bytes in size (i.e., using a 4:2 compression ratio). The fourth compressed data segment 142 may remain aligned with the first source data segment 112 (i.e., the start of the fourth compressed data segment 142 may be stored in the first cache line 102a with no offset). The fourth compressed data segment 142 may not utilize the second cache line 102b, and thus the second cache line 102b may be filled with one 128-byte unused data segment 149 (or alternatively two 64-byte unused data segments).
The second source data segment 114 may be compressed by the computing device into a fifth compressed data segment 144 that is 192 bytes in size (i.e., using a 4:3 compression ratio). The fifth compressed data segment 144 may remain aligned with the second source data segment 114 (i.e., the start of the fifth compressed data segment 144 may be stored in the third cache line 102c with no offset). The fifth compressed data segment 144 may only partially utilize the fourth cache line 102d, and thus the fourth cache line 102d may be filled with a fourth 64-byte overfetch data segment 145.
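The three conventional placements walked through above (rows 120, 130 and 140 of FIG. 1) can be reproduced with a short sketch. It assumes four 128-byte cache lines, two 256-byte address ranges, and compressed segments stored at the start of their own range with no offset; the function and variable names are hypothetical.

```python
LINE_SIZE = 128   # bytes per cache line (102a-102d in the text)
RANGE_SIZE = 256  # bytes reserved for each source data segment


def aligned_occupancy(segment_sizes):
    """Useful bytes per cache line when segment i starts at i * RANGE_SIZE.

    Assumes every segment size is at most RANGE_SIZE.
    """
    lines_per_range = RANGE_SIZE // LINE_SIZE
    used = [0] * (len(segment_sizes) * lines_per_range)
    for i, size in enumerate(segment_sizes):
        line = i * lines_per_range
        remaining = size
        while remaining > 0:
            chunk = min(remaining, LINE_SIZE)
            used[line] = chunk
            remaining -= chunk
            line += 1
    return used


# Compressed sizes of segments 112 and 114 in each row of FIG. 1.
rows = {120: (192, 256), 130: (64, 64), 140: (128, 192)}
for row, segment_sizes in rows.items():
    print(row, aligned_occupancy(segment_sizes))
```

Any line whose count is 64 carries a 64-byte overfetch segment, and any line whose count is 0 holds only unused data, which is the waste the placements above describe.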
To improve upon the inefficient placements of compressed data segments by the conventional techniques shown in FIG. 1, the various aspects include methods, devices, and non-transitory process-readable storage media for compacting data segments within a cache so that the data segments overlap on cache lines. In other words, a data segment may be placed within the address range (e.g., physical address range) of the cache that is typically associated with another data segment instead of within its own normal address range, resulting in more efficient use of cache lines (i.e., fewer cache lines are required to store the same amount of data). With such overlapping, small compressed data segments may share cache lines, enabling more efficient memory accesses (e.g., DDR MAL) and less frequent unneeded overfetching.
In various aspects, a computing device may be configured to perform cache line compaction operations that shift (or offset) the placement of a data segment into an exclusive range (or address space) of the cache typically associated with another data segment. For example, the computing device may shift two consecutive exclusive ranges so that each extends into half of a single cache line. To achieve such compaction, the computing device may identify a base offset to be applied to the base address of a first data segment (or data block) so that the first data segment is stored within a cache line (e.g., within a physical address) typically associated with a second data segment. The base offset may be determined by the computing device based at least on the data size of the data segment. The data size may be identified based on the compression ratio of the data segment (i.e., the ratio of the uncompressed size to the compressed size). For example, the base offset for a data segment may be a first amount (in bytes) from the base address when the data segment is compressed at a first compression ratio (e.g., 4:4), and a second amount from the base address when the data segment is compressed at a second compression ratio (e.g., 4:1). Such compression ratios may be identified by the computing device based on the various compression algorithms it uses to compress data segments.
In addition to the data size (e.g., the compression ratio), the computing device may determine the base offset based on the base address itself. For example, a data segment compressed at a certain compression level (e.g., 4:1) may have a first offset (e.g., 64 bytes) when a certain bit (e.g., bit 8) of the data segment's base address is odd (i.e., '1'), and a second offset (e.g., 256 bytes) when that bit is even (i.e., '0'). In some aspects, the value of bit 8 of a base address may change with every 256 bytes of base address. For example, bit 8 may have a '0' value for a first address in the cache, but a '1' value for a second address that is 256 bytes away from the first address.
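The behavior of bit 8 follows directly from binary addressing, since bit 8 has a weight of 256. A minimal sketch, with a hypothetical function name, assuming bits are numbered from 0 at the least significant end:

```python
def base_address_parity(base_address: int, bit: int = 8) -> int:
    """Return the value (0 = even, 1 = odd) of one bit of the base address."""
    return (base_address >> bit) & 1


# Base addresses of consecutive 256-byte ranges alternate between 0 and 1.
parities = [base_address_parity(addr) for addr in (0x000, 0x100, 0x200, 0x300)]
print(parities)
```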
In some aspects, the base offset of a compressed data segment may be smaller than the 4 KB page size. Thus, the computing device may perform the aspect compaction operations or techniques for any virtual or physical addresses of various memory or cache units.
In some aspects, the computing device may perform a lookup on a predefined data table, using the compression ratio and information about the base address (e.g., the physical address in the cache), to identify the corresponding base offset. However, in some aspects, rather than using a predefined data table, the computing device may identify offset values for compacting data segments in the cache using logic, software, circuitry, and other functionality. For example, the computing device may identify offset values and calculate offset addresses via the processor of the computing device executing software (e.g., operations to access a stored data table) and/or via a dedicated circuit coupled to the processor of the computing device.
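A table lookup of this kind might look like the sketch below. It is not the table of FIG. 3, whose contents are not reproduced in this section: the only entries are the two illustrative values the text gives for a 4:1 ratio (64 bytes when bit 8 is odd, 256 bytes when it is even), and every other ratio is deliberately left out. The names are hypothetical, and the sketch assumes the base offset is added to the base address.

```python
# Keys are (compression ratio, parity of bit 8); values are base offsets in
# bytes. Only the two example values stated in the text are included.
BASE_OFFSETS = {
    ("4:1", 1): 64,
    ("4:1", 0): 256,
}


def offset_address(base_address: int, compression_ratio: str) -> int:
    """Look up the base offset and apply it to the base address."""
    parity = (base_address >> 8) & 1  # bit 8 of the base address
    key = (compression_ratio, parity)
    if key not in BASE_OFFSETS:
        raise LookupError(f"no base offset known for {key}")
    return base_address + BASE_OFFSETS[key]


print(hex(offset_address(0x100, "4:1")))  # odd parity, offset of 64 bytes
print(hex(offset_address(0x000, "4:1")))  # even parity, offset of 256 bytes
```

Because the result depends only on the segment's own ratio and base address, the lookup needs no information about neighboring segments.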
In some aspects, the computing device may offset all exclusive ranges from their base addresses by half a cache line (equal to the minimum compression size). In this way, a compressed data segment that does not fill a cache line may be offset into a cache line shared with the exclusive range of another source data segment, enabling opportunistic use of the whole cache line (i.e., consecutive compressions may be mapped to a shared cache line).
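The effect of shifting every 256-byte range by half a cache line can be checked with address arithmetic alone. The sketch below assumes 128-byte cache lines numbered from 0 and a positive 64-byte shift; the helper names are hypothetical.

```python
LINE_SIZE = 128
RANGE_SIZE = 256
HALF_LINE = LINE_SIZE // 2  # 64 bytes, equal to the minimum compression size


def shifted_range(index: int):
    """Start and end (exclusive) of the index-th range after the shift."""
    start = index * RANGE_SIZE + HALF_LINE
    return start, start + RANGE_SIZE


def lines_covered(start: int, end: int):
    """Indices of the cache lines touched by the byte range [start, end)."""
    return list(range(start // LINE_SIZE, (end - 1) // LINE_SIZE + 1))


first = lines_covered(*shifted_range(0))
second = lines_covered(*shifted_range(1))
shared = sorted(set(first) & set(second))
print(first, second, shared)
```

Without the shift, the two ranges would cover disjoint pairs of lines. With it, the last 64 bytes of the first range and the first 64 bytes of the second fall in the same cache line.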
The aspect techniques may be used by the computing device to compact data segments independently of other data segments in a buffer. For example, regardless of the compression ratios used for neighboring data segments, a particular data segment may be offset within cache lines based only on its own compression ratio and base address. In other words, to compact a data segment, the computing device may not need to know the compression ratios and/or offsets of other data segments.
The following is an exemplary application of an aspect method performed by a computing device. The
computing device can be configured to perform a read from a memory unit (for example, DDR) to
retrieve an uncompressed data segment (that is, a source data segment). The retrieved uncompressed
data segment is generally associated with a base address in the cache (for example, a physical
address). The computing device can evaluate the retrieved data segment to identify the compression
level (or ratio) that a compression algorithm can apply. The identified compression ratio can be the
maximum possible for the data segment so as to maximize use of cache space. The computing device can
compress the data segment at that compression ratio. Based on the compression ratio and the base
address associated with the uncompressed data segment, the computing device can identify an offset
within the cache at which to store the compressed data segment, for example, 64 bytes from the base
address. The computing device can then load the compressed data segment at the offset address.
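The ratio-selection step of this walkthrough can be sketched as follows. This is only an
illustration under stated assumptions, not the patented method: it assumes 256-byte source segments
that are stored in 64-byte units (matching the 4:1 through 4:4 ratios used in this description), and
the name pick_compression_ratio is hypothetical.

```python
SEGMENT_BYTES = 256  # size of an uncompressed source data segment
GRANULE_BYTES = 64   # assumed storage unit (half of a 128-byte cache line)


def pick_compression_ratio(compressed_len):
    """Return n for the tightest 4:n ratio whose n * 64 bytes hold the data."""
    if not 0 < compressed_len <= SEGMENT_BYTES:
        raise ValueError("compressed length must be between 1 and 256 bytes")
    # Ceiling division: round the compressed length up to whole 64-byte units.
    return -(-compressed_len // GRANULE_BYTES)


if __name__ == "__main__":
    for length in (40, 64, 130, 256):
        n = pick_compression_ratio(length)
        print(f"{length} bytes -> 4:{n} ({n * GRANULE_BYTES} bytes stored)")
```

Picking the smallest n that fits corresponds to choosing the maximum possible compression ratio for
the segment, as the walkthrough describes.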
Various aspects can help reduce the unnecessary bandwidth usage associated with over-fetching in
various types of computing devices for which improvements in power and/or memory bandwidth (for
example, DDR) usage are desirable. For example, the aspect operations can be used by mobile devices
(for example, smartphones) that use a system on chip. In addition, various aspects can be beneficial
by enabling prefetching for an L3 cache or other similar memory or storage elements to improve the
performance of the computing device. Specifically, a data segment and an adjacent data segment (or
adjacent segments) stored in a shared cache line can be obtained at the same time, reducing reads.
For example, the computing device can fetch a first 64-byte compressed data segment together with an
adjacent second 64-byte compressed data segment.
The aspect cache line compaction operations described herein can be performed by the computing
device to reduce over-fetching for various workloads, including UI workloads. The aspect operations
can be particularly useful for reducing over-fetching for workloads that include data segments
compressed at 4:1 and 4:3 compression ratios (which occur in contiguous blocks) and data segments
compressed at 4:1 or 4:2.
Conventional techniques include operations for configuring a cache to support compressed and
uncompressed lines and operations for determining whether to compress data. For example,
conventional techniques can include methods that compress two adjacent uncompressed lines into a
single element merely as a typical effect of the compression operation (for example, two lines that
are sequentially adjacent in the physical address space can be compressed to fit in one line).
Unlike conventional techniques, the various aspects do not simply address compressing data or
storing compressed data in a cache. Instead, the aspect techniques provide functionality that can be
used to modify such conventional compression techniques so that conventionally compressed data can
be better placed within cache lines. In other words, the aspect techniques can be used by the
computing device in a processing stage after the data has been compressed, so that post-compression
outputs (that is, compressed data segments) that would normally be placed in different cache lines
are packed into the cache more efficiently. For example, a computing device performing the aspect
operations can take two different compressed data segments that would otherwise be loaded into two
different, partially used cache lines and, by applying offset shifts, store both compressed data
segments in a single cache line. In other words, the aspects shift compressed data segments within
the address space of the cache (for example, the physical address space), changing the base
addresses of the compressed data segments to improve cache usage.
For simplicity, the descriptions of the various aspects may refer to operations for compacting
compressed data segments in a cache. However, the aspect operations can be applied by a computing
device to compact any type of data (for example, compressed or uncompressed) in any storage element
or architecture (for example, SRAM, cache, and so on). In other words, the various aspects are not
intended to be limited to use with data compression schemes. For example, the computing device can
perform the aspect methods described herein to improve the compaction of any data set that includes
data segments of varying lengths, or any data set with holes (for example, unused data segments) in
a buffer. Although the base address may be referred to herein as a physical address, it should be
appreciated that in some aspects the base address can be a virtual address.
In various aspects, the computing device can use contiguous (that is, adjacently stored) cache lines
(and therefore data associated with or stored in those cache lines). In some aspects, however, the
computing device can be configured to store or access data segments at non-contiguous addresses in
the cache. In other words, the aspect techniques can compact data segments stored in cache lines
that may or may not be at different locations in memory.
For simplicity, a computing device performing the aspect methods may be described as identifying or
otherwise using the compression ratio associated with a data segment to calculate the offset
address. However, it should be appreciated that a computing device performing these aspects can
identify or otherwise use any form of indication of the data size of such data segments in order to
compact the data segments according to the various aspects. In other words, the various aspects can
be used to compact data segments that may or may not be compressed. A computing device performing
the aspect techniques can identify the data size of a data segment based on various compression
ratios and data size distributions, and therefore the example compression ratios referred to in the
descriptions of the various aspects are not intended to limit the scope of the claims to 4:1, 4:2,
4:3, and 4:4 compression ratios.
Fig. 2 shows placements of data segments in cache lines 220-227 according to aspect cache line
compaction operations performed by a computing device. The computing device can be configured to
compress each data segment using a conventional compression algorithm. Specifically, Fig. 2 shows an
example in which the computing device can use a compression algorithm to compress the 256-byte
source data segments 200a-206a into 64-byte compressed segments at a 4:1 compression ratio, 128-byte
compressed segments at a 4:2 compression ratio, 192-byte compressed segments at a 4:3 compression
ratio, or 256-byte compressed segments at a 4:4 compression ratio. For clarity in the following
description of Fig. 2, the cache line mapping is illustrated with vertical lines that indicate the
fixed size (for example, 128 bytes) of the consecutively aligned cache lines 220-227, and dashed
line 270 can indicate a 1 kB interleave.
Cache lines 220-227 can be associated with physical addresses that increase from left to right (that
is, the physical address of the first cache line 220 is a smaller number than the physical address
of the second cache line 221, and so on). In some aspects, the physical addresses of cache lines
220-227 can be represented with at least 8 bits. Fig. 2 indicates bit 8 and bit 7 of the physical
address associated with each cache line 220-227. For example, bit 8 of the first cache line 220 can
be '0' and bit 7 of the first cache line 220 can be '0'; bit 8 of the second cache line 221 can be
'0' and bit 7 of the second cache line 221 can be '1'; bit 8 of the third cache line 222 can be '1'
and bit 7 of the third cache line 222 can be '0'; and so on.
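The bit pattern described for Fig. 2 can be reproduced with a short sketch. It assumes, purely for
illustration, that the first cache line 220 starts at address 0 and that each cache line is 128
bytes; the helper name address_bits is hypothetical.

```python
LINE_BYTES = 128        # cache line size used in Fig. 2
FIRST_LINE_LABEL = 220  # reference numeral of the first cache line


def address_bits(addr):
    """Return (bit 8, bit 7) of an address, counting bits from 0."""
    return (addr >> 8) & 1, (addr >> 7) & 1


if __name__ == "__main__":
    for i in range(8):
        bit8, bit7 = address_bits(i * LINE_BYTES)
        print(f"cache line {FIRST_LINE_LABEL + i}: bit 8 = {bit8}, bit 7 = {bit7}")
```

Bit 7 toggles with every 128-byte cache line, while bit 8 toggles with every 256-byte source
segment, which is why bit 8 can serve as the odd/even indicator for a segment's base address.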
The first row 228 of diagram 200 shows the default placement of the uncompressed source data
segments 200a-206a within 256-byte segment boundaries 210-216 when stored in the cache according to
conventional techniques. In other words, the source data segments 200a-206a can be stored at their
base addresses at 256-byte intervals. Specifically, the first source data segment 200a (referred to
as 'A' in Fig. 2) can be placed without offset in the first cache line 220 and the second cache line
221. The second source data segment 202a (referred to as 'B' in Fig. 2) can be placed without offset
in the third cache line 222 and the fourth cache line 223. The third source data segment 204a
(referred to as 'C' in Fig. 2) can be placed without offset in the fifth cache line 224 and the
sixth cache line 225. The fourth source data segment 206a (referred to as 'D' in Fig. 2) can be
placed without offset in the seventh cache line 226 and the eighth cache line 227.
Rows 229, 230, 240, 250, and 260 of diagram 200 show offset placements, overlapping the base
addresses, of the source data segments 200a-206a in compressed form according to the aspect
techniques. In some aspects, the base offset used by the computing device to place each compressed
data segment can be based on a lookup performed on a predefined data table, as described below with
reference to Fig. 3.
The second row 229 of diagram 200 shows a simple offset placement in which the 256-byte compressed
data segments 200b-206b are each offset 64 bytes from their respective base addresses. This mapping
may require an additional 64 bytes to be allocated at the end of every compressed data buffer that
uses this aspect. The first compressed data segment 200b, which corresponds to the first source data
segment 200a (that is, 'A'), can be placed by the computing device at a 64-byte offset in half of
the first cache line 220, the entire second cache line 221, and half of the third cache line 222.
The first compressed data segment 200b can overlap into the third cache line 222 by a 64-byte
overlap 280. The second compressed data segment 202b, which corresponds to the second source data
segment 202a (that is, 'B'), can be placed at a 64-byte offset in half of the third cache line 222,
the entire fourth cache line 223, and half of the fifth cache line 224. The second compressed data
segment 202b can overlap into the fifth cache line 224 by a 64-byte overlap 282. The third
compressed data segment 204b, which corresponds to the third source data segment 204a (that is,
'C'), can be placed at a 64-byte offset in half of the fifth cache line 224, the entire sixth cache
line 225, and half of the seventh cache line 226. The third compressed data segment 204b can overlap
into the seventh cache line 226 by a 64-byte overlap 284. The fourth compressed data segment 206b,
which corresponds to the fourth source data segment 206a (that is, 'D'), can be placed at a 64-byte
offset in half of the seventh cache line 226, the entire eighth cache line 227, and half of another
cache line (not shown). The fourth compressed data segment 206b can cross the interleave and can
overlap into the other cache line (not shown) by a 64-byte overlap 286. The compressed data segments
200b-206b can be compressed by the computing device at a 4:4 compression ratio, or alternatively not
compressed at all and only shifted by the base offset.
The third row 230 of diagram 200 shows offset placements of 64-byte data segments compressed by the
computing device at a 4:1 compression ratio. Specifically, the first 64-byte compressed data segment
231, which corresponds to the first source data segment 200a (that is, 'A'), can be placed by the
computing device at a 256-byte offset in half of the third cache line 222. The second 64-byte
compressed data segment 232, which corresponds to the second source data segment 202a (that is,
'B'), can be placed at a 64-byte offset in the other half of the third cache line 222. In other
words, the first 64-byte compressed data segment 231 and the second 64-byte compressed data segment
232 can share the third cache line 222, even though cache line 222 is normally associated only with
the second source data segment 202a (that is, 'B'). Because of the 256-byte and 64-byte offset
placements of the first 64-byte compressed data segment 231 and the second 64-byte compressed data
segment 232, respectively, the first cache line 220, the second cache line 221, and the fourth cache
line 223 can contain unused data 299. In other words, the computing device may not need to fetch any
data for these cache lines 220, 221, 223. These unused cache lines (that is, the cache lines 220,
221, 223 associated with unused data 299) can be free so that the cache can allocate them to other
requests.
Still referring to the third row 230 of diagram 200, the third 64-byte compressed data segment 234,
which corresponds to the third source data segment 204a (that is, 'C'), can be placed by the
computing device at a 256-byte offset in half of the seventh cache line 226. The 192-byte compressed
data segment 236, which corresponds to the fourth source data segment 206a (that is, 'D'), can be
placed at a 64-byte offset in the other half of the seventh cache line 226 and the entire eighth
cache line 227. The 192-byte compressed data segment 236 may have been compressed by the computing
device at a 4:3 compression ratio. The third 64-byte compressed data segment 234 and the 192-byte
compressed data segment 236 can share the seventh cache line 226, even though cache line 226 is
normally associated only with the fourth source data segment 206a (that is, 'D'). Because of the
256-byte and 64-byte offset placements of the third 64-byte compressed data segment 234 and the
192-byte compressed data segment 236, respectively, the fifth cache line 224 and the sixth cache
line 225 can contain unused data 299.
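The sharing described for the third row 230 can be checked with a small sketch. It assumes that the
first cache line 220 starts at address 0 and that segments 'A' to 'D' have base addresses 0, 256,
512, and 768; the offsets and sizes are the ones stated above, while lines_touched and ROW_230 are
hypothetical names.

```python
LINE_BYTES = 128
FIRST_LINE_LABEL = 220


def lines_touched(base, offset, size):
    """Return the indices of the 128-byte cache lines covered by a placed segment."""
    start = base + offset
    first = start // LINE_BYTES
    last = (start + size - 1) // LINE_BYTES
    return list(range(first, last + 1))


# (base address, offset, compressed size) for segments A-D in row 230
ROW_230 = {"A": (0, 256, 64), "B": (256, 64, 64), "C": (512, 256, 64), "D": (768, 64, 192)}

used = {name: [FIRST_LINE_LABEL + i for i in lines_touched(*placement)]
        for name, placement in ROW_230.items()}
occupied = {line for lines in used.values() for line in lines}
unused = sorted(set(range(FIRST_LINE_LABEL, FIRST_LINE_LABEL + 8)) - occupied)

if __name__ == "__main__":
    print(used)
    print("unused cache lines:", unused)
```

Under these assumptions, 'A' and 'B' both land in cache line 222, 'C' and 'D' share cache line 226,
and five of the eight cache lines need no fetch at all.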
The fourth row 240 of diagram 200 shows offset placements of 128-byte data segments compressed by
the computing device at a 4:2 compression ratio. Specifically, the first 128-byte compressed data
segment 241, which corresponds to the first source data segment 200a (that is, 'A'), can be placed
by the computing device at a 128-byte offset in the second cache line 221. The second 128-byte
compressed data segment 242, which corresponds to the second source data segment 202a (that is,
'B'), can be placed at a 128-byte offset in the fourth cache line 223. In this way, the first
128-byte compressed data segment 241 and the second 128-byte compressed data segment 242 can remain
stored within their typical segment boundaries 210, 212 but can be shifted so that the first cache
line 220 and the third cache line 222 contain unused data 299. In other words, because of the
128-byte offsets, the computing device may not need to fetch any data for these cache lines 220,
222.
Still referring to the fourth row 240 of diagram 200, the third 128-byte compressed data segment
244, which corresponds to the third source data segment 204a (that is, 'C'), can be placed by the
computing device at a 128-byte offset in the sixth cache line 225. The 64-byte compressed data
segment 246, which corresponds to the fourth source data segment 206a (that is, 'D'), can be placed
at a 64-byte offset in the seventh cache line 226. In this way, the third 128-byte compressed data
segment 244 and the 64-byte compressed data segment 246 can remain stored within their typical
segment boundaries 214, 216 but can be shifted so that the fifth cache line 224 and the eighth cache
line 227 contain unused data 299. However, because the 64-byte compressed data segment 246 does not
completely fill the seventh cache line 226 and the third 128-byte compressed data segment 244 is not
shifted into the fourth segment boundary 216, half of the seventh cache line 226 can be filled with
over-fetch data 291.
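The amount of over-fetch in the fourth row 240 can be estimated with a short sketch. It assumes
whole 128-byte cache lines are always fetched and that segments 'A' to 'D' have base addresses 0,
256, 512, and 768, so the start addresses below are base plus the stated offset; overfetch_bytes and
ROW_240 are hypothetical names.

```python
LINE_BYTES = 128

# (start address, size): A, B, C are 128-byte segments at a 128-byte offset,
# D is a 64-byte segment at a 64-byte offset from its base address of 768.
ROW_240 = [(128, 128), (384, 128), (640, 128), (832, 64)]


def overfetch_bytes(placements):
    """Return bytes fetched in whole cache lines minus bytes actually occupied."""
    occupied = set()
    for start, size in placements:
        occupied.update(range(start, start + size))
    fetched_lines = {byte // LINE_BYTES for byte in occupied}
    return len(fetched_lines) * LINE_BYTES - len(occupied)


if __name__ == "__main__":
    print("over-fetched bytes in row 240:", overfetch_bytes(ROW_240))
```

Under these assumptions only the seventh cache line is partly filled, so the over-fetch is the
64-byte half of that line that segment 'D' leaves empty.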
The fifth row 250 of diagram 200 shows offset placements of 192-byte data segments compressed by the
computing device at a 4:3 compression ratio. Specifically, the first 192-byte compressed data
segment 251, which corresponds to the first source data segment 200a (that is, 'A'), can be placed
by the computing device at a 128-byte offset in the second cache line 221 and half of the third
cache line 222. The second 192-byte compressed data segment 252, which corresponds to the second
source data segment 202a (that is, 'B'), can be placed at a 64-byte offset in half of the third
cache line 222 and the fourth cache line 223. In this way, the first 192-byte compressed data
segment 251 and the second 192-byte compressed data segment 252 can be shifted to share the third
cache line 222, so that the first cache line 220 can contain unused data 299. In other words,
because of the offsets applied to the 192-byte compressed data segments, the computing device may
not need to fetch any data for the first cache line 220.
Still referring to the fifth row 250 of diagram 200, the 64-byte compressed data segment 254, which
corresponds to the third source data segment 204a (that is, 'C'), can be placed by the computing
device at a 256-byte offset in the seventh cache line 226. The 256-byte compressed data segment 256,
which corresponds to the fourth source data segment 206a (that is, 'D'), can be placed at a 64-byte
offset in half of the seventh cache line 226, the entire eighth cache line 227, and half of another
cache line (not shown). In this way, the fifth cache line 224 and the sixth cache line 225 can
contain unused data 299. However, the 256-byte compressed data segment 256 can cross the interleave,
and the computing device may therefore need to use separate transactions.
The sixth row 260 of diagram 200 shows offset placements of 256-byte data segments compressed by the
computing device at a 4:4 compression ratio. Specifically, the first 256-byte compressed data
segment 261, which corresponds to the first source data segment 200a (that is, 'A'), can be placed
by the computing device at a 64-byte offset in half of the first cache line 220, the entire second
cache line 221, and half of the third cache line 222. The second 256-byte compressed data segment
262, which corresponds to the second source data segment 202a (that is, 'B'), can be placed at a
64-byte offset in half of the third cache line 222, the entire fourth cache line 223, and half of
the fifth cache line 224. In addition, the third 256-byte compressed data segment 264, which
corresponds to the third source data segment 204a (that is, 'C'), can be placed at a 64-byte offset
in half of the fifth cache line 224, the entire sixth cache line 225, and half of the seventh cache
line 226. The 64-byte compressed data segment 266, which corresponds to the fourth source data
segment 206a (that is, 'D'), can be placed at a 64-byte offset in half of the seventh cache line
226. In this way, the eighth cache line 227 can contain unused data 299. However, because the first
256-byte compressed data segment 261 does not completely fill the first cache line 220, half of the
first cache line 220 can be filled with over-fetch data 290.
In some aspects, the computing device can split a data segment into separate transactions at a
boundary, to mitigate cases in which the data segment is mapped across an interleave and/or page
boundary. For example, as described below with reference to Fig. 5, when the computing device places
a 256-byte compressed data segment (for example, a segment compressed at a 4:3 compression ratio)
across an interleave and/or page boundary, two separate transactions may be needed.
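Splitting a placed segment at a boundary can be sketched as below. This is an illustrative
assumption about how such a split might be computed, using the 1 kB interleave indicated in Fig. 2
and a segment no larger than one interleave; split_at_boundary is a hypothetical name.

```python
INTERLEAVE_BYTES = 1024  # 1 kB interleave, as indicated by dashed line 270 in Fig. 2


def split_at_boundary(start, size, boundary=INTERLEAVE_BYTES):
    """Split [start, start + size) into one or two (start, size) transactions.

    Assumes size <= boundary, so at most one boundary can be crossed.
    """
    end = start + size
    next_boundary = (start // boundary + 1) * boundary
    if end <= next_boundary:
        return [(start, size)]
    return [(start, next_boundary - start), (next_boundary, end - next_boundary)]


if __name__ == "__main__":
    # Segment 256 of row 250: base address 768 (assumed), 64-byte offset, 256 bytes.
    print(split_at_boundary(768 + 64, 256))
```

With these assumed addresses, the 256-byte segment placed at address 832 is issued as one 192-byte
transaction up to the interleave and one 64-byte transaction after it.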
Fig. 3 shows a data table 300 suitable for use in some aspects, which includes offset values
corresponding to compression ratios and base address information (that is, base address parity
values). A computing device implementing the aspect compaction techniques can use the compression
ratio information 310 and base address parity information 302 of a particular compressed data
segment to perform a lookup on data table 300 and identify the size of the base offset that should
be used to place (or access) the data segment in the cache. The base address parity information 302
can be an indication of whether bit 8 of the base address (for example, physical address) of the
data segment in the cache is odd (that is, a '1' value) or even (that is, a '0' value). In some
aspects, data table 300 can be a two-dimensional (2D) array stored in memory of the computing
device. In some aspects, both hardware and software can be used to implement the techniques, for
example with the computing device using a table or logic.
For example, in response to performing a lookup on data table 300 for a data segment compressed at a
4:1 compression ratio and having a base address with an even bit 8, the computing device can
identify a 256-byte base offset. In response to performing a lookup on data table 300 for a data
segment compressed at a 4:1 compression ratio and having a base address with an odd bit 8, the
computing device can identify a 64-byte base offset. In response to performing a lookup on data
table 300 for a data segment compressed at a 4:2 compression ratio and having a base address with an
even or odd bit 8, the computing device can identify a 128-byte base offset. In response to
performing a lookup on data table 300 for a data segment compressed at a 4:3 compression ratio and
having a base address with an even bit 8, the computing device can identify a 128-byte base offset.
In response to performing a lookup on data table 300 for a data segment compressed at a 4:3
compression ratio and having a base address with an odd bit 8, the computing device can identify a
64-byte base offset. In response to performing a lookup on data table 300 for a data segment
compressed at a 4:4 compression ratio and having a base address with an even or odd bit 8, the
computing device can identify a 64-byte base offset.
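The lookups listed above can be written as a two-dimensional array, as a minimal sketch. The values
are the ones stated for data table 300; the row and column layout and the names OFFSET_TABLE and
lookup_offset are illustrative assumptions.

```python
# Rows: compression ratio 4:1, 4:2, 4:3, 4:4.
# Columns: bit 8 of the base address is even (0) or odd (1).
OFFSET_TABLE = [
    [256, 64],   # 4:1
    [128, 128],  # 4:2
    [128, 64],   # 4:3
    [64, 64],    # 4:4
]


def lookup_offset(ratio_n, parity):
    """Return the base offset in bytes for a 4:ratio_n segment and a bit-8 parity."""
    if ratio_n not in (1, 2, 3, 4) or parity not in (0, 1):
        raise ValueError("ratio_n must be 1-4 and parity must be 0 or 1")
    return OFFSET_TABLE[ratio_n - 1][parity]


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"4:{n}: even -> {lookup_offset(n, 0)} bytes, odd -> {lookup_offset(n, 1)} bytes")
```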
Fig. 4 A show can be by computing device to identify for compressed data section in cache
The exemplary pseudo-code 400 of the function of the new address (for example, physical address) of placed offset.For example, computing device can be matched somebody with somebody
It is equipped with and instructs so as to as shown in Fig. 2 above to place data segments as described below.Shown in false code 400
Function can by computing device for example during or after packing routine (for example, the packing routine to data set) is performed, as
Using, firmware, application programming interface (API), routine and/or operation it is a part of performing.It should be recognized that, there is provided the pseudo- generation
Code 400 is in order at the purpose of general remark, and therefore is not intended to represent any specific programming language, structure or form
Change.
Pseudocode 400 (referred to as 'Pseudocode A' in Fig. 4A) can represent a function called
'getNewAddress_oddOrEven()', which may require a Base_Address input parameter and a
Compression_Ratio input parameter related to a particular data segment. In some aspects,
Base_Address can be a binary representation of a physical address or a virtual address. The function
shown in pseudocode 400 can include an instruction for the computing device to perform a right shift
operation on the Base_Address input parameter to move bit 8 of the data segment's base address to
the rightmost position for storage in a Shift variable (the right shift operation is indicated as
'>> 8' in Fig. 4A). The function shown in pseudocode 400 can include an instruction to perform a
bitwise AND operation (or '&' in Fig. 4A) on bit 8 of the address (that is, the Shift variable) and
a '1' value (for example, '...000000001'), which can generate the value stored in a Parity variable.
In other words, when the Shift variable is 0, the value of the Parity variable can be 0, and when
the Shift variable is 1, the value of the Parity variable can be 1.
The function shown in pseudocode 400 can also include an instruction for the computing device to
perform a lookup operation on a predefined table using the Compression_Ratio input parameter and the
calculated Parity variable, as described for Fig. 3. The value retrieved by the lookup can be stored
in an Offset variable. In some aspects, the lookup operation can be an operation for accessing
information stored in a two-dimensional (2D) array in memory of the computing device. Finally, the
function shown in pseudocode 400 can include an instruction for the computing device to add the
Base_Address and Offset variables to generate an offset address (for example, a new offset physical
address), which can be returned for use by the computing device in placing the data segment.
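The steps described for pseudocode 400 can be sketched in Python as follows. This is a hedged
rendering of the prose description rather than the content of Fig. 4A itself: the dictionary form of
the table, the string encoding of the compression ratio, and the Python function name are
assumptions, while the offset values are those described for Fig. 3.

```python
# Offsets in bytes, keyed by compression ratio; each tuple is (even bit 8, odd bit 8).
OFFSETS = {
    "4:1": (256, 64),
    "4:2": (128, 128),
    "4:3": (128, 64),
    "4:4": (64, 64),
}


def get_new_address_odd_or_even(base_address, compression_ratio):
    """Return the offset address at which to place a compressed data segment."""
    shift = base_address >> 8                    # move bit 8 to the rightmost position
    parity = shift & 1                           # 0 if bit 8 is even, 1 if it is odd
    offset = OFFSETS[compression_ratio][parity]  # table lookup
    return base_address + offset


if __name__ == "__main__":
    # Example call; the base address 0x100 is chosen here only for illustration.
    new_address = get_new_address_odd_or_even(0x100, "4:4")
    print(hex(new_address))
```

Because the function depends only on the segment's own base address and compression ratio, it can
be called for any segment without knowledge of how neighboring segments were compressed.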
In some aspects, the computing device can be configured to generate the new address using simplified
logic, circuitry, software instructions, and/or routines rather than the table lookup operation
indicated in pseudocode 400.
Fig. 4 B- Fig. 4 C show and can be used for calling to the function shown in the false code 400 described in Fig. 4 A
Exemplary pseudo-code and value.It should be recognized that the example values in Fig. 4 B- Fig. 4 C are in order at descriptive purpose, it is not intended that
Limit the scope of each side.
Fig. 4 B show exemplary pseudo-code 450, the exemplary pseudo-code 450 can by computing device with call as
The exemplary functions for determining the new offset address of data segments described in Fig. 4 A.Exemplary pseudo-code 450 can be wrapped
Include for arranging the instruction of the value of Base_Address and Compression_Ratio variables, wherein Base_Address and
Compression_Ratio variables may serve as the input ginseng called to getNewAddress_oddOrEven () function
Number.For example, Base_Address can be configured so that the physical address (for example, " ... 100000000 ") of data segments, wherein thing
Reason address includes at least eight.Another example is lifted, Compression_Ratio variables can be configured so that instruction computing device
The value (for example, 4 of the hierarchy compression (or compression ratio) of compressed data section is carried out using compression algorithm:4).False code 450 can be with
Including calling function getNewAddress_oddOrEven () using these |input parametes and value, and can cause to count
Calculation equipment can be stored in the return value from function in new_address (new _ address) variable, for example, realize at a high speed
Placed offset, data segments new physicses address in cache lines.
Fig. 4 C show exemplary pseudo-code 460, and the false code 460 still also includes similar to the false code 400 of Fig. 4 A
For the example values in each instruction based on the input parameter value described in above figure 4B.In other words, false code 460
It is the function equivalent scheme of the exemplary pseudo-code 400 of Fig. 4 A, which has to property function getNewAddress_ as an example
The instruction of a part of increase of the value of calculating to perform of oddOrEven ().
For example, in Fig. 4C, the function can use a Base_Address input parameter with the value
'...100000000' and a Compression_Ratio input parameter with the value '4:4'. In response to
performing the right shift operation on the Base_Address input parameter, the computing device can
generate the value '...000000010' so as to move bit 8 of the base address (for example, physical
address) of the data segment to the rightmost position for storage in the Shift variable. The
function can include an instruction to perform a bitwise AND operation on the Shift variable and a 1
value (or '...000000001'), which can generate the value stored in the Parity variable. For example,
when the Shift variable has the value '000000010', the bitwise AND operation can produce a Parity
variable value of '...000000000' (or simply 0). Using the example Parity variable value 0 and the
example Compression_Ratio value 4:4, the lookup on the predefined table described for Fig. 3 can
return an Offset variable value of 64 bytes. The function can then return the new offset address of
the data segment, which is the combination of the Offset variable value (for example, 64 bytes) and
Base_Address (for example, '...100000000').
In some aspects, the computing device can be configured to place data segments at offsets within
cache lines in a manner that does not cross page boundaries or interleaves. In other words, the
computing device can map a 64-byte data segment that would fall outside a 512-byte section (or two
256-byte sections) so that the data segment wraps around to the beginning of the 512-byte section to
avoid crossing a page boundary. To illustrate these aspects, Fig. 5 shows offset placements of data
segments compressed at various compression ratios according to another aspect cache line compaction
technique that uses such wrap-around. Diagram 500 of Fig. 5 is similar to diagram 200 described
above with reference to Fig. 2, except that the aspect offset placements shown in Fig. 5 may not
allow data segments to be placed in the cache such that they cross page or interleave boundaries
larger than 512 bytes. The aspect offset placements shown in Fig. 5 can also result in more even
cache bank usage for the placed data segments. However, compared with the aspect offset placements
shown in Fig. 2, the alternative aspect shown in Fig. 5 can cause, on average, half of the data
segments compressed at a 4:4 compression ratio to be split into two transactions. In other words,
the non-contiguous requests for placing data segments compressed at a 4:4 compression ratio can
result in more split requests than the aspect technique shown in Fig. 2. The aspect technique shown
in Fig. 5 can be beneficial for reducing most over-fetching, particularly when used with an L3
cache. In addition, simple prefetching can be implemented with or without an L3 cache. In various
aspects, 4:4 can be the least likely compression ratio, and therefore splitting a data segment into
two transactions can be a corner case.
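The wrap-around idea can be illustrated with a rough sketch. It is an assumption-laden
illustration rather than the placement rule of Fig. 5: it reuses a 64-byte offset for a 256-byte
(4:4) segment and simply wraps any part that would leave the enclosing 512-byte section back to the
start of that section; place_with_wrap is a hypothetical name.

```python
SECTION_BYTES = 512  # section within which a placed segment must stay


def place_with_wrap(base, offset, size):
    """Return the (start, size) pieces of a segment kept inside its 512-byte section."""
    section_start = base - base % SECTION_BYTES
    section_end = section_start + SECTION_BYTES
    start = section_start + (base + offset - section_start) % SECTION_BYTES
    if start + size <= section_end:
        return [(start, size)]           # fits without wrapping: one transaction
    head = section_end - start           # bytes that fit before the section end
    return [(start, head), (section_start, size - head)]  # tail wraps to the start


if __name__ == "__main__":
    print(place_with_wrap(0, 64, 256))    # segment in the first half of a section
    print(place_with_wrap(256, 64, 256))  # segment in the second half of a section
```

Under these assumptions, a 4:4 segment whose base address lies in the first half of a section stays
whole, while one in the second half is split into two pieces, which is consistent with the
statement that on average half of the 4:4 segments need two transactions.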
As shown in Fig. 5, the cache lines 520-527 may be associated with physical addresses that increase from left to right (that is, the physical address of the first cache line 520 is a smaller number than the physical address of the second cache line 521, and so on). In some aspects, the physical addresses of the cache lines 520-527 may be represented with at least 9 bits. Fig. 5 shows indications of the ninth, eighth, and seventh bits of the physical address associated with each cache line 520-527. For example, the ninth bit (or bit 9) of the first cache line 520 may be a '0' value, the eighth bit (or bit 8) of the first cache line 520 may be a '0' value, and the seventh bit (or bit 7) of the first cache line 520 may be a '0' value, whereas the ninth bit (or bit 9) of the fifth cache line 524 may be a '1' value, the eighth bit (or bit 8) of the fifth cache line 524 may be a '0' value, and the seventh bit (or bit 7) of the fifth cache line 524 may be a '0' value.
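The bit pattern described above can be reproduced with a short sketch. It assumes 128-byte cache lines (inferred from the text, where a 256-byte segment fills two cache lines) and that the first cache line 520 starts at address 0; the helper name `address_bits` is hypothetical and does not appear in the figures.

```python
CACHE_LINE_BYTES = 128  # assumption: a 256-byte segment spans two cache lines


def address_bits(address: int) -> tuple:
    """Return (bit 9, bit 8, bit 7) of a physical address."""
    return (address >> 9) & 1, (address >> 8) & 1, (address >> 7) & 1


# Cache lines 520-527, assuming cache line 520 starts at address 0.
for index in range(8):
    line_label = 520 + index
    base = index * CACHE_LINE_BYTES
    print(line_label, address_bits(base))
```

Under these assumptions the first cache line 520 yields (0, 0, 0) and the fifth cache line 524 yields (1, 0, 0), matching the values given in the text.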
As described above with reference to Fig. 2, the first row 228 of diagram 500 shows the default placement of the source data segments 200a-206a within the 256-byte section boundaries 210-216 when stored in the cache according to conventional techniques. The first source data segment 200a (referred to as 'A' in Fig. 5) may be placed by the computing device without an offset in the first cache line 520 and the second cache line 521. The second source data segment 202a (referred to as 'B' in Fig. 5) may be placed by the computing device without an offset in the third cache line 522 and the fourth cache line 523. The third source data segment 204a (referred to as 'C' in Fig. 5) may be placed by the computing device without an offset in the fifth cache line 524 and the sixth cache line 525. The fourth source data segment 206a (referred to as 'D' in Fig. 5) may be placed by the computing device without an offset in the seventh cache line 526 and the eighth cache line 527.
However, rows 529, 530, 540, 550, and 560 of diagram 500 show offset placements, overlapping the base addresses, of compressed forms of the source data segments 200a-206a by a computing device operating according to an aspect. In some aspects, the base offset used by the computing device to place each compressed data segment may be based on a lookup performed on a predefined data table, such as described below with reference to Fig. 6.
The second row 529 of diagram 500 shows offset placements of the 256-byte compressed data segments 501, 502a-502b, 504a-504b, and 506. The first compressed data segment 501, corresponding to the first source data segment 200a (i.e., 'A'), may be placed by the computing device at a 64-byte offset within half of the first cache line 520, the entire second cache line 521, and half of the third cache line 522. The first compressed data segment 501 may overlap into the third cache line 522 by 64 bytes. The second compressed data segment 502a, corresponding to the second source data segment 202a (i.e., 'B'), may be placed by the computing device at a 64-byte offset within half of the third cache line 522 and the entire fourth cache line 523. However, to avoid crossing an interleave, the computing device may be configured to store the 256-byte second compressed data segment 502a in two separate parts. Accordingly, the remainder 502b of the 256-byte second compressed data segment 502a (referred to as "B-Rem" in Fig. 5) may be placed by the computing device in the first half of the first cache line 520. The third compressed data segment 504a, corresponding to the third source data segment 204a (i.e., 'C'), may be placed by the computing device without an offset within the fifth cache line 524 and half of the sixth cache line 525. Like the second compressed data segment 502a, the third compressed data segment 504a may be split into two parts to avoid crossing an interleave. Accordingly, the remainder 504b of the third compressed data segment 504a may be placed in half of the eighth cache line 527. The fourth compressed data segment 506, corresponding to the fourth source data segment 206a (i.e., 'D'), may be placed by the computing device at a negative 64-byte offset within half of the sixth cache line 525, the entire seventh cache line 526, and half of the eighth cache line 527.
The third row 530 of diagram 500 shows offset placements of 64-byte data segments compressed by the computing device at a 4:1 compression ratio. Specifically, the first 64-byte compressed data segment 531, corresponding to the first source data segment 200a (i.e., 'A'), may be placed by the computing device at a 256-byte offset within half of the third cache line 522. The second 64-byte compressed data segment 532, corresponding to the second source data segment 202a (i.e., 'B'), may be placed by the computing device at a 64-byte offset within the other half of the third cache line 522. In other words, the first 64-byte compressed data segment 531 and the second 64-byte compressed data segment 532 may share the third cache line 522, even though the cache line 522 is typically associated only with the second source data segment 202a (i.e., 'B'). Because of the 256-byte and 64-byte offset placements of the first 64-byte compressed data segment 531 and the second 64-byte compressed data segment 532, respectively, the first cache line 520, the second cache line 521, and the fourth cache line 523 may include unused data 599. In other words, the computing device may not need to fetch any data for these cache lines 520, 521, 523. These unused cache lines (i.e., the cache lines 520, 521, 523 associated with the unused data 599) may be free for the cache to allocate to other requests.
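The sharing of the third cache line 522 can be checked with simple arithmetic. The sketch below assumes 128-byte cache lines, cache line 520 at address 0, and base addresses of 0 and 256 for 'A' and 'B'; `lines_touched` is a hypothetical helper introduced only for illustration.

```python
CACHE_LINE_BYTES = 128  # assumption inferred from 256-byte segments spanning two lines
FIRST_LINE_LABEL = 520  # Fig. 5 label of the cache line assumed to start at address 0


def lines_touched(base_address: int, offset: int, size: int) -> list:
    """Return the Fig. 5 labels of the cache lines covered by a placed segment."""
    start = base_address + offset
    end = start + size - 1  # last byte occupied by the segment
    first = start // CACHE_LINE_BYTES
    last = end // CACHE_LINE_BYTES
    return [FIRST_LINE_LABEL + i for i in range(first, last + 1)]


# 4:1 placements from the third row: 'A' at +256 bytes, 'B' at +64 bytes.
segment_a = lines_touched(base_address=0, offset=256, size=64)
segment_b = lines_touched(base_address=256, offset=64, size=64)
used = set(segment_a) | set(segment_b)
unused = [label for label in range(520, 524) if label not in used]
print(segment_a, segment_b, unused)
```

With these inputs both segments land in cache line 522, leaving cache lines 520, 521, and 523 untouched, which is the situation the paragraph describes.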
Still referring to the third row 530 of diagram 500, the third 64-byte compressed data segment 534, corresponding to the third source data segment 204a (i.e., 'C'), may be placed by the computing device at a 128-byte offset within half of the sixth cache line 525. A 192-byte compressed data segment 536, corresponding to the fourth source data segment 206a (i.e., 'D'), may be placed by the computing device at a negative 64-byte offset within the other half of the sixth cache line 525 and the entire seventh cache line 526. The 192-byte compressed data segment 536 may have been compressed by the computing device at a 4:3 compression ratio. Because of the 128-byte and negative 64-byte offset placements of the third 64-byte compressed data segment 534 and the 192-byte compressed data segment 536, respectively, the fifth cache line 524 and the eighth cache line 527 may include unused data 599.
The fourth row 540 of diagram 500 shows offset placements of 128-byte data segments compressed by the computing device at a 4:2 compression ratio. Specifically, the first 128-byte compressed data segment 541, corresponding to the first source data segment 200a (i.e., 'A'), may be placed by the computing device at a 128-byte offset within the second cache line 521. The second 128-byte compressed data segment 542, corresponding to the second source data segment 202a (i.e., 'B'), may be placed by the computing device at a 128-byte offset within the fourth cache line 523. In this way, the first 128-byte compressed data segment 541 and the second 128-byte compressed data segment 542 may still be stored within their typical section boundaries 210, 212, but may be offset such that the first cache line 520 and the third cache line 522 include unused data 599. In other words, because of the 128-byte offsets, the computing device may not need to fetch any data for these cache lines 520, 522.
Still referring to the fourth row 540 of diagram 500, the third 128-byte compressed data segment 544, corresponding to the third source data segment 204a (i.e., 'C'), may be placed by the computing device without an offset within the fifth cache line 524. A 64-byte compressed data segment 546, corresponding to the fourth source data segment 206a (i.e., 'D'), may be placed by the computing device at a negative 64-byte offset within half of the sixth cache line 525. In this way, the third 128-byte compressed data segment 544 and the 64-byte compressed data segment 546 may be offset such that the seventh cache line 526 and the eighth cache line 527 include unused data 599. However, because the 64-byte compressed data segment 546 does not completely fill the sixth cache line 525, half of the sixth cache line 525 may be filled with overfetch data 590.
The fifth row 550 of diagram 500 shows offset placements of 192-byte data segments compressed by the computing device at a 4:3 compression ratio. Specifically, the first 192-byte compressed data segment 551, corresponding to the first source data segment 200a (i.e., 'A'), may be placed by the computing device at a 128-byte offset within the second cache line 521 and half of the third cache line 522. The second 192-byte compressed data segment 552, corresponding to the second source data segment 202a (i.e., 'B'), may be placed by the computing device at a 64-byte offset within half of the third cache line 522 and the fourth cache line 523. In this way, the first 192-byte compressed data segment 551 and the second 192-byte compressed data segment 552 may be offset to share the third cache line 522, so that the first cache line 520 may include unused data 599. In other words, because of the offset placement of these 192-byte segments, the computing device may not need to fetch any data for the first cache line 520.
Still referring to the fifth row 550 of diagram 500, a 64-byte compressed data segment 554, corresponding to the third source data segment 204a (i.e., 'C'), may be placed by the computing device at a 128-byte offset within half of the sixth cache line 525. A 256-byte compressed data segment 556, corresponding to the fourth source data segment 206a (i.e., 'D'), may be placed by the computing device at a negative 64-byte offset within half of the sixth cache line 525, the entire seventh cache line 526, and half of the eighth cache line 527. In this way, the fifth cache line 524 may include unused data 599. However, overfetch data 591 may be in half of the eighth cache line 527.
The sixth row 560 of diagram 500 shows offset placements of 256-byte data segments compressed by the computing device at a 4:4 compression ratio. Specifically, the first 256-byte compressed data segment 561, corresponding to the first source data segment 200a (i.e., 'A'), may be placed by the computing device at a 64-byte offset within half of the first cache line 520, the entire second cache line 521, and half of the third cache line 522. The second 256-byte compressed data segment 562a, corresponding to the second source data segment 202a (i.e., 'B'), may be placed by the computing device at a 64-byte offset within half of the third cache line 522 and the entire fourth cache line 523. However, the computing device may be configured to store the second 256-byte compressed data segment 562a in two separate parts. Accordingly, the remainder 562b of the second 256-byte compressed data segment 562a (referred to as "B-Rem" in Fig. 5) may be placed by the computing device in the first half of the first cache line 520. The third compressed data segment 564a, corresponding to the third source data segment 204a (i.e., 'C'), may be placed by the computing device without an offset within the fifth cache line 524 and half of the sixth cache line 525. Like the second compressed data segment 502a, the third compressed data segment 564a may be split into two parts to avoid crossing an interleave. Accordingly, the remainder 564b of the third compressed data segment 564a may be placed in half of the eighth cache line 527. A 64-byte compressed data segment 566, corresponding to the fourth source data segment 206a (i.e., 'D'), may be placed by the computing device at a negative 64-byte offset within half of the sixth cache line 525. However, overfetch data 592 may be in half of the eighth cache line 527.
Fig. 6 shows a diagram of a data table, suitable for use in some aspects, that includes base offset values corresponding to compression ratios and base address parity values. As described above with reference to Fig. 5, a computing device implementing an aspect compaction technique may use the compression ratio information 610 and the base address parity information 602 of a particular compressed data segment to perform a lookup on the data table 600, in order to identify the size of the base offset that should be used to place (or access) the data segment in the cache. In some aspects, the data table 600 may be a two-dimensional (2D) array stored in the memory of the computing device.
The data table 600 may be similar to the data table 300 described above with reference to Fig. 3, except that the base address parity information 602 of Fig. 6 may be an indication of the values of bit 8 and bit 9 of the base address of the data segment (e.g., 0 or 1 for each of the two bits). In other words, a lookup performed on the data table 600 may require the values of two bits and the compression ratio for the computing device to identify the base offset value to apply to the base address (e.g., physical address) of the data segment. Another difference between the data table 600 and the data table 300 described in Fig. 3 is that the data table 600 may store more than one base offset value for certain combinations of compression ratio (e.g., the 4:4 compression ratio) and values of the two bits associated with the data segment address (e.g., physical address). Specifically, when a segment needs to be split into two transactions to avoid crossing a page boundary or interleave, the data table 600 may include a first base offset value for the first transaction and a second base offset value for the second transaction. In these cases, the data table 600 may indicate the size of each transaction associated with each base offset value. In some aspects, the computing device may perform the lookup using other combinations of bits, where such combinations may be based on a hash of the stored data.
For example, in response to performing a lookup on the data table 600 for a data segment compressed at a 4:1 compression ratio whose base address has a 0 value for bit 9 and a 0 value for bit 8, the computing device may identify a 256-byte base offset. For a data segment compressed at a 4:1 compression ratio whose base address has a 0 value for bit 9 and a 1 value for bit 8, the lookup may identify a 64-byte base offset. For a data segment compressed at a 4:1 compression ratio whose base address has a 1 value for bit 9 and a 0 value for bit 8, the lookup may identify a 128-byte base offset. For a data segment compressed at a 4:1 compression ratio whose base address has a 1 value for bit 9 and a 1 value for bit 8, the lookup may identify a negative 64-byte base offset.
As another example, in response to performing a lookup on the data table 600 for a data segment compressed at a 4:2 compression ratio whose base address has a 0 value for bit 9 and a 0 value for bit 8, the computing device may identify a 128-byte base offset. For a data segment compressed at a 4:2 compression ratio whose base address has a 0 value for bit 9 and a 1 value for bit 8, the lookup may identify a 128-byte base offset. For a data segment compressed at a 4:2 compression ratio whose base address has a 1 value for bit 9 and a 0 value for bit 8, the lookup may identify a 0-byte base offset (i.e., no base offset). For a data segment compressed at a 4:2 compression ratio whose base address has a 1 value for bit 9 and a 1 value for bit 8, the lookup may identify a 0-byte base offset (i.e., no base offset).
As another example, in response to performing a lookup on the data table 600 for a data segment compressed at a 4:3 compression ratio whose base address has a 0 value for bit 9 and a 0 value for bit 8, the computing device may identify a 128-byte base offset. For a data segment compressed at a 4:3 compression ratio whose base address has a 0 value for bit 9 and a 1 value for bit 8, the lookup may identify a 64-byte base offset. For a data segment compressed at a 4:3 compression ratio whose base address has a 1 value for bit 9 and a 0 value for bit 8, the lookup may identify a 0-byte base offset (i.e., no base offset). For a data segment compressed at a 4:3 compression ratio whose base address has a 1 value for bit 9 and a 1 value for bit 8, the lookup may identify a negative 64-byte base offset.
As another example, in response to performing a lookup on the data table 600 for a data segment compressed at a 4:4 compression ratio whose base address has a 0 value for bit 9 and a 0 value for bit 8, the computing device may identify a 64-byte base offset. For a data segment compressed at a 4:4 compression ratio whose base address has a 0 value for bit 9 and a 1 value for bit 8, the lookup may identify a negative 256-byte base offset for a first transaction of 64 bytes and a 64-byte base offset for a second transaction of 192 bytes. For a data segment compressed at a 4:4 compression ratio whose base address has a 1 value for bit 9 and a 0 value for bit 8, the lookup may identify a 0-byte base offset for a first transaction of 192 bytes and a 448-byte base offset for a second transaction of 64 bytes. For a data segment compressed at a 4:4 compression ratio whose base address has a 1 value for bit 9 and a 1 value for bit 8, the lookup may identify a negative 64-byte base offset.
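The lookup values listed above can be collected into a small table and checked against the stated goal of never crossing a 512-byte boundary. This is a sketch under assumptions: the dictionary layout, the names `DATA_TABLE_600`, `crosses_section`, and `check_table`, and the sizes attached to the unsplit entries (64, 128, 192, and 256 bytes for the 4:1 through 4:4 ratios) are illustrative choices, not the stored format of the data table 600.

```python
# (compression ratio, bit 9, bit 8) -> list of (base offset, transaction size) in bytes.
# Offsets are the ones listed in the text; unsplit sizes are the compressed sizes.
DATA_TABLE_600 = {
    ("4:1", 0, 0): [(256, 64)],
    ("4:1", 0, 1): [(64, 64)],
    ("4:1", 1, 0): [(128, 64)],
    ("4:1", 1, 1): [(-64, 64)],
    ("4:2", 0, 0): [(128, 128)],
    ("4:2", 0, 1): [(128, 128)],
    ("4:2", 1, 0): [(0, 128)],
    ("4:2", 1, 1): [(0, 128)],
    ("4:3", 0, 0): [(128, 192)],
    ("4:3", 0, 1): [(64, 192)],
    ("4:3", 1, 0): [(0, 192)],
    ("4:3", 1, 1): [(-64, 192)],
    ("4:4", 0, 0): [(64, 256)],
    ("4:4", 0, 1): [(-256, 64), (64, 192)],
    ("4:4", 1, 0): [(0, 192), (448, 64)],
    ("4:4", 1, 1): [(-64, 256)],
}

SECTION_BYTES = 512  # page or interleave granularity discussed for Fig. 5


def crosses_section(start: int, size: int) -> bool:
    """True if the byte range [start, start + size) spans two 512-byte sections."""
    return start // SECTION_BYTES != (start + size - 1) // SECTION_BYTES


def check_table() -> bool:
    """Verify that no transaction in the table crosses a 512-byte boundary."""
    for (_ratio, bit9, bit8), transactions in DATA_TABLE_600.items():
        base = (bit9 << 9) | (bit8 << 8)  # smallest base address with these bits
        for offset, size in transactions:
            if crosses_section(base + offset, size):
                return False
    return True


split_entries = sum(len(v) == 2 for k, v in DATA_TABLE_600.items() if k[0] == "4:4")
print(check_table(), split_entries)
```

Two of the four 4:4 entries carry two transactions, which agrees with the earlier statement that on average half of the 4:4 segments are split.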
Fig. 7A shows exemplary pseudocode 700 of a function that may be executed by the computing device to identify a new address for the offset placement of a compressed data segment in the cache according to another aspect cache line compaction technique. For example, the computing device may be configured with instructions as described below in order to place data segments as shown in Fig. 5 above. The pseudocode 700 of Fig. 7A is similar to the pseudocode 400 of Fig. 4A described above, except that the pseudocode 700 may include instructions that cause the computing device to use two bits (i.e., bit 8 and bit 9) of the base address (e.g., physical address) when performing the lookup operation to identify the base offset, and may also be configured to potentially return two offset addresses. Depending on whether the base address is a physical address or a virtual address, such an offset address may be an offset physical address or an offset virtual address. The function shown by the pseudocode 700 may be executed by the computing device as part of an application, firmware, an application programming interface (API), a routine, and/or an operation, for example during or after execution of a packing routine (e.g., a packing routine on a data set). It should be appreciated that the pseudocode 700 is provided for general illustration purposes, and thus is not intended to represent any particular operable programming language, structure, or formatting.
The pseudocode 700 (referred to as "Pseudocode B" in Fig. 7A) may represent a function called "getNewAddress_bit8-9()", which may require a Base_Address input parameter and a Compression_Ratio input parameter related to a particular data segment. In some aspects, Base_Address may be a binary representation of a physical address. The body of the function may include an instruction for the computing device to perform a first right-shift operation on the Base_Address input parameter in order to move bit 8 of the base address of the data segment to the rightmost bit for storage in a Shift variable. The shift may also move bit 9 of Base_Address to the second rightmost bit in the Shift variable. The function may include an instruction to perform a bitwise AND operation (or '&' in Fig. 7A) on the shifted address (i.e., the Shift variable) and a value with '1' in the two rightmost bits (e.g., "...011"), which may generate a value stored in a Parity variable. In other words, the bitwise AND operation may zero out all other bits in the shifted address (i.e., the Shift variable), so that the shifted address may be used as the parity value for performing the lookup.
The function shown in the pseudocode 700 may also include instructions for the computing device to perform a lookup operation on a predefined table using the Compression_Ratio input parameter and the calculated Parity variable, as described with reference to Fig. 5. However, because there are two possible cases in which more than one base offset value may need to be returned from the table lookup, the function may include evaluating whether Compression_Ratio is equal to 4:4 and whether the value of the Parity variable indicates that bit 9 and bit 8 are the same (i.e., "...00" or "...11"). If the computing device determines that Compression_Ratio is not equal to 4:4, or that the value of the Parity variable indicates that bit 9 and bit 8 are the same (i.e., "...00" or "...11"), the computing device may perform the lookup operation using Compression_Ratio and Parity to retrieve a single offset value for storage in an Offset variable. The single offset value may be combined with Base_Address and returned for the offset placement.
However, if the computing device determines that Compression_Ratio is equal to 4:4 and that the value of the Parity variable indicates that bit 9 and bit 8 differ (i.e., "...01" or "...10"), the computing device may perform the lookup operation using Compression_Ratio and Parity to retrieve two offset values for storage in an Offset[] array variable. Each of the two offset values may be individually combined with Base_Address to generate the new addresses of the two transactions for the data segment. In various aspects, the sizes and offset values of the two transactions may be returned. In some aspects, other calls or functions may be used to separately calculate and/or return the sizes and offset values for these transactions.
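The behaviour described for the pseudocode 700 can be sketched in Python as follows. This is an illustrative rendering under assumptions, not the pseudocode of Fig. 7A itself: the identifier `get_new_address_bit8_9` replaces the hyphenated name (which is not a valid Python identifier), the ratio is passed as a string, and the two dictionaries are a hypothetical encoding of the offsets listed for the data table 600.

```python
# Base offsets in bytes, indexed by compression ratio and then by the two-bit
# parity value (bit 9, bit 8). The values follow the lookups described in the text.
SINGLE_OFFSETS = {
    "4:1": {0b00: 256, 0b01: 64, 0b10: 128, 0b11: -64},
    "4:2": {0b00: 128, 0b01: 128, 0b10: 0, 0b11: 0},
    "4:3": {0b00: 128, 0b01: 64, 0b10: 0, 0b11: -64},
    "4:4": {0b00: 64, 0b11: -64},
}
# 4:4 entries that are split into two (offset, size) transactions.
SPLIT_OFFSETS = {
    0b01: [(-256, 64), (64, 192)],
    0b10: [(0, 192), (448, 64)],
}


def get_new_address_bit8_9(base_address: int, compression_ratio: str) -> list:
    """Return one or two offset addresses for a compressed data segment."""
    shift = base_address >> 8   # bit 8 becomes the rightmost bit
    parity = shift & 0b11       # keep only bit 9 and bit 8
    if compression_ratio != "4:4" or parity in (0b00, 0b11):
        # Single-offset case: one lookup, one new address.
        offset = SINGLE_OFFSETS[compression_ratio][parity]
        return [base_address + offset]
    # Split case: 4:4 ratio with bit 9 and bit 8 differing.
    return [base_address + offset for offset, _size in SPLIT_OFFSETS[parity]]


print(get_new_address_bit8_9(0x100, "4:1"))  # base 256, bit 8 set
print(get_new_address_bit8_9(0x100, "4:4"))  # split into two transactions
```

Returning a list in both cases keeps the caller's handling uniform; a variant that also returns transaction sizes would only need to keep the second element of each tuple.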
In some aspects, the computing device may be configured to use simplified logic, circuits, software instructions, and/or routines to generate the new address rather than using a table lookup as indicated by the pseudocode 700.
Figs. 7B-7C show exemplary pseudocode and values that may be used in a call to the function of the pseudocode 700 described in Fig. 7A. It should be appreciated that the example values in Figs. 7B-7C are for illustrative purposes and are not intended to limit the scope of the various aspects.
Fig. 7B shows exemplary pseudocode 750 that may be executed by the computing device to call the exemplary function, described above with Fig. 7A, for determining a new offset address for a data segment. The exemplary pseudocode 750 may include instructions for setting the values of the Base_Address and Compression_Ratio variables, which may be used as input parameters for the call to the getNewAddress_bit8-9() function. For example, Base_Address may be set to the physical address of the data segment (e.g., "...100000000"), where the physical address includes at least nine bits. As another example, the Compression_Ratio variable may be set to a value (e.g., 4:4) indicating the compression level (or compression ratio) at which the computing device compressed the data segment using a compression algorithm. The pseudocode 750 may include a call to the function getNewAddress_bit8-9() using these input parameters and values, and may enable the computing device to store the return value from the function in a new_address variable, for example a new physical address of the data segment that achieves the offset placement in the cache lines.
Fig. 7C shows exemplary pseudocode 760, which is similar to the pseudocode 700 of Fig. 7A but also includes example values for each instruction based on the input parameter values described above in Fig. 7B. In other words, the pseudocode 760 is a functional equivalent of the exemplary pseudocode 700 of Fig. 7A, with added indications of the example values calculated as part of executing the exemplary function getNewAddress_bit8-9().
For example, in Fig. 7C, the function may use a Base_Address input parameter with a value of "...100000000" and a Compression_Ratio input parameter with a value of "4:4". In response to performing the right-shift operation on the Base_Address input parameter, the computing device may generate a value of "...000000010", moving bit 8 of the base address (e.g., physical address) of the data segment to the rightmost bit for storage in the Shift variable. The function may include an instruction to perform a bitwise AND operation on the Shift variable and a "...11" value, which may generate the value stored in the Parity variable. For example, when the Shift variable has a value of "...000000010", the bitwise AND operation may produce a Parity variable value of "...000000010".
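The shift and mask steps can be traced by hand. The snippet below is a sketch that assumes a base address whose bit 9 is 1 and bit 8 is 0 (binary 1000000000, a value chosen here so that the shifted result ends in "10"); the second address is a hypothetical one with extra high bits, included only to show the masking effect.

```python
base_address = 0b1000000000   # assumed example value: bit 9 = 1, bit 8 = 0

shift = base_address >> 8     # moves bit 8 to the rightmost position
parity = shift & 0b11         # clears everything except the two rightmost bits
bit9, bit8 = (parity >> 1) & 1, parity & 1
print(format(shift, "b"), format(parity, "02b"), bit9, bit8)

# A hypothetical address with higher bits set (bits 12, 10, and 9) gives the
# same parity, because the AND operation zeroes every bit above the lowest two.
high_address = 0b1011000000000
high_parity = (high_address >> 8) & 0b11
print(format(high_address >> 8, "b"), format(high_parity, "02b"))
```

A parity of "10" is neither "00" nor "11", so with a 4:4 compression ratio this input falls into the two-offset case.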
Because the exemplary Compression_Ratio is "4:4" and the exemplary Parity variable value is not equal to "...00" or "...11", the computing device may not perform the operation that retrieves a single base offset value from the lookup table. Instead, the computing device may perform the instruction for a lookup that returns two base offset values. Specifically, using the exemplary Parity variable value of "...000000010" and the exemplary Compression_Ratio value of "4:4", the result of the lookup operation on the predefined table described in Fig. 6 may return a first base offset value of 0 bytes and a second base offset value of 448 bytes. In some aspects, the result of the lookup may also include the sizes of the transactions associated with the base offset values. The function may then return two new offset addresses for the two transactions of the data segment: the first address being the combination of 0 bytes and Base_Address (e.g., "...100000000" + 0 bytes), and the second address being the combination of 448 bytes and Base_Address (e.g., "...100000000" + 448 bytes).
Fig. 8A shows an aspect method 800 for a computing device to compact data within cache lines of a cache. The method 800 may be performed by the computing device as a routine, application, function, or other operation that may occur in combination with or in response to the computing device performing conventional compression algorithms. For example, in response to compressing a data segment, the computing device may perform the operations of the method 800 to place the compressed data segment with an offset so that data segments overlap on a cache line. Although the following description may refer to a single data segment that may be placed at an offset address (e.g., an offset physical address), it should be appreciated that the computing device may be configured to perform the method 800 to place multiple data segments, for example in a loop. Further, it should be appreciated that the computing device may perform the operations of the method 800 to place data of any variable length (compressed or uncompressed) in various types of memory units or cache units. In some aspects, the computing device may be configured to perform the operations of the method 800 to place data segments compressed by another device.
In block 802, the processor of the computing device may identify a base address (e.g., a base physical address or a base virtual address) for the data segment. As described above, the base address may be the cache line or cache address with which the data segment is typically associated. For example, the base address may be the address of the first cache line in which the data segment (or source data segment) is stored when in an uncompressed state. The base address may be the initial starting address of the data segment, but it may not indicate the number of cache lines (or the cache space) that may be required to store the data segment. In some aspects, the base address may be the starting address of a requested block in memory, which may be represented by allocated lines in the cache but generally is not, because the cache may allocate lines for the compressed version.
In block 804, the processor of the computing device may identify a data size for the data segment (e.g., based on a compression ratio). For example, the identified data size may be based on the compression ratio of the data segment, as identified by the computing device performing a read operation on another data buffer to obtain the compression ratio. In various aspects, depending on the type of compression algorithm the computing device is configured to use, there may be multiple available compression ratios for the data segment. For example, when the computing device is configured to use a compression algorithm, the data segment may be compressed at a 4:1 compression ratio, a 4:2 compression ratio (or a 2:1 compression ratio), a 4:3 compression ratio, and/or a 4:4 compression ratio (or a 1:1 compression ratio). The identified data size for the data segment may be based on the compression ratio that yields the greatest possible reduction in the size of the data segment (or maximum compression). In various aspects, the identified data size (or compressed size) may differ for different types of data, data structures, and/or contexts. For example, the computing device may identify a first data size for a first compression ratio of a first data segment of a first data type, and identify a second data size for a second compression ratio of a second data segment of a second data type.
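The relationship between compression ratio and data size can be written as a one-line calculation. The sketch below assumes the 256-byte uncompressed segment size used in the Fig. 5 examples and a ratio written as a string such as "4:3"; the function name `compressed_size` is hypothetical.

```python
SOURCE_SEGMENT_BYTES = 256  # uncompressed segment size used in the Fig. 5 examples


def compressed_size(compression_ratio: str) -> int:
    """Map a ratio such as "4:3" to the compressed data size in bytes."""
    uncompressed_parts, compressed_parts = (
        int(part) for part in compression_ratio.split(":")
    )
    return SOURCE_SEGMENT_BYTES * compressed_parts // uncompressed_parts


for ratio in ("4:1", "4:2", "4:3", "4:4"):
    print(ratio, compressed_size(ratio))
```

The equivalent notations mentioned in the text give the same sizes, so "2:1" maps to 128 bytes and "1:1" to 256 bytes.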
In block 806, the processor of the computing device may obtain a base offset based on the identified data size and the base address of the data segment. The base offset may be a number of bytes, such as 64 bytes, 128 bytes, 192 bytes, and/or 256 bytes. In some aspects, the base offset may be a negative number of bytes, such as negative 64 bytes. In some aspects, the size of the base offset may be a multiple of the size of half a cache line of the cache in the computing device. In some aspects, the computing device may perform a lookup on a data table and retrieve the base offset value, as described above with reference to Fig. 3 or Fig. 6. In some aspects, the computing device may determine the base offset using an equation, routine, circuit, and/or software module. Fig. 8B shows particular aspect operations for obtaining the base offset.
In block 808, the processor of the computing device may calculate an offset address by offsetting the
base address with the obtained base offset. For example, the computing device may combine the obtained
base offset (e.g., a number of bytes) with the base address to obtain a new address (e.g., a new
physical address) that occurs before or after the base address in the cache. The calculation may yield
an offset address greater than the base address when the obtained base offset is a positive value, and
an offset address less than the base address when the obtained base offset is a negative value. In
optional block 810, the processor of the computing device may store the data segment at the calculated
offset address. In other words, the calculated offset address may be used to fill the cache. For
example, the computing device may read the data segment from memory (e.g., DDR) and load the data
segment into one or more cache lines starting at the offset address. In some aspects, the computing
device may read the data segment from a compressed memory as a compressed data segment. In optional
block 812, the processor of the computing device may read the data segment at the calculated offset
address. In other words, the calculated offset address may be used to retrieve or find a data segment
previously stored in a cache line. For example, the computing device may use the calculated offset
address to retrieve the data segment from the cache for use in an operation of an application, etc.
Further, the read in optional block 812 may have a specified size, such as the data size determined
above. In optional block 813, the processor of the computing device may decompress the read data
segment. In some aspects, the decompressed data segment may be stored locally, for example at its
associated base address.
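The address arithmetic of block 808 and the optional store and read of blocks 810 and 812 can be sketched as follows. This is a minimal illustration, assuming the cache is modeled as a flat `bytearray` indexed by address; the function names are hypothetical and do not come from the description.

```python
def calculate_offset_address(base_address: int, base_offset: int) -> int:
    """Offset the base address by a signed number of bytes (block 808)."""
    offset_address = base_address + base_offset
    if offset_address < 0:
        raise ValueError("offset address would fall below address zero")
    return offset_address


def store_segment(cache: bytearray, offset_address: int, segment: bytes) -> None:
    """Fill the toy cache with a segment starting at the offset address (block 810)."""
    end = offset_address + len(segment)
    if end > len(cache):
        raise ValueError("segment does not fit in the cache")
    cache[offset_address:end] = segment


def read_segment(cache: bytearray, offset_address: int, size: int) -> bytes:
    """Read a segment of the given size back from the offset address (block 812)."""
    return bytes(cache[offset_address:offset_address + size])
```

A positive base offset yields an address above the base address and a negative one yields an address below it; for instance, a base address of 0x1000 with a base offset of -64 gives 0xFC0.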
In some aspects, the data segment may already be in the cache and may not need to be fetched from
memory and inserted into the cache. In some aspects, there may be no cache, and prefetching may
benefit from the techniques of the various aspects.
FIG. 8B illustrates an aspect method 850 for a computing device to compact data within cache lines of
a cache. The operations of the method 850 are similar to the operations of the method 800 described
above with reference to FIG. 8A, except that the method 850 may include operations for performing
lookups in a predefined data table of offset values. For example, the computing device may be
configured to retrieve an offset value from the data table using information about the base address
and the compression ratio, as described above with reference to FIGS. 3 and 6.
The operations in blocks 802-804 and 808-812 may be the same as those described above with reference
to FIG. 8A. In block 852, the processor of the computing device may identify a parity value for the
data segment based on the base address. In some aspects, the computing device may evaluate bit 8 of
the base address to identify a parity value that indicates whether bit 8 is an even value (i.e., 0) or
an odd value (i.e., 1). The use of such a parity value is illustrated in FIGS. 2-4C. In some aspects,
to identify the parity value, the computing device may perform a right-shift operation on the binary
representation of the base address so that bit 8 becomes the rightmost bit, and apply a bitwise AND
operation to the "...001" binary value and the binary result of the right-shift operation. In some
aspects, the computing device may evaluate bits 9 and 8 of the base address to identify whether the
two bits match a predefined combination (e.g., "00", "01", "10", "11") that may serve as an index into
a data table or 2D array. The use of such a parity value is illustrated in FIGS. 5-7C. In some aspects,
to identify the parity value, the computing device may perform a right-shift operation on the binary
representation of the base address so that bit 8 becomes the rightmost bit, and apply a bitwise AND
operation to the "...011" binary value and the binary result of the right-shift operation (e.g.,
zeroing all values in the binary result of the right-shift operation except bits 8 and 9 of the binary
representation of the base address).
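The shift-and-mask steps described for block 852 can be sketched as below. The function names are hypothetical; the shift amount (8) and the masks ("...001" and "...011") follow the text above.

```python
def parity_from_bit_8(base_address: int) -> int:
    """Return bit 8 of the base address: 0 (even) or 1 (odd)."""
    # Shift right so that bit 8 becomes the rightmost bit, then AND with ...001.
    return (base_address >> 8) & 0b001


def parity_from_bits_9_and_8(base_address: int) -> int:
    """Return bits 9 and 8 of the base address as a value from 0 ("00") to 3 ("11")."""
    # Shift right so that bit 8 becomes the rightmost bit, then AND with ...011.
    return (base_address >> 8) & 0b011
```

For example, a base address of 0x300 has both bit 9 and bit 8 set, so the one-bit form returns 1 and the two-bit form returns 3 ("11").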
In block 854, the processor of the computing device may obtain the base offset by performing a lookup
on a stored table using the identified data size (e.g., a size based on the identified compression
ratio) and the identified parity value. For example, the computing device may use a representation or
code of the identified compression ratio (e.g., 4:1, etc.) and the parity value as an index into a
data table or 2D array, as illustrated in FIGS. 3 and 6. In some aspects, the obtained offset may
include a first offset for a first transaction and a second offset for a second transaction, as
described above. For example, to prevent an offset data segment from crossing an interleave or page
boundary, the data segment may be split into two transactions associated with a first offset (e.g.,
0 bytes) and a second offset (e.g., 448 bytes) from the base address. The computing device may
continue with the operations in block 808 to calculate the offset address. In some aspects, when the
obtained base offset includes two offsets, the computing device may calculate two offset addresses for
placing the split segments in the cache.
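The lookup of block 854 might be organized as in the sketch below. The table contents are placeholders chosen only for illustration and are NOT the values of the tables in FIG. 3 or FIG. 6; the only entry echoing the text is the two-transaction split with offsets of 0 and 448 bytes. The names `OFFSET_TABLE` and `offset_addresses` are hypothetical.

```python
# Placeholder table: (compression ratio, parity value) -> one or two base offsets in bytes.
# These entries are illustrative assumptions, not the values from FIG. 3 or FIG. 6.
OFFSET_TABLE = {
    ("4:1", 0): (0,),
    ("4:1", 1): (-64,),
    ("4:2", 0): (64,),
    ("4:2", 1): (128,),
    ("4:3", 0): (0, 448),  # segment split into two transactions
    ("4:3", 1): (192,),
}


def offset_addresses(base_address: int, ratio: str) -> list[int]:
    """Look up the base offset(s) and return one offset address per transaction."""
    parity = (base_address >> 8) & 1           # one-bit parity value from bit 8
    offsets = OFFSET_TABLE[(ratio, parity)]    # table lookup (block 854)
    return [base_address + offset for offset in offsets]  # block 808
```

When the table entry holds two offsets, the function returns two offset addresses, one for each part of the split segment.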
FIG. 8C illustrates an aspect method 870 for a computing device to compact data within cache lines of
a cache. The operations of the method 870 are similar to the operations of the method 800 described
above with reference to FIG. 8A, except that the method 870 may include operations for prefetching a
next data segment. The operations in blocks 802-804 and 808-812 may be the same as those described
above with reference to FIG. 8A.
In determination block 872, the processor of the computing device may determine whether a next data
segment (or second data segment) has the correct compression ratio to be contiguous with the data
segment (or first data segment) read with the operations in block 812. For example, when the computing
device is configured to perform optional prefetching with larger read request sizes, the computing
device may retrieve or read the next data block (or compressed data block) adjacent to the data
segment for which the operations of blocks 802-808 were performed to size and locate it in the cache.
This determination may be based on a logic testing routine or circuit configured to determine whether
the data block adjacent to the read data segment has the correct compression level to be contiguous
with the read data segment. In other words, the computing device may determine whether the next data
segment is compressed to a size such that the read of the data segment can also include the next data
segment. In response to determining that the next data segment has the correct compression ratio to be
contiguous (i.e., determination block 872 = "Yes"), the processor of the computing device may prefetch
the next data segment with the read of the other data segment in block 874. In response to determining
that the next data segment does not have the correct compression ratio to be contiguous (i.e.,
determination block 872 = "No"), the processor of the computing device may end the method 870.
FIG. 9 illustrates an aspect method 900 for a computing device to compact data within cache lines of
a cache. The operations of the method 900 are similar to the operations of the method 800 described
above with reference to FIG. 8A, except that the method 900 may include operations for the computing
device to also compress the data segment, for example by executing a compression algorithm, before
storing the data segment at the offset address.
The operations in blocks 802 and 806-808 may be the same as those described above with reference to
FIG. 8A. In block 901, the processor of the computing device may identify a compression ratio for the
data segment, such as an optimal compression ratio (e.g., 4:1, 4:2, 4:3, 4:4, etc.) at which the data
segment may be compressed. The operations in block 901 may be similar to those described above with
reference to block 804 of FIG. 8A. In block 902, the processor of the computing device may read the
data segment at the base address as uncompressed data, for example by reading the data segment from
the cache. In block 904, the processor of the computing device may compress the data segment at the
identified compression ratio. For example, based on the maximum identified compression ratio available
for the data segment (e.g., 4:1, 4:2, 4:3, 4:4, etc.), the computing device may execute a compression
algorithm or routine to reduce the size of the data segment. As described above, the computing device
may obtain the base offset in block 806 and calculate the offset address in block 808. In block 906,
the processor of the computing device may store the compressed data segment at the calculated offset
address.
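The compress-then-store flow of method 900 can be sketched as follows. This is a sketch under stated assumptions: `zlib` stands in for whatever compression algorithm the device uses, the segment is assumed to be 256 bytes, the per-ratio size budgets are assumed, and the cache is a flat `bytearray`. All names are hypothetical.

```python
import zlib

# Assumed size budgets (bytes) for a 256-byte segment at each ratio, strongest first.
RATIO_SIZES = {"4:1": 64, "4:2": 128, "4:3": 192, "4:4": 256}


def identify_ratio(segment: bytes) -> str:
    """Return the strongest ratio whose size budget fits the compressed segment."""
    compressed_len = len(zlib.compress(segment))
    for ratio, budget in RATIO_SIZES.items():
        if compressed_len <= budget:
            return ratio
    return "4:4"  # incompressible data is kept at full size


def compress_and_store(cache: bytearray, base_address: int, base_offset: int,
                       segment: bytes) -> tuple[str, int]:
    """Compress a segment and store it at base_address + base_offset in a toy cache."""
    ratio = identify_ratio(segment)
    payload = zlib.compress(segment) if ratio != "4:4" else segment
    offset_address = base_address + base_offset
    if offset_address < 0 or offset_address + len(payload) > len(cache):
        raise ValueError("segment does not fit in the cache at the offset address")
    cache[offset_address:offset_address + len(payload)] = payload
    return ratio, offset_address
```

In this sketch the base offset is passed in directly; in the described method it would be obtained in block 806 from the identified data size and the base address.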
Various forms of computing devices (e.g., personal computers, smartphones, laptop computers, etc.) may
be used to implement the various aspects. Such computing devices typically include the components
illustrated in FIG. 10, which shows an example computing device 1000. In various aspects, the
computing device 1000 may include a processor 1001 coupled to a touch screen controller 1004 and an
internal memory 1002. The processor 1001 may be one or more multi-core ICs designated for general or
specific processing tasks. The internal memory 1002 may be volatile or non-volatile memory, and may be
secure and/or encrypted memory, unsecure and/or unencrypted memory, or any combination thereof. The
touch screen controller 1004 and the processor 1001 may also be coupled to a touch screen panel 1012,
such as a resistive-sensing touch screen, capacitive-sensing touch screen, infrared-sensing touch
screen, etc. In some aspects, the computing device 1000 may have one or more radio signal transceivers
1008 (e.g., Wi-Fi, RF radio) and antennas 1010 for sending and receiving, coupled to each other and/or
to the processor 1001. The transceivers 1008 and antennas 1010 may be used with the above-mentioned
circuitry to implement the various wireless transmission protocol stacks and interfaces. In some
aspects, the computing device 1000 may include a cellular network wireless modem chip 1016 that
enables communication via a cellular network and is coupled to the processor. The computing device
1000 may include a peripheral device connection interface 1018 coupled to the processor 1001. The
peripheral device connection interface 1018 may be singularly configured to accept one type of
connection, or multiply configured to accept various types of physical and communication connections,
common or proprietary, such as USB, FireWire, Thunderbolt, or PCIe. The peripheral device connection
interface 1018 may also be coupled to a similarly configured peripheral device connection port (not
shown). The computing device 1000 may also include speakers 1014 for providing audio outputs. The
computing device 1000 may also include a housing 1020, constructed of plastic, metal, or a combination
of materials, for containing all or some of the components discussed herein. The computing device 1000
may include a power source 1022 coupled to the processor 1001, such as a disposable or rechargeable
battery. The rechargeable battery may also be coupled to the peripheral device connection port to
receive a charging current from a source external to the computing device 1000.
The processor 1001 may be any programmable microprocessor, microcomputer, or multiple processor chip
or chips that can be configured by software instructions (e.g., applications) to perform a variety of
functions, including the functions of the various aspects described above. In the various devices,
multiple processors may be provided, such as one processor dedicated to wireless communication
functions and one processor dedicated to running other applications. Typically, software applications
may be stored in the internal memory 1002 before they are accessed and loaded into the processor 1001.
The processor 1001 may include internal memory sufficient to store the application software
instructions. In many devices, the internal memory may be volatile or non-volatile memory (e.g., flash
memory), or a mixture of both. For the purposes of this description, a general reference to memory
refers to memory accessible by the processor 1001, including internal memory or removable memory
plugged into the device and memory within the processor 1001.
The foregoing method descriptions and process flow diagrams are provided merely as illustrative
examples and are not intended to require or imply that the steps of the various aspects must be
performed in the order presented. As will be appreciated by one of skill in the art, the steps in the
foregoing aspects may be performed in any order. Words such as "thereafter," "then," "next," etc. are
not intended to limit the order of the steps; these words are simply used to guide the reader through
the description of the methods. Further, any reference to claim elements in the singular, for example
using the articles "a," "an," or "the," is not to be construed as limiting the element to the singular.
The various illustrative logical blocks, modules, circuits, and algorithm steps described in
connection with the aspects disclosed herein may be implemented as electronic hardware, computer
software, or combinations of both. To clearly illustrate this interchangeability of hardware and
software, various illustrative components, blocks, modules, circuits, and steps have been described
above generally in terms of their functionality. Whether such functionality is implemented as hardware
or software depends upon the particular application and the design constraints imposed on the overall
system. Skilled artisans may implement the described functionality in varying ways for each particular
application, but such implementation decisions should not be interpreted as causing a departure from
the scope of the present invention.
The hardware used to implement the various illustrative logics, logical blocks, modules, and circuits
described in connection with the aspects disclosed herein may be implemented or performed with a
general-purpose processor, a digital signal processor (DSP), an application-specific integrated
circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete
gate or transistor logic, discrete hardware components, or any combination thereof designed to perform
the functions described herein. A general-purpose processor may be a microprocessor, but, in the
alternative, the processor may be any conventional processor, controller, microcontroller, or state
machine. A processor may also be implemented as a combination of computing devices, e.g., a
combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors
in conjunction with a DSP core, or any other such configuration. Alternatively, some steps or methods
may be performed by circuitry that is specific to a given function.
In one or more exemplary aspects, the functions described may be implemented in hardware, software,
firmware, or any combination thereof. If implemented in software, the functions may be stored on or
transmitted over, as one or more instructions or code, a non-transitory processor-readable,
computer-readable, or server-readable medium or a non-transitory processor-readable storage medium.
The steps of a method or algorithm disclosed herein may be embodied in a processor-executable software
module or processor-executable instructions, which may reside on a non-transitory computer-readable
storage medium, a non-transitory server-readable storage medium, and/or a non-transitory
processor-readable storage medium. For example, such instructions may be stored processor-executable
software instructions. A tangible, non-transitory computer-readable storage medium may be any
available medium that can be accessed by a computer. By way of example, and not limitation, such
non-transitory computer-readable media may include RAM, ROM, EEPROM, CD-ROM or other optical disk
storage, magnetic disk storage or other magnetic storage devices, or any other medium that may be used
to store desired program code in the form of instructions or data structures and that may be accessed
by a computer. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc,
digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data
magnetically, while discs reproduce data optically with lasers. Combinations of the above are also
included within the scope of non-transitory computer-readable media. Additionally, the operations of a
method or algorithm may reside as one or any combination or set of codes and/or instructions on a
tangible, non-transitory processor-readable storage medium and/or computer-readable medium, which may
be incorporated into a computer program product.
The preceding description of the disclosed aspects is provided to enable any person skilled in the art
to make or use the present invention. Various modifications to these aspects will be readily apparent
to those skilled in the art, and the generic principles defined herein may be applied to other aspects
without departing from the spirit or scope of the invention. Thus, the present invention is not
intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent
with the following claims and the principles and novel features disclosed herein.
Claims (62)
1. A method for compacting data within cache lines of a cache of a computing device, comprising:
identifying, by a processor of the computing device, a base address for a first data segment;
identifying, by the processor of the computing device, a data size for the first data segment;
obtaining, by the computing device, a base offset based on the identified data size and the base
address of the first data segment; and
calculating, by the computing device, an offset address by offsetting the base address with the
obtained base offset, wherein the calculated offset address is associated with a second data segment.
2. The method of claim 1, wherein obtaining the base offset based on the identified data size and the
base address of the first data segment is performed by one of the processor of the computing device
executing software or a dedicated circuit coupled to the processor of the computing device, and
wherein calculating the offset address by offsetting the base address with the obtained base offset is
performed by one of the processor of the computing device executing software or the dedicated circuit
coupled to the processor of the computing device.
3. The method of claim 1, wherein the base address is a physical address or a virtual address.
4. The method of claim 1, wherein the identified data size is identified based on a compression ratio
associated with the first data segment.
5. The method of claim 4, wherein the compression ratio is one of a 4:1 compression ratio, a 4:2
compression ratio, a 4:3 compression ratio, or a 4:4 compression ratio.
6. The method of claim 1, wherein obtaining, by the computing device, the base offset based on the
identified compression ratio and the base address of the first data segment comprises:
identifying, by the processor of the computing device, a parity value for the first data segment based
on the base address; and
obtaining, by the processor of the computing device, the base offset using the identified compression
ratio and the identified parity value.
7. The method of claim 6, wherein obtaining the base offset using the identified compression ratio and
the identified parity value comprises: obtaining, by the processor of the computing device, the base
offset by performing a lookup on a stored table using the identified compression ratio and the
identified parity value.
8. The method of claim 6, wherein the parity value indicates whether a bit in the base address of the
first data segment is odd or even.
9. The method of claim 6, wherein the parity value is based on two bits in the base address of the
first data segment.
10. The method of claim 1, wherein obtaining, by the computing device, the base offset based on the
identified data size and the base address of the first data segment comprises:
obtaining, by the computing device, a first base offset and a second base offset and a first data size
and a second data size for the first data segment; and
wherein calculating, by the computing device, the offset address by offsetting the base address with
the obtained base offset comprises: calculating, by the computing device, a first offset address for
the first data size and a second offset address for the second data size.
11. The method of claim 1, further comprising: storing, by the processor of the computing device, the
first data segment at the calculated offset address.
12. The method of claim 11, wherein storing, by the processor of the computing device, the first data
segment at the calculated offset address comprises:
reading, by the processor of the computing device, the first data segment at the base address as
uncompressed data;
compressing, by the processor of the computing device, the first data segment at the identified data
size; and
storing, by the processor of the computing device, the compressed first data segment at the calculated
offset address.
13. The method of claim 11, wherein calculating, by the computing device, the offset address by
offsetting the base address with the obtained base offset is completed after the first data segment
has been compressed.
14. The method of claim 1, further comprising: reading, by the processor of the computing device, the
first data segment at the calculated offset address.
15. The method of claim 14, further comprising: determining, by the computing device, whether the
second data segment has a correct compression ratio to be contiguous with the first data segment,
wherein reading, by the processor of the computing device, the first data segment at the calculated
offset address comprises: prefetching, by the processor of the computing device, the second data
segment with the first data segment in response to determining that the second data segment has the
correct compression ratio.
16. The method of claim 14, further comprising: decompressing, by the processor of the computing
device, the read first data segment.
17. A computing device, comprising:
means for identifying a base address for a first data segment;
means for identifying a data size for the first data segment;
means for obtaining a base offset based on the identified data size and the base address of the first
data segment; and
means for calculating an offset address by offsetting the base address with the obtained base offset,
wherein the calculated offset address is associated with a second data segment.
18. The computing device of claim 17, wherein the base address is a physical address or a virtual
address.
19. The computing device of claim 17, wherein the identified data size is identified based on a
compression ratio associated with the first data segment.
20. The computing device of claim 19, wherein the compression ratio is one of a 4:1 compression ratio,
a 4:2 compression ratio, a 4:3 compression ratio, or a 4:4 compression ratio.
21. The computing device of claim 17, wherein the means for obtaining the base offset based on the
identified compression ratio and the base address of the first data segment comprises:
means for identifying a parity value for the first data segment based on the base address; and
means for obtaining the base offset using the identified compression ratio and the identified parity
value.
22. The computing device of claim 21, wherein the means for obtaining the base offset using the
identified compression ratio and the identified parity value comprises: means for obtaining the base
offset by performing a lookup on a stored table using the identified compression ratio and the
identified parity value.
23. The computing device of claim 21, wherein the parity value indicates whether a bit in the base
address of the first data segment is odd or even.
24. The computing device of claim 21, wherein the parity value is based on two bits in the base
address of the first data segment.
25. The computing device of claim 17, wherein:
the means for obtaining the base offset based on the identified data size and the base address of the
first data segment comprises: means for obtaining a first base offset and a second base offset and a
first data size and a second data size for the first data segment; and
the means for calculating the offset address by offsetting the base address with the obtained base
offset comprises: means for calculating a first offset address for the first data size and a second
offset address for the second data size.
26. The computing device of claim 17, further comprising: means for storing the first data segment at
the calculated offset address.
27. The computing device of claim 26, wherein the means for storing the first data segment at the
calculated offset address comprises:
means for reading the first data segment at the base address as uncompressed data;
means for compressing the first data segment at the identified data size; and
means for storing the compressed first data segment at the calculated offset address.
28. The computing device of claim 26, wherein the means for calculating the offset address by
offsetting the base address with the obtained base offset comprises: means for calculating the offset
address by offsetting the base address with the obtained base offset after the first data segment has
been compressed.
29. The computing device of claim 17, further comprising: means for reading the first data segment at
the calculated offset address.
30. The computing device of claim 29, further comprising: means for determining whether the second
data segment has a correct compression ratio to be contiguous with the first data segment,
wherein the means for reading the first data segment at the calculated offset address comprises: means
for prefetching the second data segment with the first data segment in response to determining that
the second data segment has the correct compression ratio.
31. The computing device of claim 29, further comprising: means for decompressing the read first data
segment.
32. A computing device, comprising a processor configured with processor-executable instructions to
perform operations comprising:
identifying a base address for a first data segment;
identifying a data size for the first data segment;
obtaining a base offset based on the identified data size and the base address of the first data
segment; and
calculating an offset address by offsetting the base address with the obtained base offset, wherein
the calculated offset address is associated with a second data segment.
33. The computing device of claim 32, further comprising a dedicated circuit coupled to the processor
and configured to perform operations comprising:
obtaining the base offset based on the identified data size and the base address of the first data
segment; and
calculating the offset address by offsetting the base address with the obtained base offset.
34. The computing device of claim 32, wherein the base address is a physical address or a virtual
address.
35. The computing device of claim 32, wherein the identified data size is identified by the processor
based on a compression ratio associated with the first data segment.
36. The computing device of claim 35, wherein the compression ratio is one of a 4:1 compression ratio,
a 4:2 compression ratio, a 4:3 compression ratio, or a 4:4 compression ratio.
37. The computing device of claim 32, wherein the processor is configured with processor-executable
instructions to perform operations such that obtaining the base offset based on the identified
compression ratio and the base address of the first data segment comprises:
identifying a parity value for the first data segment based on the base address; and
obtaining the base offset using the identified compression ratio and the identified parity value.
38. The computing device of claim 37, wherein the processor is configured with processor-executable
instructions to perform operations such that obtaining the base offset using the identified
compression ratio and the identified parity value comprises: obtaining the base offset by performing a
lookup on a stored table using the identified compression ratio and the identified parity value.
39. The computing device of claim 37, wherein the parity value indicates whether a bit in the base
address of the first data segment is odd or even.
40. The computing device of claim 37, wherein the parity value is based on two bits in the base
address of the first data segment.
41. computing devices according to claim 32, wherein, the processor be configured with processor executable with
Execution is operable so that:
Based on first data segments the size of data for being identified and the base address come obtain it is described it is basic skew include:
Obtain the first basic skew and the second basic skew and the first size of data and the second number for first data segments
According to size;And
Entering line displacement to the base address by using the basic skew for being obtained includes calculating the offset address:Calculate pin
The first offset address and the second offset address for second size of data to first size of data.
42. computing devices according to claim 32, wherein, the processor be configured with processor executable with
Perform:First data segments are stored at the offset address for being calculated.
43. The computing device according to claim 42, wherein the processor is configured with processor-executable instructions to perform operations such that storing the first data segment at the calculated offset address comprises:
reading the first data segment at the base address as uncompressed data;
compressing the first data segment to the identified data size; and
storing the compressed first data segment at the calculated offset address.
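The three steps of claim 43 (read uncompressed, compress, store at the offset address) can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: `zlib` stands in for whatever compressor the device uses, the source and cache are modeled as two byte arrays sharing one address space, and the function name `store_compressed`, the 256-byte segment size, and the zero padding up to the identified data size are all hypothetical.

```python
import zlib

# Assumed geometry (not stated in the claims): 256-byte uncompressed segments.
SEGMENT_BYTES = 256


def store_compressed(source: bytearray, cache: bytearray, base_address: int,
                     data_size: int, base_offset: int) -> int:
    """Compress one segment and store it at base_address + base_offset."""
    # Step 1: read the first data segment at the base address, uncompressed.
    raw = bytes(source[base_address:base_address + SEGMENT_BYTES])

    # Step 2: compress it (zlib is a stand-in) and pad to the identified size.
    packed = zlib.compress(raw)
    if len(packed) > data_size:
        raise ValueError("segment does not compress to the identified data size")
    packed = packed.ljust(data_size, b"\x00")

    # Step 3: store the compressed segment at the calculated offset address.
    target = base_address + base_offset
    cache[target:target + data_size] = packed
    return target
```

A caller would obtain `data_size` and `base_offset` beforehand, for example from the compression ratio and a table lookup, and pass them in.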
44. The computing device according to claim 42, wherein the processor is configured with processor-executable instructions to perform operations such that calculating the offset address by offsetting the base address with the obtained base offset is performed after the first data segment has been compressed.
45. The computing device according to claim 32, wherein the processor is configured with processor-executable instructions to perform operations further comprising: reading the first data segment at the calculated offset address.
46. The computing device according to claim 45, wherein the processor is configured with processor-executable instructions to perform operations further comprising: determining whether the second data segment has a correct compression ratio to be contiguous with the first data segment,
wherein reading the first data segment at the calculated offset address comprises: in response to determining that the second data segment has the correct compression ratio, prefetching the second data segment with the first data segment.
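The prefetch condition of claim 46 can be sketched as a simple predicate in Python. The rule used here, that the neighbor qualifies when both compressed segments fit in one cache line, is an assumption made for illustration; the claim only requires that the second segment have a "correct compression ratio" to be contiguous with the first. The 128-byte cache line, the 256-byte segment, and the names `compressed_size` and `can_prefetch_neighbor` are likewise hypothetical.

```python
# Assumed geometry (not stated in the claims).
CACHE_LINE_BYTES = 128
SEGMENT_BYTES = 256

# Numerator of the compressed fraction for each ratio (4:1 keeps 1/4, etc.).
RATIO_QUARTERS = {"4:1": 1, "4:2": 2, "4:3": 3, "4:4": 4}


def compressed_size(ratio: str) -> int:
    """Size in bytes of a segment compressed at the given ratio."""
    return SEGMENT_BYTES * RATIO_QUARTERS[ratio] // 4


def can_prefetch_neighbor(first_ratio: str, second_ratio: str) -> bool:
    """Assumed rule: both compressed segments must fit in one cache line."""
    total = compressed_size(first_ratio) + compressed_size(second_ratio)
    return total <= CACHE_LINE_BYTES
```

Under this assumed rule, two adjacent 4:1 segments (64 bytes each) can be fetched together in one cache-line read, while a 4:1 segment next to a 4:2 segment cannot.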
47. The computing device according to claim 45, wherein the processor is configured with processor-executable instructions to perform operations further comprising: decompressing the read first data segment.
48. A non-transitory processor-readable storage medium having stored thereon processor-executable instructions configured to cause a processor of a computing device to perform operations comprising:
identifying a base address for a first data segment;
identifying a data size for the first data segment;
obtaining a base offset based on the identified data size and the base address of the first data segment; and
calculating an offset address by offsetting the base address with the obtained base offset, wherein the calculated offset address is associated with a second data segment.
49. The non-transitory processor-readable storage medium according to claim 48, wherein the base address is a physical address or a virtual address.
50. The non-transitory processor-readable storage medium according to claim 48, wherein the identified data size is identified by the processor based on a compression ratio associated with the first data segment.
51. The non-transitory processor-readable storage medium according to claim 50, wherein the compression ratio is one of: a 4:1 compression ratio, a 4:2 compression ratio, a 4:3 compression ratio, or a 4:4 compression ratio.
52. The non-transitory processor-readable storage medium according to claim 48, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations such that obtaining the base offset based on the identified compression ratio and the base address of the first data segment comprises:
identifying a parity value for the first data segment based on the base address; and
obtaining the base offset using the identified compression ratio and the identified parity value.
53. The non-transitory processor-readable storage medium according to claim 52, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations such that obtaining the base offset using the identified compression ratio and the identified parity value comprises: performing a lookup on a stored table using the identified compression ratio and the identified parity value to obtain the base offset.
54. The non-transitory processor-readable storage medium according to claim 52, wherein the parity value indicates whether a bit in the base address of the first data segment is odd or even.
55. The non-transitory processor-readable storage medium according to claim 52, wherein the parity value is based on two bits in the base address of the first data segment.
56. The non-transitory processor-readable storage medium according to claim 48, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations such that:
obtaining the base offset based on the identified data size and the base address of the first data segment comprises obtaining a first base offset and a second base offset, and a first data size and a second data size, for the first data segment; and
calculating the offset address by offsetting the base address with the obtained base offset comprises calculating a first offset address for the first data size and a second offset address for the second data size.
57. The non-transitory processor-readable storage medium according to claim 48, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations further comprising: storing the first data segment at the calculated offset address.
58. The non-transitory processor-readable storage medium according to claim 57, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations such that storing the first data segment at the calculated offset address comprises:
reading the first data segment at the base address as uncompressed data;
compressing the first data segment to the identified data size; and
storing the compressed first data segment at the calculated offset address.
59. The non-transitory processor-readable storage medium according to claim 57, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations such that calculating the offset address by offsetting the base address with the obtained base offset is performed after the first data segment has been compressed.
60. The non-transitory processor-readable storage medium according to claim 48, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations further comprising: reading the first data segment at the calculated offset address.
61. The non-transitory processor-readable storage medium according to claim 60, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations further comprising: determining whether the second data segment has a correct compression ratio to be contiguous with the first data segment,
wherein reading the first data segment at the calculated offset address comprises: in response to determining that the second data segment has the correct compression ratio, prefetching the second data segment with the first data segment.
62. The non-transitory processor-readable storage medium according to claim 60, wherein the stored processor-executable instructions are configured to cause the processor of the computing device to perform operations further comprising: decompressing the read first data segment.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/451,639 | 2014-08-05 | ||
US14/451,639 US9361228B2 (en) | 2014-08-05 | 2014-08-05 | Cache line compaction of compressed data segments |
PCT/US2015/039736 WO2016022247A1 (en) | 2014-08-05 | 2015-07-09 | Cache line compaction of compressed data segments |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106575263A true CN106575263A (en) | 2017-04-19 |
Family
ID=53758529
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201580041874.4A Pending CN106575263A (en) | 2014-08-05 | 2015-07-09 | Cache line compaction of compressed data segments |
Country Status (5)
Country | Link |
---|---|
US (2) | US9361228B2 (en) |
EP (1) | EP3178005B1 (en) |
JP (1) | JP6370988B2 (en) |
CN (1) | CN106575263A (en) |
WO (1) | WO2016022247A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111367831A (en) * | 2020-03-26 | 2020-07-03 | 超验信息科技(长沙)有限公司 | Deep prefetching method and component for translation page table, microprocessor and computer equipment |
CN112699063A (en) * | 2021-03-25 | 2021-04-23 | 轸谷科技(南京)有限公司 | Dynamic caching method for solving storage bandwidth efficiency of general AI processor |
CN114422499A (en) * | 2021-12-27 | 2022-04-29 | 北京奇艺世纪科技有限公司 | File downloading method, system and device |
Families Citing this family (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9361228B2 (en) | 2014-08-05 | 2016-06-07 | Qualcomm Incorporated | Cache line compaction of compressed data segments |
JP2016091242A (en) * | 2014-10-31 | 2016-05-23 | 富士通株式会社 | Cache memory, access method to cache memory and control program |
US10025956B2 (en) * | 2015-12-18 | 2018-07-17 | Intel Corporation | Techniques to compress cryptographic metadata for memory encryption |
US9916245B2 (en) * | 2016-05-23 | 2018-03-13 | International Business Machines Corporation | Accessing partial cachelines in a data cache |
US10042737B2 (en) | 2016-08-31 | 2018-08-07 | Microsoft Technology Licensing, Llc | Program tracing for time travel debugging and analysis |
US10031834B2 (en) | 2016-08-31 | 2018-07-24 | Microsoft Technology Licensing, Llc | Cache-based tracing for time travel debugging and analysis |
US10031833B2 (en) | 2016-08-31 | 2018-07-24 | Microsoft Technology Licensing, Llc | Cache-based tracing for time travel debugging and analysis |
US10489273B2 (en) | 2016-10-20 | 2019-11-26 | Microsoft Technology Licensing, Llc | Reuse of a related thread's cache while recording a trace file of code execution |
US10310977B2 (en) | 2016-10-20 | 2019-06-04 | Microsoft Technology Licensing, Llc | Facilitating recording a trace file of code execution using a processor cache |
US10324851B2 (en) | 2016-10-20 | 2019-06-18 | Microsoft Technology Licensing, Llc | Facilitating recording a trace file of code execution using way-locking in a set-associative processor cache |
US10310963B2 (en) | 2016-10-20 | 2019-06-04 | Microsoft Technology Licensing, Llc | Facilitating recording a trace file of code execution using index bits in a processor cache |
US10540250B2 (en) | 2016-11-11 | 2020-01-21 | Microsoft Technology Licensing, Llc | Reducing storage requirements for storing memory addresses and values |
US10318332B2 (en) | 2017-04-01 | 2019-06-11 | Microsoft Technology Licensing, Llc | Virtual machine execution tracing |
US10296442B2 (en) | 2017-06-29 | 2019-05-21 | Microsoft Technology Licensing, Llc | Distributed time-travel trace recording and replay |
US10459824B2 (en) | 2017-09-18 | 2019-10-29 | Microsoft Technology Licensing, Llc | Cache-based trace recording using cache coherence protocol data |
US10558572B2 (en) | 2018-01-16 | 2020-02-11 | Microsoft Technology Licensing, Llc | Decoupling trace data streams using cache coherence protocol data |
US11907091B2 (en) | 2018-02-16 | 2024-02-20 | Microsoft Technology Licensing, Llc | Trace recording by logging influxes to an upper-layer shared cache, plus cache coherence protocol transitions among lower-layer caches |
US10496537B2 (en) | 2018-02-23 | 2019-12-03 | Microsoft Technology Licensing, Llc | Trace recording by logging influxes to a lower-layer cache based on entries in an upper-layer cache |
US10642737B2 (en) | 2018-02-23 | 2020-05-05 | Microsoft Technology Licensing, Llc | Logging cache influxes by request to a higher-level cache |
KR20200006379A (en) * | 2018-07-10 | 2020-01-20 | 에스케이하이닉스 주식회사 | Controller and operating method thereof |
US10942808B2 (en) * | 2018-12-17 | 2021-03-09 | International Business Machines Corporation | Adaptive data and parity placement using compression ratios of storage devices |
US10997085B2 (en) * | 2019-06-03 | 2021-05-04 | International Business Machines Corporation | Compression for flash translation layer |
US11601136B2 (en) | 2021-06-30 | 2023-03-07 | Bank Of America Corporation | System for electronic data compression by automated time-dependent compression algorithm |
US11567872B1 (en) * | 2021-07-08 | 2023-01-31 | Advanced Micro Devices, Inc. | Compression aware prefetch |
US11573899B1 (en) * | 2021-10-21 | 2023-02-07 | International Business Machines Corporation | Transparent interleaving of compressed cache lines |
US12014047B2 (en) * | 2022-08-24 | 2024-06-18 | Red Hat, Inc. | Stream based compressibility with auto-feedback |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030131184A1 (en) * | 2002-01-10 | 2003-07-10 | Wayne Kever | Apparatus and methods for cache line compression |
US20040073747A1 (en) * | 2002-10-10 | 2004-04-15 | Synology, Inc. | Method, system and apparatus for scanning newly added disk drives and automatically updating RAID configuration and rebuilding RAID data |
US20060184734A1 (en) * | 2005-02-11 | 2006-08-17 | International Business Machines Corporation | Method and apparatus for efficiently accessing both aligned and unaligned data from a memory |
CN102141905A (en) * | 2010-01-29 | 2011-08-03 | 上海芯豪微电子有限公司 | Processor system structure |
CN102541747A (en) * | 2010-10-25 | 2012-07-04 | 马维尔国际贸易有限公司 | Data compression and encoding in a memory system |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07129470A (en) * | 1993-11-09 | 1995-05-19 | Hitachi Ltd | Disk control method |
JP3426385B2 (en) * | 1995-03-09 | 2003-07-14 | 富士通株式会社 | Disk controller |
US6658552B1 (en) * | 1998-10-23 | 2003-12-02 | Micron Technology, Inc. | Processing system with separate general purpose execution unit and data string manipulation unit |
US7143238B2 (en) | 2003-09-30 | 2006-11-28 | Intel Corporation | Mechanism to compress data in a cache |
US7162584B2 (en) | 2003-12-29 | 2007-01-09 | Intel Corporation | Mechanism to include hints within compressed data |
US7162583B2 (en) | 2003-12-29 | 2007-01-09 | Intel Corporation | Mechanism to store reordered data with compression |
US7257693B2 (en) | 2004-01-15 | 2007-08-14 | Intel Corporation | Multi-processor computing system that employs compressed cache lines' worth of information and processor capable of use in said system |
US8341380B2 (en) | 2009-09-22 | 2012-12-25 | Nvidia Corporation | Efficient memory translator with variable size cache line coverage |
US9361228B2 (en) | 2014-08-05 | 2016-06-07 | Qualcomm Incorporated | Cache line compaction of compressed data segments |
Filing Date | Country | Application Number | Publication | Status |
---|---|---|---|---|
2014-08-05 | US | US14/451,639 | US9361228B2 | Active |
2015-07-09 | CN | CN201580041874.4A | CN106575263A | Pending |
2015-07-09 | JP | JP2017505616A | JP6370988B2 | Expired - Fee Related |
2015-07-09 | WO | PCT/US2015/039736 | WO2016022247A1 | Application Filing |
2015-07-09 | EP | EP15742447.4A | EP3178005B1 | Not-in-force |
2016-03-22 | US | US15/077,534 | US10261910B2 | Active |
Non-Patent Citations (1)
Title |
---|
ALAMELDEEN A R ET AL: "Adaptive cache compression for high-performance processors", 《COMPUTER ARCHITECTURE, 2004. PROCEEDINGS. 31ST ANNUAL INTERNATIONAL SYMPOSIUM ON MUNCHEN》 * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111367831A (en) * | 2020-03-26 | 2020-07-03 | 超验信息科技(长沙)有限公司 | Deep prefetching method and component for translation page table, microprocessor and computer equipment |
CN111367831B (en) * | 2020-03-26 | 2022-11-11 | 超睿科技(长沙)有限公司 | Deep prefetching method and component for translation page table, microprocessor and computer equipment |
CN112699063A (en) * | 2021-03-25 | 2021-04-23 | 轸谷科技(南京)有限公司 | Dynamic caching method for solving storage bandwidth efficiency of general AI processor |
CN112699063B (en) * | 2021-03-25 | 2021-06-22 | 轸谷科技(南京)有限公司 | Dynamic caching method for solving storage bandwidth efficiency of general AI processor |
CN114422499A (en) * | 2021-12-27 | 2022-04-29 | 北京奇艺世纪科技有限公司 | File downloading method, system and device |
CN114422499B (en) * | 2021-12-27 | 2023-12-05 | 北京奇艺世纪科技有限公司 | File downloading method, system and device |
Also Published As
Publication number | Publication date |
---|---|
JP6370988B2 (en) | 2018-08-08 |
EP3178005A1 (en) | 2017-06-14 |
US9361228B2 (en) | 2016-06-07 |
WO2016022247A1 (en) | 2016-02-11 |
EP3178005B1 (en) | 2018-01-10 |
US10261910B2 (en) | 2019-04-16 |
US20160203084A1 (en) | 2016-07-14 |
JP2017529591A (en) | 2017-10-05 |
US20160041905A1 (en) | 2016-02-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106575263A (en) | Cache line compaction of compressed data segments | |
CN106537327B (en) | Flash memory compression | |
US20170177497A1 (en) | Compressed caching of a logical-to-physical address table for nand-type flash memory | |
US11010079B2 (en) | Concept for storing file system metadata within solid-stage storage devices | |
US20150019834A1 (en) | Memory hierarchy using page-based compression | |
TWI750243B (en) | Nonvolatile memory storage device | |
CN106663059B (en) | Power-aware filling | |
TWI619018B (en) | Garbage collection method for data storage device | |
CN104281528A (en) | Data storage method and device | |
CN105373369A (en) | Asynchronous caching method, server and system | |
JP2017501504A (en) | System and method for defragmenting memory | |
CN106575262B (en) | The method and apparatus of supplement write-in cache command for bandwidth reduction | |
CN106687937B (en) | Cache bank expansion for compression algorithms | |
CN106610790A (en) | Repeated data deleting method and device | |
US9477605B2 (en) | Memory hierarchy using row-based compression | |
CN107077423A (en) | The local sexual system of efficient decompression for demand paging | |
CN109313609A (en) | The system and method to interweave for odd mode storage channel | |
CN108604211A (en) | System and method for the multi-tiling data transactions in system on chip | |
CN107078746A (en) | The decompression time is reduced without influenceing compression ratio | |
US11079955B2 (en) | Concept for approximate deduplication in storage and memory | |
CN107111560A (en) | System and method for providing improved delay in non-Unified Memory Architecture | |
US20210200679A1 (en) | System and method for mixed tile-aware and tile-unaware traffic through a tile-based address aperture | |
CN110235110A (en) | It is reduced or avoided when the write operation of pause occurs from the uncompressed cache memory compressed in storage system through evicting the buffering of high-speed buffer memory data from | |
CN107844579B (en) | Method, system and equipment for optimizing distributed database middleware access | |
CN108780422A (en) | It is compressed using compression indicator CI hint directories to provide bandwidth of memory in the system based on central processing unit CPU |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170419 |