CN117331853B - Cache processing method, device, electronic equipment and medium - Google Patents


Info

Publication number
CN117331853B
CN117331853B (application CN202311316175.9A)
Authority
CN
China
Prior art keywords
data
cache
address
request
virtual address
Prior art date
Legal status
Active
Application number
CN202311316175.9A
Other languages
Chinese (zh)
Other versions
CN117331853A (en)
Inventor
陶友龙
刘洋
Current Assignee
Hexin Technology Co ltd
Shanghai Hexin Digital Technology Co ltd
Original Assignee
Hexin Technology Co ltd
Shanghai Hexin Digital Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Hexin Technology Co ltd, Shanghai Hexin Digital Technology Co ltd filed Critical Hexin Technology Co ltd
Priority to CN202311316175.9A
Publication of CN117331853A
Application granted
Publication of CN117331853B
Legal status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 - Addressing or allocation; relocation
    • G06F 12/08 - Addressing or allocation; relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0806 - Multiuser, multiprocessor or multiprocessing cache systems
    • G06F 12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G06F 12/0877 - Cache access modes
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 - Arrangements for executing specific programs
    • G06F 9/445 - Program loading or initiating
    • G06F 9/44521 - Dynamic linking or loading; link editing at or after load time, e.g. Java class loading

Abstract

The application provides a cache processing method, apparatus, electronic device, and medium. The method comprises the following steps: obtaining a request virtual address according to a data load request of a processor core; if the data load request misses in the current-level data cache, sending it to the next-level data cache so that the next-level data cache returns the target data and the tag field of the target physical address; returning the target data to the processor core, and determining the tag fields of a plurality of physical addresses from the current-level data cache according to every bit of the index field of the request virtual address except the page identification bit; updating the valid bit of each cache line whose physical-address tag field is the same as the tag field of the target physical address to a value representing invalidity; and writing the target data into an update cache line in the current-level data cache and updating the valid bit of that line to a value representing validity. The method allows the index range of the data cache to span pages, expanding the cache capacity while guaranteeing the accuracy of data access.

Description

Cache processing method, device, electronic equipment and medium
Technical Field
The present application relates to computer caching, and in particular, to a cache processing method, apparatus, electronic device, and medium.
Background
With the development of processors, the gap between the operating speed of external storage and the operating frequency of the processor has kept growing. In modern computer systems a cache is therefore used as a buffer that reconciles the speed of the processor core with that of external storage, and it has become an integral part of the processor system. Specifically, when the processor core needs to access data, it first searches the cache. If the data is in the cache (a cache hit), the access is served directly from the cache; when the cache holds no corresponding data, the next cache level may be accessed, and so on until main memory is reached.
In practice, as processor performance improves and main-memory capacity grows, higher demands are also placed on cache capacity. Some designs enlarge the cache by adopting a multi-page architecture, but data accesses to a multi-page cache can cross pages, which affects the accuracy of data access.
Disclosure of Invention
The application provides a cache processing method, apparatus, electronic device, and medium, which allow the index range of a data cache to span pages, expanding the cache capacity while guaranteeing the accuracy of data access.
In one aspect, the present application provides a cache processing method, including:
obtaining a request virtual address according to a data load request of a processor core, wherein the index field of the request virtual address comprises a page identification bit, different values of the page identification bit indicate that the cache line corresponding to the request virtual address is located on different pages of the data array of the current-level data cache, and the data array of the current-level data cache comprises a plurality of pages;
if the data load request misses in the current-level data cache, saving the index field of the request virtual address and sending the data load request to the next-level data cache, so that the next-level data cache returns the target data and the tag field of the target physical address of the target data according to the data load request;
returning the target data to the processor core, and determining the tag fields of a plurality of physical addresses from the address array of the current-level data cache according to every bit of the index field of the request virtual address except the page identification bit;
among the determined tag fields, updating the valid bit of each cache line whose physical-address tag field is the same as the tag field of the target physical address to a value representing invalidity; and writing the target data into an update cache line in the current-level data cache and updating the valid bit of the update cache line to a value representing validity.
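The invalidate-then-fill step can be sketched as follows. This is a minimal model under stated assumptions (a direct-mapped array split into two pages; the patent's arrays may be set-associative, and all names here are illustrative, not the patented hardware):

```python
# Minimal model of the refill step: on a miss refill, the entry at the same
# low index bits is inspected on every page, any stale copy of the returned
# physical tag is invalidated, and the target data is then written and
# re-validated on the requested page.
NUM_PAGES = 2
SETS_PER_PAGE = 64

class PagedDataCache:
    def __init__(self):
        self.tag = [[None] * SETS_PER_PAGE for _ in range(NUM_PAGES)]
        self.valid = [[False] * SETS_PER_PAGE for _ in range(NUM_PAGES)]
        self.data = [[None] * SETS_PER_PAGE for _ in range(NUM_PAGES)]

    def refill(self, page_id, low_index, target_tag, target_data):
        # invalidate any line, on any page, holding the same physical tag
        for p in range(NUM_PAGES):
            if self.valid[p][low_index] and self.tag[p][low_index] == target_tag:
                self.valid[p][low_index] = False
        # write the returned line into the requested page and mark it valid
        self.tag[page_id][low_index] = target_tag
        self.data[page_id][low_index] = target_data
        self.valid[page_id][low_index] = True
```

Refilling the same physical line under a virtual address whose page identification bit differs first invalidates the stale copy, so at most one page holds a valid copy of a given physical line.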
Optionally, the method further comprises:
determining a set in the data array of the current-level data cache according to the index field of the request virtual address, and determining a predicted cache line in the set based on a way-prediction technique;
querying the address-mapping store according to the request virtual address to obtain the request physical address corresponding to the request virtual address;
if the tag field of the request physical address is the same as the tag field of the physical address of the predicted cache line and the valid bit of the predicted cache line is a value representing validity, determining that the data load request hits in the current-level data cache; otherwise, determining that the data load request misses in the current-level data cache.
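The way-predicted hit test above reduces to one tag comparison gated by the valid bit. A hedged one-function sketch (names are illustrative):

```python
# The predicted line hits only if it is valid and its stored physical tag
# equals the tag of the translated request physical address.
def predicted_hit(predicted_tag, predicted_valid, request_tag):
    return bool(predicted_valid) and predicted_tag == request_tag
```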
Optionally, the method further comprises:
if the data load request hits in the current-level data cache, obtaining predicted cache data from the predicted cache line according to the offset field of the request virtual address, and returning the predicted cache data to the processor core.
Optionally, the method further comprises:
determining a plurality of first cache lines in the data array of the current-level data cache according to the index field of the request virtual address, wherein the index field of the virtual address corresponding to each first cache line is the same as the index field of the request virtual address and the corresponding valid bit is a value representing validity;
querying the address-mapping store according to the request virtual address to obtain the request physical address corresponding to the request virtual address;
if the tag field of the request physical address is the same as the tag field of the physical address of any first cache line, determining that the data load request hits in the current-level data cache; otherwise, determining that the data load request misses in the current-level data cache.
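The full-comparison variant above checks every valid line of the indexed set rather than a single predicted way. A minimal sketch (the pair-list representation is an assumption for illustration):

```python
# Compare the request tag against every valid line in the indexed set;
# any match is a hit.
def set_lookup(set_lines, request_tag):
    """set_lines: iterable of (tag, valid) pairs for one set."""
    return any(valid and tag == request_tag for tag, valid in set_lines)
```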
Optionally, the method further comprises:
If the data loading request hits in the data cache of the present level, taking the first cache line with the same identification domain of the physical address and the identification domain of the request physical address in the plurality of first cache lines as a hit cache line;
And acquiring hit data from the hit cache line according to the offset field of the request virtual address, and returning the hit data to the processor core.
Optionally, querying the address-mapping store according to the request virtual address to obtain the request physical address corresponding to the request virtual address comprises:
querying whether an address mapping for the request virtual address exists in the first-level address-mapping store, wherein an address mapping characterizes the correspondence between a virtual address and a physical address;
if it exists, obtaining the request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the first-level address-mapping store;
if it does not exist, obtaining the address mapping of the request virtual address from the next-level address-mapping store, obtaining the request physical address according to that mapping, and adding the mapping to the first-level address-mapping store; each address-mapping store other than the last level records the mappings of part of the virtual addresses, and the last-level address-mapping store records the mappings of all virtual addresses.
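The multi-level mapping lookup above can be sketched with two dictionaries standing in for the TLB-like stores (an assumption for illustration; real hardware would use associative arrays and page-table walks):

```python
# Stores other than the last level record part of the mappings; the last
# level records them all. A first-level miss fetches the mapping from the
# last level and installs it in the first-level store.
def translate(vpn, first_level, last_level):
    if vpn in first_level:
        return first_level[vpn]
    ppn = last_level[vpn]        # last level holds every mapping
    first_level[vpn] = ppn       # install in the first-level store
    return ppn
```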
Optionally, before writing the target data into the update cache line in the current-level data cache, the method further comprises:
determining the update cache line, based on a least-recently-used algorithm, from the set corresponding to the index field of the request virtual address in the data array of the current-level data cache, and clearing the current data in the update cache line.
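Least-recently-used victim selection within one set can be sketched as follows, assuming per-way last-access timestamps as the LRU bookkeeping (one possible scheme; the patent does not fix the bookkeeping):

```python
# The way with the smallest last-access time is the least recently used
# and is chosen as the update cache line.
def choose_update_way(last_access):
    """last_access: last-access time per way of the indexed set."""
    return min(range(len(last_access)), key=last_access.__getitem__)
```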
Optionally, the page identification bit of the request virtual address is the most significant bit of the index field of the request virtual address.
In another aspect, the present application provides a cache processing apparatus, applied to the current-level data cache, comprising:
a first address calculation component, configured to obtain a request virtual address according to a data load request of a processor core, wherein the index field of the request virtual address comprises a page identification bit, the value of the page identification bit indicates the page of the data array of the current-level data cache on which the cache line corresponding to the request virtual address is located, and the data array of the current-level data cache comprises a plurality of pages;
a first miss queue, configured to save the index field of the request virtual address if the data load request misses in the current-level data cache, and to send the data load request to the next-level data cache, so that the next-level data cache returns the target data and the tag field of the target physical address of the target data according to the data load request;
a first result selection module, configured to return the target data to the processor core;
a return-tag comparison module, configured to determine a plurality of first cache lines from the data array of the current-level data cache according to the index field of the request virtual address, wherein in the index field of the virtual address of each first cache line every bit except the page identification bit has the same value as the corresponding bit of the request virtual address, and to update the valid bit of each first cache line whose physical-address tag field is the same as the tag field of the target physical address to a value representing invalidity;
a first cache update module, configured to write the target data into an update cache line in the current-level data cache;
the return-tag comparison module is further configured to update the valid bit of the update cache line to a value representing validity.
Optionally, the apparatus further includes:
a first way-prediction module, configured to determine a set in the data array of the current-level data cache according to the index field of the request virtual address, and to determine a predicted cache line in the set based on a way-prediction technique;
a first mapping query module, configured to query the address-mapping store according to the request virtual address to obtain the request physical address corresponding to the request virtual address;
a predicted-tag comparison module, configured to determine that the data load request hits in the current-level data cache if the tag field of the request physical address is the same as the tag field of the physical address of the predicted cache line and the valid bit of the predicted cache line is a value representing validity, and otherwise to determine that the data load request misses in the current-level data cache.
Optionally, the apparatus further includes:
a data acquisition module, configured to obtain predicted cache data from the predicted cache line according to the offset field of the request virtual address if the data load request hits in the current-level data cache;
the first result selection module is further configured to return the predicted cache data to the processor core.
Optionally, the apparatus further includes:
a first cache query module, configured to determine a plurality of first cache lines in the data array of the current-level data cache according to the index field of the request virtual address, wherein the index field of the virtual address corresponding to each first cache line is the same as the index field of the request virtual address and the corresponding valid bit is a value representing validity;
the first mapping query module is configured to query the address-mapping store according to the request virtual address to obtain the request physical address corresponding to the request virtual address;
the predicted-tag comparison module is further configured to determine that the data load request hits in the current-level data cache if the tag field of the request physical address is the same as the tag field of the physical address of any first cache line, and otherwise to determine that the data load request misses in the current-level data cache.
Optionally, the data acquisition module is further configured to:
if the data load request hits in the current-level data cache, take the first cache line, among the plurality of first cache lines, whose physical-address tag field is the same as the tag field of the request physical address as the hit cache line, and obtain hit data from the hit cache line according to the offset field of the request virtual address;
the first result selection module is further configured to return the hit data to the processor core.
Optionally, the first mapping query module is specifically configured to:
query whether an address mapping for the request virtual address exists in the first-level address-mapping store, an address mapping characterizing the correspondence between a virtual address and a physical address;
if it exists, obtain the request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the first-level address-mapping store;
if it does not exist, obtain the address mapping of the request virtual address from the next-level address-mapping store, obtain the request physical address according to that mapping, and add the mapping to the first-level address-mapping store; each address-mapping store other than the last level records the mappings of part of the virtual addresses, and the last-level address-mapping store records the mappings of all virtual addresses.
Optionally, the first cache updating module is further configured to:
determine the update cache line, based on a least-recently-used algorithm, from the set corresponding to the index field of the request virtual address in the data array of the current-level data cache, and clear the current data in the update cache line.
Optionally, the page identification bit of the request virtual address is the most significant bit of the index field of the request virtual address.
In yet another aspect, the present application provides an electronic device, comprising: a processor, and a memory communicatively coupled to the processor; the memory stores computer-executable instructions; the processor executes computer-executable instructions stored in the memory to implement the method as described above.
In yet another aspect, the application provides a computer-readable storage medium having stored therein computer-executable instructions for performing the method as described above when executed by a processor.
In the cache processing method, apparatus, electronic device, and medium provided by the application, a page identification bit is defined in the virtual address of a cache line to indicate the page on which the cache line is located. When a data load request from the processor core is received, the request virtual address is obtained and looked up in the current-level cache. If the request misses in the current-level data cache, the target data corresponding to the request and the tag field of its physical address are obtained from the next-level data cache and returned to the processor core, and the valid bits of the cache lines are updated: according to every bit of the index field of the request virtual address except the page identification bit, the tag fields of a plurality of physical addresses are determined from the address array of the current-level data cache; the valid bit of each cache line whose tag field is the same as the tag field of the target physical address is updated to a value representing invalidity; and the valid bit of the update cache line into which the target data is written is updated to a value representing validity. By defining a page identification bit in the virtual address of a cache line and updating the valid bits of the cache lines based on it, valid-bit management and maintenance across pages is realized accurately and rapidly, the index range of the data cache can span pages, the cache capacity is expanded, and the accuracy of data access is guaranteed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a schematic flowchart of a cache processing method according to the first embodiment of the present application;
Fig. 2 is a schematic diagram of a basic processor memory architecture according to the first embodiment of the present application;
Fig. 3 is a schematic structural diagram of a current-level data cache according to the first embodiment of the present application;
Fig. 4 is a schematic diagram of an exemplary structure of a data-cache data array according to the first embodiment of the present application;
Fig. 5 is a schematic flowchart of obtaining an address mapping according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a cache processing apparatus according to the second embodiment of the present application;
Fig. 7 is a schematic structural diagram of another cache processing apparatus according to the third embodiment of the present application;
Fig. 8 is a schematic flowchart of another cache processing method according to the fourth embodiment of the present application;
Fig. 9 is a schematic structural diagram of another cache processing apparatus according to the fifth embodiment of the present application;
Fig. 10 is a schematic structural diagram of another cache processing system according to the sixth embodiment of the present application;
Fig. 11 is a schematic structural diagram of an electronic device according to the seventh embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
To clearly describe the technical solutions of the embodiments of the present application, the words "first", "second", and so on are used in the embodiments to distinguish identical or similar items having substantially the same function and effect. Those skilled in the art will appreciate that these words do not limit quantity or order of execution, and that items so labelled are not necessarily different.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
The modules in the present application are functional or logic modules. A module may take the form of software, whose functions are realized by a processor executing program code, or of hardware. "And/or" describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A alone, both A and B, or B alone. The character "/" generally indicates that the associated objects are in an "or" relationship.
With the development of processors, the operating speed of external storage differs ever more from the operating frequency of the processor, so in modern computer systems caches have become an integral part of the processor system as buffers reconciling the speed of the processor core with that of external storage. The data cache usually adopts a virtually-indexed, physically-tagged (VIPT) organization: a cache line is looked up by accessing the cache with the index field of the virtual address (VA) output by the processor core, while the virtual address is translated in parallel; after translation completes, the line is matched against the tag field of the physical address (PA) corresponding to the virtual address to determine whether the data hits.
In a multi-core processor, data in memory may have copies in several processor cores, and a modification by one core raises the problem of data inconsistency. Coherency protocols are used to keep cache-shared data consistent across cores, and a write-invalidate protocol may be used to maintain the valid bits of cache lines. Under write-invalidate, after one processor core writes to its cache it broadcasts an invalidation request, and when the other cores snoop it, they update the valid bit of the corresponding cache line in their own caches to a value representing invalidity. When data is read from the cache, the valid bit is checked before the cache line is accessed; if the line is invalid, it is treated as not containing the data to be read, thereby guaranteeing the accuracy of data hits.
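The write-invalidate maintenance described above can be sketched as follows (a single-level, direct-mapped model with assumed names, for illustration only):

```python
# The writing core broadcasts the line's physical tag and index; a snooping
# core with a matching valid line clears that line's valid bit.
def snoop_invalidate(tags, valid, index, bcast_tag):
    if valid[index] and tags[index] == bcast_tag:
        valid[index] = False
```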
Therefore, for a data cache with a VIPT structure, expanding the capacity through a multi-page cache architecture may cause aliasing during data loads: different virtual addresses map to the same physical address, and the data of that physical address may be stored in different cache lines, which affects the accuracy of data access.
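The aliasing case can be made concrete with assumed parameters (4 KiB pages, 64-byte lines, index field VA[12:6]; the mapping below is hypothetical): two virtual addresses that differ only in VA[12] share the page offset, so a translation can send both to the same physical line, yet they index different pages of a two-page VIPT data array.

```python
def index_field(va):
    return (va >> 6) & 0x7F            # VA[12:6]

# hypothetical translation mapping both virtual pages to physical page 0
to_pa = lambda va: va & 0xFFF

va_a, va_b = 0x0040, 0x1040            # differ only in bit VA[12]
same_line = to_pa(va_a) >> 6 == to_pa(va_b) >> 6     # same physical line
different_sets = index_field(va_a) != index_field(va_b)  # different array slots
```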
The technical scheme of the application is illustrated in the following specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Example 1
Fig. 1 is a flowchart of a cache processing method according to an embodiment of the application. As shown in Fig. 1, the cache processing method provided in this embodiment may include:
S101: obtain a request virtual address according to a data load request of a processor core, wherein the index field of the request virtual address comprises a page identification bit, different values of the page identification bit indicate that the cache line corresponding to the request virtual address is located on different pages of the data array of the current-level data cache, and the data array of the current-level data cache comprises a plurality of pages.
S102: if the data load request misses in the current-level data cache, save the index field of the request virtual address and send the data load request to the next-level data cache, so that the next-level data cache returns the target data and the tag field of the target physical address of the target data according to the data load request.
S103: return the target data to the processor core, and determine the tag fields of a plurality of physical addresses from the address array of the current-level data cache according to every bit of the index field of the request virtual address except the page identification bit.
S104: among the determined tag fields, update the valid bit of each cache line whose physical-address tag field is the same as the tag field of the target physical address to a value representing invalidity; write the target data into an update cache line in the current-level data cache and update the valid bit of the update cache line to a value representing validity.
In practical applications, the execution body of this embodiment may be a cache processing apparatus applied to the current-level data cache. The apparatus may be implemented as a computer program, for example application software; as a medium storing the related computer program, for example a USB disk or a cloud disk; or as a physical device integrating or installed with the related computer program, for example a chip or a server.
The multi-level cache memory structure may take various forms; for example, a two-level cache may be adopted. Fig. 2 is a schematic diagram of a basic processor memory structure provided by an embodiment of the present application. As shown in Fig. 2, the first-level cache (L1 cache) is the cache closest to the processor core, and the second-level cache (L2 cache) is the cache closest to external storage, that is, main memory; it has a larger capacity but slower access than the first-level data cache. When a processor core tries to access the data at an address, it first queries the first-level cache; on a first-level hit, the first-level cache returns the data to the processor core. On a first-level miss, the second-level cache is queried; on a second-level hit, the second-level cache returns the data to the first-level cache. On a second-level miss, the data is fetched from external storage and returned to the first-level cache, the second-level cache, and the processor core. A three-level structure comprising a first-level, second-level, and third-level cache, or other storage structures, may also be adopted; this is not limited here.
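The query chain described above can be sketched with dictionaries standing in for the cache arrays (an assumption for illustration; real caches have fixed capacity and eviction, omitted here):

```python
# Try each level in turn, closest to the core first; on a miss at every
# level, main memory supplies the data, and every level that missed is
# filled on the way back.
def load(addr, levels, main_memory):
    """levels: caches ordered closest-to-core first."""
    for i, level in enumerate(levels):
        if addr in level:
            value = level[addr]
            break
    else:
        i, value = len(levels), main_memory[addr]
    for missed in levels[:i]:
        missed[addr] = value
    return value
```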
Specifically, the present-level data cache is the cache that currently receives the data loading request; it may be a first-level data cache, a second-level data cache, or the like, depending on the data loading process and the storage structure. Obtaining the request virtual address according to the data loading request of the processor core includes performing address calculation according to the data loading request, that is, calculating the request virtual address from an operand of the data loading request. The index field of the request virtual address includes page identification bits, and different values of the page identification bits indicate that the cache line corresponding to the request virtual address is located on different pages in the data array of the present-level data cache. For example, the request virtual address may include an upper field, an index field (index), and an offset field (offset), and the physical address may be divided into an identification field (tag) and a lower field. In practical applications, the offset field of the physical address is the same as the low-order field of the virtual address, and a mapping relationship exists between the identification field of the physical address and the address segments of the virtual address other than the offset field, so the physical address corresponding to a given virtual address can be determined according to this mapping relationship.
Specifically, the page identification bits are part of the bits in the index field. In one example, the page identification bit of the request virtual address is the most significant bit of the index field of the request virtual address. For example, assuming that the data array of the present-level data cache includes 2 pages and the index field of the request virtual address is VA[12:6], the page identification bit of the request virtual address is defined as VA[12]: VA[12] being 0 indicates that the cache line corresponding to the request virtual address in the data array of the present-level data cache is located on page 1, and VA[12] being 1 indicates that it is located on page 2. It should be noted that the foregoing is merely an example; the page identification bits may also span multiple bits, and the data length of the page identification bits may be determined according to the number of pages. Continuing the above example, assuming that the data array of the present-level data cache includes 4 pages, the index field of the request virtual address may be defined as VA[13:6], where VA[13:12] are the page identification bits.
Specifically, VA[13:12] being 00, 01, 10, or 11 indicates that the cache line corresponding to the request virtual address in the data array of the present-level data cache is located on page 1, page 2, page 3, or page 4, respectively.
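As a hedged illustration of the example above (the helper function and its name are ours, not the patent's), the page indicated by VA[13:12] can be computed as follows:

```python
def page_of(virtual_address: int) -> int:
    """Return the 1-based page number indicated by VA[13:12]."""
    page_id_bits = (virtual_address >> 12) & 0b11  # extract VA[13:12]
    return page_id_bits + 1  # 00 -> page 1, 01 -> page 2, 10 -> page 3, 11 -> page 4

# A virtual address with VA[13:12] == 10 maps to page 3
assert page_of(0b10_0000_0000_0000) == 3
```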
By defining page identification bits in the virtual address, this example makes it easy to locate alias entries caused by page crossing in the data cache, so that the alias entries can be eliminated.
A query is performed in the present-level data cache according to the request virtual address. When the data loading request misses in the present-level data cache, that is, the present-level data cache does not contain the requested data, the index field of the request virtual address is stored and the data loading request is sent to the next-level data cache. For example, if the present-level data cache is a first-level data cache, the next-level data cache is a second-level data cache. Assuming that the next-level data cache is a physically indexed, physically tagged (PIPT) cache, the data loading request sent to the next-level data cache further includes the request physical address corresponding to the request virtual address, and the next-level data cache acquires the target data according to the request physical address and returns it to the present-level data cache. Since the cache line is the minimum unit of cache data transmission, in practical applications the next-level data cache returns the cache line data containing the target data to the present-level data cache.
Fig. 3 is a schematic structural diagram of a present-level data cache according to an embodiment of the present application. As shown in fig. 3, the present-level data cache includes a data array, and a valid bit array and an address array corresponding to the data array. The data array (data array) may be regarded as storing data in a matrix: a row of the matrix represents a set, and a column of the matrix represents a way, where each element is a cache line (cache line). The location of each cache line in the data array can be uniquely determined by the set and way in which the cache line is located, and each cache line in the data array has a corresponding valid bit entry in the valid bit array and an address entry in the address array, which respectively store the valid bit and the identification field of the physical address corresponding to that cache line.
Because alias entries have the same offset within a page (i.e., their virtual-address index fields are the same except for the page identification bits) and the same physical-address identification field, determining whether an alias entry exists in the present-level data cache requires screening the identification fields of multiple physical addresses from the data array of the present-level data cache according to the bits of the request virtual address's index field other than the page identification bits, and judging whether any of those identification fields is the same as the identification field of the target physical address. If so, that physical address is the same as the physical address corresponding to the target data, meaning an alias entry exists in the present-level data cache. To eliminate the alias entry, the valid bit of the cache line corresponding to the identification field of the physical address that matches the identification field of the target physical address may be updated to a value representing invalidity, for example, set to 0. The cache line data of the target data is written into an updated cache line in the present-level data cache, and the identification field of the target physical address corresponding to the target data is written into the entry corresponding to the updated cache line in the address array of the present-level data cache. The valid bit corresponding to the updated cache line in the valid bit array of the present-level data cache is then written as a value representing validity, for example, set to 1.
For example, fig. 4 is a diagram illustrating an exemplary structure of the data array of a present-level data cache according to an embodiment of the present application. As shown in fig. 4, the data array of the present-level data cache adopts an 8-way set-associative structure, which includes 64 sets and 8 ways, for a total of 64×8 cache lines. Assume the size of the present-level data cache is 64 kilobytes, the cache line size is 128 bytes, the data length of the request virtual address is 64 bits, i.e., VA[63:0], and the data length of the corresponding request physical address is 50 bits, i.e., PA[49:0]; then the offset field of the request virtual address is VA[6:0], the index field of the request virtual address is VA[12:7], the page identification bit of the request virtual address is VA[12], and the identification field of the request physical address is PA[49:12]. The present-level data cache comprises 2 physical pages of 4 kilobytes each: all cache lines in sets 0 to 31 are located on page 1, and all cache lines in sets 32 to 63 are located on page 2; that is, when VA[12] is 0, the corresponding cache line in the data array of the present-level data cache is located on page 1, and when VA[12] is 1, it is located on page 2.
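Under the field layout of this example (offset VA[6:0], index VA[12:7], page identification bit VA[12], tag PA[49:12]), the address split can be sketched as follows; the function names are illustrative only:

```python
LINE_BYTES = 128  # 128-byte cache line => 7 offset bits, VA[6:0]
SETS = 64         # 64 sets => 6 index bits, VA[12:7]

def split_virtual(va: int):
    """Split a request virtual address into (offset, index, page_id)."""
    offset = va & (LINE_BYTES - 1)   # VA[6:0]
    index = (va >> 7) & (SETS - 1)   # VA[12:7]
    page_id = (va >> 12) & 0x1       # VA[12]: 0 -> page 1, 1 -> page 2
    return offset, index, page_id

def physical_tag(pa: int) -> int:
    """Identification field of the physical address, PA[49:12]."""
    return pa >> 12

# VA 0x1F80 has all six index bits set: set 63, page 2, offset 0
assert split_virtual(0x1F80) == (0, 63, 1)
```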
For example, assuming that the index field of the request virtual address is 000001, if the data loading request misses in the present-level data cache, the identification fields of the multiple physical addresses corresponding to set 1 and set 33 are determined in the data array of the present-level data cache according to VA[11:7], i.e., the index field of the request virtual address excluding the page identification bit, and the valid bit of the cache line whose physical-address identification field is identical to the identification field of the target physical address is updated to 0. The cache line data of the target data is then written into an updated cache line in the present-level data cache, and the valid bit of the updated cache line is updated to 1, so that the valid bits under a multi-page cache are accurately updated. Subsequently, when a new data loading request sent by the processor core is received, the cache line is looked up through the index field of the virtual address output by the processor core, and whether the cache line is valid is judged according to its valid bit; if the cache line is invalid, it is skipped.
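The invalidation step just described can be sketched as follows; the array layout and the function are assumptions made for illustration, not the patent's hardware:

```python
SETS_PER_PAGE = 32  # 64 sets split evenly across 2 pages

def eliminate_aliases(valid, tags, index_no_page, target_tag):
    """Invalidate every valid line whose stored physical tag equals target_tag,
    checking the candidate set on each page; returns the (set, way) pairs hit."""
    invalidated = []
    for page in range(2):
        s = page * SETS_PER_PAGE + index_no_page  # e.g. sets 1 and 33
        for way in range(len(tags[s])):
            if valid[s][way] and tags[s][way] == target_tag:
                valid[s][way] = 0  # valid bit -> value representing invalidity
                invalidated.append((s, way))
    return invalidated

valid = [[0] * 8 for _ in range(64)]
tags = [[None] * 8 for _ in range(64)]
valid[33][2], tags[33][2] = 1, 0xABCDE  # stale alias on page 2, set 33
assert eliminate_aliases(valid, tags, 1, 0xABCDE) == [(33, 2)]
assert valid[33][2] == 0  # the alias entry is now invalid
```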
Thus, in this example, the identification fields of the multiple physical addresses for which alias entries may exist are determined in the present-level cache from the bits of the request virtual address's index field other than the page identification bits. These identification fields are compared with the identification field of the target physical address; if any is the same, the corresponding cache line is an alias entry mapping to the same physical address as the target data, and its valid bit is set to a value representing invalidity. In this way, the valid bits of cache lines corresponding to the same physical address as the target data are correctly maintained in the present-level cache, improving the accuracy of the processor core's access to cached data, so that the cache capacity can be increased through the multi-page architecture.
In practical applications, after a data loading request is received, the present-level cache is first queried for a hit, that is, it is checked whether the data required by the request exists in the present-level cache. The specific method of hit detection is not limited. In one example, the method further comprises:
Determining a plurality of first cache lines in a data array of the data cache according to the index domain of the request virtual address, wherein the index domain of the virtual address corresponding to the plurality of first cache lines is the same as the index domain of the request virtual address, and the corresponding valid bit is a value representing validity;
Inquiring address mapping storage according to the request virtual address to obtain a request physical address corresponding to the request virtual address;
If the identification domain of the request physical address is the same as the identification domain of the physical address of any first cache line, judging that the data loading request hits in the data cache of the level; otherwise, judging that the data loading request is not hit in the data cache of the present level.
Specifically, the index field of the request virtual address indexes the data array of the present-level data cache, and the first cache lines whose virtual-address index fields are identical to the index field of the request virtual address are determined. The index field of the virtual address corresponding to a cache line indicates the set of that cache line in the data array; assuming the index field of the request virtual address is 000000, all valid cache lines in set 0 of the data array are taken as the first cache lines. After the first cache lines are determined, the request physical address corresponding to the request virtual address is obtained by querying the address mapping storage. The identification field of the physical address corresponding to each first cache line is compared with the identification field of the request physical address. If the physical-address identification field of any first cache line is the same as the identification field of the request physical address, that cache line contains the data required by the request, and the data loading request is judged to hit in the present-level data cache; if the physical-address identification fields of all the first cache lines differ from the identification field of the request physical address, none of the cache lines contains the data required by the request, and the data loading request is judged to miss in the present-level data cache.
For example, still referring to fig. 4, assuming that the index field of the request virtual address is 100000, the page identification bit is 1, indicating that the cache line is on page 2, and the remaining bits "00000" indicate the first set within that page, i.e., set 32 of the entire data array. All valid cache lines in set 32 are therefore determined to be the first cache lines. After the request physical address corresponding to the request virtual address is obtained, the physical-address identification fields of all the first cache lines are compared with the identification field of the request physical address; if the comparison shows that the first cache line located in way 3 has a physical-address identification field identical to the identification field of the request physical address, the data loading request is judged to hit in the present-level data cache.
In this example, by determining a plurality of first cache lines in the data array through the index field of the request virtual address, the identification field of the physical address of the first cache lines is compared with the identification field of the request physical address corresponding to the request virtual address, so as to determine whether the data cache of the present level hits.
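A minimal sketch of this hit check, under assumed list-of-lists structures standing in for the valid bit array and address array:

```python
def lookup(valid, tags, set_index, req_tag):
    """Return the hitting way in the set, or None if the request misses."""
    for way in range(len(tags[set_index])):
        if valid[set_index][way] and tags[set_index][way] == req_tag:
            return way  # identification fields match: present-level hit
    return None

valid = [[0] * 8 for _ in range(64)]
tags = [[None] * 8 for _ in range(64)]
valid[32][3], tags[32][3] = 1, 0x123  # as in the fig. 4 example: set 32, way 3
assert lookup(valid, tags, 32, 0x123) == 3     # hit
assert lookup(valid, tags, 32, 0x456) is None  # miss
```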
When determining the physical address corresponding to a virtual address, the address mapping of the virtual address needs to be obtained. In practical applications, the address mappings of virtual addresses can be stored in a unified manner for convenient management; as an example, they may be stored in an address mapping storage. Further, to speed up the address mapping lookup process, the address mapping storage may adopt a multi-level structure. Thus, in one example, querying the address mapping storage according to the request virtual address to obtain the request physical address corresponding to the request virtual address includes:
inquiring whether the address mapping of the request virtual address exists in the first-level address mapping storage; the address mapping characterizes the corresponding relation between the virtual address and the physical address;
If so, obtaining a request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the first-level address mapping storage;
If it does not exist, the address mapping of the request virtual address is obtained from the next-level address mapping storage, the request physical address corresponding to the request virtual address is obtained according to that address mapping, and the address mapping of the request virtual address is added to the first-level address mapping storage. Each address mapping storage other than the last level records the address mappings of only part of the virtual addresses, while the last-level address mapping storage stores the address mappings of all virtual addresses.
Specifically, fig. 5 is a flow chart of obtaining an address mapping according to an embodiment of the present application. As shown in fig. 5, assume the address mapping storage has a two-level storage structure, where the first-level address mapping storage holds the address mappings of recently accessed virtual addresses and may be a translation lookaside buffer (TLB), and the next-level address mapping storage corresponding to it, that is, the last-level address mapping storage, may be the main storage. When the address mapping of the request virtual address is obtained, the first-level address mapping storage is queried first; if the address mapping of the request virtual address is found there, i.e., the first-level address mapping storage hits, the request physical address corresponding to the request virtual address is obtained according to that address mapping. If the address mapping of the request virtual address is not found in the first-level address mapping storage, i.e., the first-level address mapping storage misses, the address mapping of the request virtual address is acquired from the next-level address mapping storage and written into the first-level address mapping storage. Fig. 5 shows only one possible arrangement; in practical applications, to further improve the efficiency of obtaining the request physical address, an address mapping storage structure with more levels may also be adopted, where each level of address mapping storage holds the most recently accessed address mappings of the corresponding next level, which is not limited here.
Through the cooperation of multiple levels of address mapping storage, this example effectively improves the efficiency of acquiring address mappings.
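The two-level lookup of fig. 5 can be sketched with plain dictionaries standing in for the TLB and the last-level (complete) mapping; the names are illustrative, not the patent's:

```python
def translate(vpn, tlb, page_table):
    """Return the physical page number for a virtual page number."""
    if vpn in tlb:            # first-level address mapping storage hits
        return tlb[vpn]
    ppn = page_table[vpn]     # last level records all virtual addresses
    tlb[vpn] = ppn            # refill the first-level storage
    return ppn

tlb, page_table = {}, {0x12: 0x9A}
assert translate(0x12, tlb, page_table) == 0x9A  # first lookup misses the TLB
assert tlb == {0x12: 0x9A}                       # mapping now cached
```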
Specifically, by adopting the data hit mode, when the data load request hits in the data cache of the present level, corresponding data can be returned to the processor core. Then, as an example, the method further comprises:
If the data loading request hits in the data cache of the present level, taking the first cache line with the same identification domain of the physical address and the identification domain of the request physical address in the plurality of first cache lines as a hit cache line;
And acquiring hit data from the hit cache line according to the offset field of the request virtual address, and returning the hit data to the processor core.
Specifically, after the present-level data cache is judged to hit by the above data hit method, the first cache line whose physical-address identification field is the same as the identification field of the request physical address is taken as the hit cache line, and hit data is acquired from the hit cache line according to the offset field of the request virtual address and returned to the processor core. For example, assume, as described above, that the first cache line located at set 32, way 3 of the data array is taken as the hit cache line. If the offset field of the request virtual address is 0000000, the byte at offset 0 of the hit cache line is read as the hit data, and the hit data is returned to the processor core.
When the data loading request is in the data cache hit of the level, the hit data is determined in the hit cache line according to the offset domain of the virtual address of the request and returned to the processor core, so that the efficiency of the processor for accessing the data is effectively improved.
It should be noted that, the present embodiment is not limited to the manner of determining whether the data cache of the present level hits. As an example, a data hit mode combined with way prediction may also be used to avoid opening useless cache ways and reduce power consumption. So in one example, the method further comprises:
determining a group in the data array of the data cache of the current level according to the index field of the request virtual address; determining a predicted cache line in the set based on a way prediction technique;
Inquiring address mapping storage according to the request virtual address to obtain a request physical address corresponding to the request virtual address;
if the identification domain of the request physical address is the same as the identification domain of the physical address of the prediction cache line and the valid bit corresponding to the prediction cache line is a value representing validity, determining that the data loading request hits in the data cache of the present level; otherwise, the data loading request is judged to be missed in the data cache of the level.
Specifically, a set is determined in the data array of the data cache of the present level based on the index field of the requested virtual address, for example, assuming that the index field of the requested virtual address is 100000, the 32 nd set of the data array is selected. Based on a way prediction technology, predicting a way in which a cache line to be accessed is located in a determined group, thereby determining a predicted cache line; the method of obtaining the request physical address corresponding to the request virtual address is as described above, the identification domain of the physical address corresponding to the predicted cache line is obtained from the address array, the identification domain of the physical address of the predicted cache line is compared with the identification domain of the request physical address, if the identification domains are the same, the data cache hit of the present level is determined, and if the identification domains are not the same, the data cache miss of the present level is determined.
This embodiment uses a data hit method based on way prediction. In practical applications, way prediction usually adopts a most-recently-used algorithm, which determines the most frequently accessed way in each set based on the access records of the cache's data array; after a set is determined according to the index field of the virtual address, the cache line to be accessed is determined according to the most recently accessed way in that set. For example, assuming that after set 0 is determined in the data array, the most-recently-used algorithm of the way prediction technique yields way 1 as the most recently accessed way in set 0, the cache line corresponding to set 0, way 1 in the data array is taken as the predicted cache line. After the request physical address corresponding to the request virtual address is obtained, the physical-address identification field of the predicted cache line is compared with the identification field of the request physical address; if the physical-address identification field of the predicted cache line is XXXXX1 and the identification field of the request physical address is XXXXX0, the data loading request is judged to miss in the present-level data cache.
By determining a set in the data array from the index field of the request virtual address and determining a predicted cache line in that set using way prediction, this example avoids comparing the physical-address identification fields of multiple cache lines one by one, effectively reducing power consumption.
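A hedged sketch of this prediction-based hit check: only the predicted way's tag is compared, and a mismatch is treated as a present-level miss (the data structures are assumed for illustration):

```python
def predicted_lookup(mru_way, valid, tags, set_index, req_tag):
    """Compare only the predicted (most recently used) way's tag."""
    way = mru_way[set_index]  # way prediction for this set
    if valid[set_index][way] and tags[set_index][way] == req_tag:
        return way            # predicted hit
    return None               # judged a miss without opening other ways

mru_way = {0: 1}
valid = {0: [1, 1]}
tags = {0: [0b111110, 0b111111]}  # way 1 holds tag ...1, way 0 holds tag ...0
assert predicted_lookup(mru_way, valid, tags, 0, 0b111111) == 1
assert predicted_lookup(mru_way, valid, tags, 0, 0b111110) is None
```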
Specifically, by adopting a data hit mode based on way prediction, when the data loading request is judged to hit in the data cache of the present level, corresponding data can be returned to the processor core. Accordingly, in one example, the method further comprises:
If the data loading request hits in the data cache of the present level, according to the offset domain of the request virtual address, obtaining predicted cache data from the predicted cache line, and returning the predicted cache data to the processor core.
Specifically, after the present-level data cache is judged to hit, the corresponding byte data is determined in the predicted cache line as the predicted cache data according to the offset field of the request virtual address, and the predicted cache data is returned to the processor core. For example, assuming the offset field of the request virtual address is 1111111, the byte at offset 127 of the predicted cache line is read as the hit data and returned to the processor core.
When the data loading request hits in the present-level data cache, the predicted cache data is determined in the predicted cache line according to the offset field of the request virtual address and returned to the processor core, effectively improving the efficiency of the processor's data access.
In practical applications, the updating of the cache data also needs to be considered. For example, when the data cache of the present level is not hit, after the target data returned by the data cache of the next level is obtained, the target data needs to be written into the data cache of the present level to update the data in the cache. Further, to determine a location of a cache line in the current level data cache where the target data is written, in one example, before writing the target data to the updated cache line in the current level data cache, the method further includes:
And determining the updated cache line from one group corresponding to the index domain of the request virtual address in the data array of the data cache of the current level based on a least recently used algorithm, and clearing the current data in the updated cache line.
Specifically, a set is determined in the data array of the present-level data cache based on the index field of the request virtual address, in the manner described above. For example, assuming the index field of the request virtual address is VA[12:7] and its value is 011111, set 31 is selected. Based on the least recently used algorithm, the cache line that has not been used for the longest time in the determined set is chosen as the updated cache line into which the target data will be written, and the current data in the updated cache line is cleared.
In practical applications, there are various implementations of the least recently used algorithm; the matrix method is illustrated here as an example. First, a matrix is defined whose numbers of rows and columns both equal the number of cache lines; assuming the present-level data cache includes 4 cache lines, a 4×4 matrix is defined. When a cache line is accessed, all bits in the matrix row corresponding to that cache line are set to 1, and then all bits in the matrix column corresponding to that cache line are set to 0; the cache line whose matrix row contains the fewest 1s is used as the updated cache line.
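The matrix method above, sketched for 4 ways (a toy model for illustration, not the patent's circuit):

```python
N = 4  # number of cache lines tracked
matrix = [[0] * N for _ in range(N)]

def touch(i):
    """Record an access to line i: set row i to all 1s, then clear column i."""
    for j in range(N):
        matrix[i][j] = 1
    for r in range(N):
        matrix[r][i] = 0

def lru_way():
    """The line whose row holds the fewest 1s is least recently used."""
    return min(range(N), key=lambda i: sum(matrix[i]))

for way in (0, 1, 2, 3, 0):
    touch(way)
assert lru_way() == 1  # way 1 has gone longest without an access
```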
The cache line to be replaced is determined through the least recently used algorithm, and the locality principle is effectively utilized, so that cache data stored in the data cache of the present level is the most commonly accessed data or a memory area near the data, and the number of times that the processor kernel accesses the main memory is reduced.
In the cache processing method provided by this embodiment, page identification bits are defined in the virtual address of a cache line to indicate the page where the cache line is located. When a data loading request of the processor core is received, the request virtual address is obtained and a query is performed in the present-level cache. If the request misses in the present-level data cache, the target data corresponding to the request and the identification field of its physical address are obtained from the next-level data cache and returned to the processor core, and the valid bits of the relevant cache lines are updated: according to the bits of the request virtual address's index field other than the page identification bits, the identification fields of multiple physical addresses are determined from the address array of the present-level data cache, the valid bit of the cache line whose physical-address identification field is identical to the identification field of the target physical address is updated to a value representing invalidity, and the valid bit of the updated cache line into which the target data is written is updated to a value representing validity. In this scheme, page identification bits are defined in the virtual address of the cache line, and the valid bit of each cache line in the data cache is updated based on the page identification bits, so that valid-bit management and maintenance under page-crossing caching are realized accurately and quickly, the index range of the data cache can be widened, the cache capacity is expanded, and the accuracy of data access is ensured.
Embodiment Two
Fig. 6 is a schematic structural diagram of a cache processing device according to an embodiment of the application. As shown in fig. 6, the cache processing apparatus 600 provided in this embodiment may include:
A first address calculation unit 61, configured to obtain a request virtual address according to a data loading request of the processor core; the index domain of the request virtual address comprises page identification bits, and the values of different page identification bits represent that the cache line corresponding to the request virtual address is positioned on different pages in the data array of the data cache of the level; the data array of the present level data cache comprises a plurality of pages;
A first miss queue 62, configured to store an index field of the requested virtual address if the data load request misses in the current level data cache, and send the data load request to a next level data cache, so that the next level data cache returns, according to the data load request, target data and an identification field of a target physical address of the target data;
A first result selection module 63, configured to return the target data to the processor core;
A return identifier comparing module 64, configured to determine, from the address array of the present-level data cache, the identification fields of multiple physical addresses according to the bits of the request virtual address's index field other than the page identification bits; and to update the valid bit of the cache line corresponding to the identification field of the physical address that is the same as the identification field of the target physical address to a value representing invalidity;
a first cache update module 65, configured to write the target data into an updated cache line in the level data cache;
the return identity comparison module 64 is further configured to update the valid bit of the updated cache line to a value that characterizes the validity.
In practical application, the cache processing device may be implemented by a computer program, for example, application software or the like; or may be embodied as a medium storing a related computer program, e.g., a usb disk, a cloud disk, etc.; or may also be implemented by physical means, e.g. chips, servers, etc., integrated or installed with the relevant computer program.
Specifically, the data cache of the present level is a cache that currently receives a data loading request, where the data cache of the present level may be a first-level data cache, or may be a second-level data cache, or the like, specifically according to a data loading process and a storage structure; obtaining the request virtual address according to the data loading request of the processor core includes performing address calculation according to the data loading request, and calculating the request virtual address from an operand of the data loading request. The index field of the request virtual address comprises page identification bits, and different values of the page identification bits represent cache lines corresponding to the request virtual address and are positioned on different pages in the data array of the data cache of the level. For example, the request virtual address may include an upper field, an index field (index), and an offset field (offset), and the physical address may be divided into an identification field (tag) and a lower field. In practical application, the offset domain of the physical address is the same as the low-order domain of the virtual address, and a mapping relationship exists between the identification domain of the physical address and the address segments of the virtual address except for the offset domain, so that the physical address corresponding to a certain virtual address can be determined according to the mapping relationship.
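The field split just described can be sketched as follows. The bit widths are illustrative assumptions (64-byte cache lines giving an offset field of VA[5:0], and a 7-bit index field VA[12:6] matching the examples below), not a definitive layout:

```python
LINE_OFFSET_BITS = 6   # 64-byte cache lines -> offset field VA[5:0] (assumed)
INDEX_BITS = 7         # index field VA[12:6], as in the examples below (assumed)

def split_virtual_address(va: int):
    """Split a virtual address into (upper field, index field, offset field)."""
    offset = va & ((1 << LINE_OFFSET_BITS) - 1)
    index = (va >> LINE_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    upper = va >> (LINE_OFFSET_BITS + INDEX_BITS)
    return upper, index, offset
```

For example, `split_virtual_address(0x1FC0)` yields an index field of 0x7F with a zero offset, since bits [12:6] of 0x1FC0 are all set.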
Specifically, the page identification bits are part of the bits in the index field. In one example, the page identification bit of the requesting virtual address is the most significant bit of the index field of the requesting virtual address. For example, assuming that the data array of the data cache of the present level includes 2 pages, the index field of the request virtual address is VA [12:6], the page identification bit defining the request virtual address is VA [12], VA [12] is 0 to indicate that the corresponding cache line of the request virtual address in the data array of the data cache of the present level is located on page 1, and VA [12] is 1 to indicate that the corresponding cache line of the request virtual address in the data array of the data cache of the present level is located on page 2. It should be noted that the foregoing is merely an example, the page identification bit may also be a plurality of bits, and specifically, the data length of the page identification bit may be determined according to the number of pages. Still in combination with the above example, assuming that the data array of the present level data cache includes 4 pages, the index field of the requested virtual address may be defined as VA [13:6], where VA [13:12] is the page identification bit. 
Specifically, VA [13:12] is 00 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 1, VA [13:12] is 01 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 2, VA [13:12] is 10 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 3, and VA [13:12] is 11 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 4.
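The mapping from page identification bits to page number described in the two paragraphs above can be sketched as follows, assuming (as in the examples) that the page identification bits are the top bits of the index field, starting at VA[12]:

```python
def page_of_cache_line(va: int, num_pages: int) -> int:
    """Return the 1-based page selected by the page identification bits.

    With 2 pages the page identification bit is VA[12]; with 4 pages it is
    VA[13:12], matching the examples in the text. num_pages is assumed to be
    a power of two."""
    return ((va >> 12) & (num_pages - 1)) + 1
```

With 2 pages, an address with VA[12] = 1 maps to page 2; with 4 pages, VA[13:12] = 11 maps to page 4.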
This example facilitates finding aliased items caused by page crossing in the data cache by defining page identification bits for the virtual address, thereby eliminating the aliased items.
And inquiring in the data cache of the level according to the request virtual address, and when the data loading request is not hit in the data cache of the level, namely the data cache of the level does not contain the data requested to be accessed, storing an index field of the request virtual address and sending the data loading request to the data cache of the next level. For example, assuming that the present level data cache may be a level one data cache, the next level data cache is a level two data cache. Assuming that the next-level data cache is a physical cache, the data loading request sent to the next-level data cache further comprises a request physical address corresponding to the request virtual address, and the next-level data cache acquires target data according to the request physical address and returns the target data to the current-level data cache. Since the cache line is the minimum unit of cache data transmission, in practical application, the next level data cache returns the cache line data where the target data is located to the current level data cache.
Because alias entries have the same offset within the page (i.e., their virtual address index fields are identical except for the page identification bit) and the same physical address identification field, determining whether an alias entry exists in the present level data cache requires screening the identification fields of multiple physical addresses from the address array of the present level data cache according to each bit of the index field of the request virtual address except the page identification bit, and judging whether any of these identification fields is the same as the identification field of the target physical address. If so, the corresponding cache line maps to the same physical address as the target data, so an alias entry exists in the present level data cache. To eliminate the alias entry, the valid bit of the cache line whose physical address identification field is the same as the identification field of the target physical address is updated to a value representing invalidity, for example set to 0. The cache line data containing the target data is written into the updated cache line in the present level data cache, and the identification field of the target physical address corresponding to the target data is written into the entry corresponding to the updated cache line in the address array of the present level data cache. The valid bit corresponding to the updated cache line in the valid bit array of the present level data cache is then written as a value representing validity, for example 1.
Thus, in this example, according to each bit of the index field of the request virtual address except the page identification bit, the identification fields of the plurality of physical addresses for which alias entries may exist are determined in the present level data cache; these identification fields are compared with the identification field of the target physical address, and if any is the same, the corresponding cache line is an alias entry mapping to the same physical address as the target data, and its valid bit is set to a value representing invalidity. In this way, the valid bits of cache lines corresponding to the same physical address as the target data are maintained in the present level data cache, improving the accuracy with which the processor core accesses cached data, so that the cache capacity can be increased through a multi-page architecture.
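A minimal model of the alias-elimination step described above. The flat-list representation, the simplified direct-mapped layout, and the helper names are illustrative assumptions, not the embodiment's actual structures:

```python
def eliminate_aliases(address_array, valid_bits, target_tag,
                      req_index, num_pages, sets_per_page):
    """Invalidate any valid line whose physical tag equals the returned target
    tag but which sits on another page with the same low index bits, i.e. a
    cross-page alias of the physical line being filled.

    address_array / valid_bits are flat lists indexed by the full index field
    (page identification bits * sets_per_page + low index bits)."""
    # index bits except the page identification bit(s)
    low_index = req_index % sets_per_page
    for page in range(num_pages):
        idx = page * sets_per_page + low_index
        if idx != req_index and valid_bits[idx] and address_array[idx] == target_tag:
            valid_bits[idx] = 0  # value representing invalidity
```

After the invalidation, the fill path writes the target data into the updated cache line, records its tag in the address array, and sets its valid bit to a value representing validity.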
In practical application, after receiving a data loading request, firstly, inquiring whether hit occurs in the level-one cache, namely detecting whether data required by the request exists in the level-one cache. Specifically, the method of detecting whether a hit is detected is not limited. In one example, the apparatus further comprises:
The first cache inquiry module is used for determining a plurality of first cache lines in the data array of the data cache according to the index domain of the request virtual address, wherein the index domain of the virtual address corresponding to the first cache lines is the same as the index domain of the request virtual address, and the corresponding valid bit is a value representing validity;
The first mapping inquiry module is used for inquiring address mapping storage according to the request virtual address to obtain a request physical address corresponding to the request virtual address;
the prediction identification comparison module is used for judging that the data loading request hits in the data cache of the level if the identification domain of the request physical address is the same as the identification domain of the physical address of any first cache line; otherwise, judging that the data loading request is not hit in the data cache of the present level.
Specifically, the index field of the request virtual address indexes the data array of the present level data cache, determining the first cache lines whose virtual address index fields are identical to the index field of the request virtual address. The index field of the virtual address corresponding to a cache line indicates the set of that cache line in the data array; assuming the index field of the request virtual address is 000000, all valid cache lines in set 0 of the data array are taken as the first cache lines. After the first cache lines are determined, the request physical address corresponding to the request virtual address is obtained by querying the address mapping storage. The identification field of the physical address corresponding to each first cache line is compared with the identification field of the request physical address; if the physical address identification field of any first cache line is the same as the identification field of the request physical address, that cache line contains the requested data, and the data loading request is judged to hit in the present level data cache. If the physical address identification fields of all the first cache lines differ from the identification field of the request physical address, none of these cache lines contains the requested data, and the data loading request is judged to miss in the present level data cache.
In this example, by determining a plurality of first cache lines in the data array through the index field of the request virtual address, the identification field of the physical address of the first cache lines is compared with the identification field of the request physical address corresponding to the request virtual address, so as to determine whether the data cache of the present level hits.
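The hit check over the ways of one indexed set can be sketched as follows (the per-way tuple representation is an illustrative assumption):

```python
def lookup_set(set_ways, req_tag):
    """set_ways: list of (physical tag, valid bit) per way of the indexed set.
    Returns the hit way index, or None on a miss."""
    for way, (tag, valid) in enumerate(set_ways):
        if valid and tag == req_tag:  # valid bit set and tags equal -> hit
            return way
    return None
```

Note that an invalid way is skipped even when its stored tag matches, which is exactly the role of the valid bit maintained by the alias-elimination logic.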
When determining the physical address corresponding to the virtual address, the address mapping of the virtual address needs to be obtained, and in practical application, the address mapping of the virtual address can be uniformly stored so as to be convenient to manage. As an example, it may be stored in an address mapping store. Alternatively, to speed up the address map lookup process, the address map store may employ a multi-level structure. Thus, in one example, the first mapping query module may be specifically configured to:
inquiring whether the address mapping of the request virtual address exists in the first-level address mapping storage; the address mapping characterizes the corresponding relation between the virtual address and the physical address;
If so, obtaining a request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the first-level address mapping storage;
If the virtual address does not exist, the address mapping of the request virtual address is obtained from the next-stage address mapping storage, the request physical address corresponding to the request virtual address is obtained according to the address mapping of the request virtual address in the next-stage address mapping storage, and the address mapping of the request virtual address is added into the first-stage address mapping storage; the address maps other than the last stage store address maps for recording part of the virtual addresses, and the last stage address map stores address maps for storing all the virtual addresses.
Specifically, assume the address mapping storage comprises a two-level storage structure: the first-level address mapping storage stores the address mappings of recently accessed virtual addresses and may be an address translation fast table, and its next-level address mapping storage, i.e. the last-level address mapping storage, may be main storage. When obtaining the address mapping of the request virtual address, the first-level address mapping storage is queried first; if the address mapping of the request virtual address is found there, i.e. the first-level address mapping storage hits, the request physical address corresponding to the request virtual address is obtained according to that address mapping. If the address mapping of the request virtual address is not found in the first-level address mapping storage, i.e. the first-level address mapping storage misses, the address mapping of the request virtual address is obtained from the next-level address mapping storage and written into the first-level address mapping storage. On this basis, to further improve the efficiency of obtaining the request physical address in practical application, an address mapping storage structure with more levels may be adopted, in which each level of address mapping storage stores the most recently accessed address mappings of its corresponding next level; this is not limited herein.
The method and the device can effectively improve the efficiency of acquiring the address mapping through the cooperation of multi-level address mapping storage.
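The two-level lookup-and-fill behaviour described above can be sketched with dictionaries standing in for the first-level and last-level address mapping stores (names and representation are illustrative):

```python
def translate(va_page, first_level, last_level):
    """Return the physical page for va_page. On a first-level miss, fetch the
    mapping from the last-level store (which holds all mappings) and fill the
    first level so that future lookups hit quickly."""
    if va_page in first_level:
        return first_level[va_page]      # first-level hit
    pa_page = last_level[va_page]        # first-level miss: go to next level
    first_level[va_page] = pa_page       # fill the first-level store
    return pa_page
```

After one miss, the mapping is resident in the first-level store, so a repeated translation of the same virtual page no longer touches the last level.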
Specifically, by adopting the data hit mode, when the data load request hits in the data cache of the present level, corresponding data can be returned to the processor core. Then, as an example, the apparatus further comprises:
The data acquisition module is used for taking the first cache line with the same identification domain of the physical address as the identification domain of the request physical address in the first cache lines as a hit cache line if the data loading request hits in the data cache of the level; acquiring hit data from a hit cache line according to the offset field of the request virtual address;
The first result selection module 63 is further configured to return the hit data to the processor core.
Specifically, after the data hit mode is adopted to determine the data cache hit of the present level, a first cache line with the same identification domain of the physical address as the identification domain of the request physical address is used as a hit cache line, and hit data is acquired from the hit cache line according to the offset domain of the request virtual address and returned to the processor core.
When the data loading request is in the data cache hit of the level, the hit data is determined in the hit cache line according to the offset domain of the virtual address of the request and returned to the processor core, so that the efficiency of the processor for accessing the data is effectively improved.
It should be noted that, the present embodiment is not limited to the manner of determining whether the data cache of the present level hits. As an example, a data hit mode combined with way prediction may also be used to avoid opening useless cache ways and reduce power consumption. Thus, in one example, the apparatus may further comprise:
The first path prediction module is used for determining a group in the data array of the data cache of the current level according to the index domain of the request virtual address; determining a predicted cache line in the set based on a way prediction technique;
The first mapping query module is used for querying address mapping storage according to the request virtual address to obtain a request physical address corresponding to the request virtual address;
the prediction identity comparing module is further configured to determine that the data loading request hits in the data cache of the present level if the identification field of the physical address of the request is the same as the identification field of the physical address of the prediction cache line, and the valid bit corresponding to the prediction cache line is a value representing validity; otherwise, the data loading request is judged to be missed in the data cache of the level.
Specifically, a set is determined in the data array of the data cache of the present level based on the index field of the requested virtual address, for example, assuming that the index field of the requested virtual address is 100000, the 32 nd set of the data array is selected. Based on a way prediction technology, predicting a way in which a cache line to be accessed is located in a determined group, thereby determining a predicted cache line; the method of obtaining the request physical address corresponding to the request virtual address is as described above, the identification domain of the physical address corresponding to the predicted cache line is obtained from the address array, the identification domain of the physical address of the predicted cache line is compared with the identification domain of the request physical address, if the identification domains are the same, the data cache hit of the present level is determined, and if the identification domains are not the same, the data cache miss of the present level is determined.
In this embodiment, a data hit method based on way prediction is used. In practical application, way prediction usually uses a most-recently-used algorithm: based on the access record of the data array of the data cache, the most frequently accessed way in each set is determined, and after a set is determined according to the index field of the virtual address, the cache line to be accessed is determined according to the most frequently accessed way in that set.
The present example effectively reduces power consumption by requesting an index field of a virtual address, determining a set in a data array, and determining a predicted cache line in the set using way prediction, avoiding comparing the identification fields of the physical addresses of multiple cache lines one by one.
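A sketch of the way-predicted probe: only the predicted way's tag is compared, avoiding a comparison against every way of the set (MRU tracking and misprediction recovery are elided, and the representation is an illustrative assumption):

```python
def predicted_lookup(set_ways, predicted_way, req_tag):
    """Probe only the predicted way of the set; return True on a predicted hit.

    set_ways: list of (physical tag, valid bit) per way. A False result is
    treated as a miss here, even though the full comparison path might still
    find the line in another way (the misprediction case)."""
    tag, valid = set_ways[predicted_way]
    return bool(valid) and tag == req_tag
```

This is the power saving the text refers to: one tag comparison per access instead of one per way.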
Specifically, by adopting a data hit mode based on way prediction, when the data loading request is judged to hit in the data cache of the present level, corresponding data can be returned to the processor core. Accordingly, in one example, the data acquisition module may also be configured to:
If the data loading request hits in the data cache of the present level, obtaining predicted cache data from the predicted cache line according to the offset domain of the virtual address of the request;
the first result selection module 63 is further configured to return the prediction cache data to the processor core.
Specifically, after the data cache hit of the present level is determined, corresponding byte data is determined in the predicted cache line as predicted cache data according to the offset field of the request virtual address, and the predicted cache data is returned to the processor core.
When the data loading request is in the data cache hit of the level, the hit data is determined in the hit cache line according to the offset domain of the virtual address of the request and returned to the processor core, so that the efficiency of the processor for accessing the data is effectively improved.
In practical applications, the updating of the cache data also needs to be considered. For example, when the data cache of the present level is not hit, after the target data returned by the data cache of the next level is obtained, the target data needs to be written into the data cache of the present level to update the data in the cache. Further, to determine a location of a cache line in the level data cache where the target data is written, in one example, the first cache update module may be further configured to:
And determining the updated cache line from one group corresponding to the index domain of the request virtual address in the data array of the data cache of the current level based on a least recently used algorithm, and clearing the current data in the updated cache line.
Specifically, a set is determined in the data array of the data cache of the present level based on the index field of the requested virtual address, in the manner described above. Based on the least recently used algorithm, in the determined one group, a cache line that has not been used for the longest time is determined as an updated cache line to be written with the target data, and the current data in the updated cache line is cleared.
The cache line to be replaced is determined through the least recently used algorithm, and the locality principle is effectively utilized, so that cache data stored in the data cache of the present level is the most commonly accessed data or a memory area near the data, and the number of times that the processor kernel accesses the main memory is reduced.
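The least-recently-used selection described above, sketched with a per-way last-use timestamp (the timestamp representation is an illustrative assumption):

```python
def choose_update_line(last_used):
    """Pick the way of the indexed set that has gone unused the longest.

    last_used: one last-access timestamp per way; the smallest timestamp
    identifies the least recently used way, which becomes the updated
    cache line to receive the target data."""
    return min(range(len(last_used)), key=lambda way: last_used[way])
```

The chosen way's current data is then cleared and the cache line data returned by the next level data cache is written in its place.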
In the cache processing device provided by this embodiment, a page identification bit is defined in the virtual address of a cache line to indicate the page on which the cache line is located. When a data loading request from the processor core is received, a request virtual address is obtained and a query is performed in the present level data cache. If there is no hit in the present level data cache, the target data corresponding to the request and the identification field of its physical address are obtained from the next level data cache and returned to the processor core, and the valid bits of the cache lines are updated: according to each bit of the index field of the request virtual address except the page identification bit, identification fields of a plurality of physical addresses are determined from the address array of the present level data cache; the valid bit of any cache line whose physical address identification field is the same as the identification field of the target physical address is updated to a value representing invalidity; and the valid bit of the updated cache line in the present level data cache, into which the target data is written, is updated to a value representing validity. In this scheme, a page identification bit is defined in the virtual address of a cache line, and the valid bit of each cache line in the data cache is updated based on the page identification bit, so that valid bit management and maintenance in the cross-page cache case are realized accurately and rapidly, the index range of the data cache can be widened, the cache capacity is expanded, and the accuracy of data access is ensured.
Example III
Fig. 7 is a schematic structural diagram of a cache processing system according to an embodiment of the present application, as shown in fig. 7, where the cache processing system includes:
The first address calculation unit 61 is configured to receive an input data load instruction, calculate a request virtual address from an operand of the data load instruction by address calculation, and output the request virtual address to the first way prediction module for later access to the memory space. Assuming that the requested virtual address is VA [63:0], the index field of the requested virtual address is VA [12:7], the page identification bit of the requested virtual address is VA [12].
The first way prediction module is used for predicting which way, within a set of the data array of the present level data cache, holds the cache line to be accessed. By adopting way prediction, comparing the identifications of all ways in a set can be avoided, improving speed and allowing the cache line to be read out to be determined as early as possible. Because way prediction may mispredict, the prediction result needs to be confirmed and errors remedied.
The first way prediction module is used for determining a group in a data array of the first-level data cache according to an index field VA [12:7] of the request virtual address. A predicted cache line is determined in the set based on the way prediction. And acquiring an identification domain of a physical address corresponding to the predicted cache line from the directory of the first-level data cache as a predicted identification.
An address translation fast table (i.e., a translation lookaside buffer) is a cache of address mappings in which recently used address translation entries are stored for fast translation of virtual addresses to physical addresses. The first mapping query module is configured to look up, according to the request virtual address, the request physical address corresponding to the request virtual address in the address translation fast table. Because the low 12 bits of the request virtual address are the same as the low 12 bits of the request physical address, the corresponding mapping relationship can be looked up according to VA[63:12] to obtain the identification field of the request physical address; if the required mapping is not in the address translation fast table, it needs to be fetched from the next level of storage into the address translation fast table. Taking the identification field of the physical address as the identification, and assuming the request physical address is PA[49:0], the actual identification, i.e. the identification field of the request physical address, is PA[49:12].
The prediction identity comparing module is used for comparing the prediction identity and the actual identity, if the prediction identity and the actual identity are the same, the corresponding data are indicated to exist in the primary data cache, and the prediction cache data read from the prediction cache line are output to the first result selecting module 63; if not, the corresponding data is not found in the primary data cache, and the load request, the index field VA [12:7] of the request virtual address, and the actual identifier enter the first miss queue 62 to wait for data to be fetched from the secondary cache.
After receiving the data loading request, the second-level cache sends the returned data and the identification field (namely the returned identification) of the corresponding physical address to the return control module based on the index field VA [12:7] of the virtual address of the request and the actual identification.
The return control module is configured to send the return data to the first result selection module 63, so that the result can be output as early as possible. The return data is sent to the first-level cache data for updating the cache; and, a return identification is also sent to the first miss queue 62 for subsequent cache directory updates.
The first miss queue 62 is also used to hold the way of a cache set, in the data array of the first-level data cache, determined by the least recently used algorithm. The first cache update module uses this information to write the data returned by the second-level cache to the corresponding cache line in the first-level data cache. The first miss queue 62 is also used to send the return identification to the directory of the first-level data cache and to the return identification comparison module 64. The return identification comparison module 64 is used to update the valid bit of the cache line into which the return data is written.
The return identifier comparing module 64 is further configured to perform an identifier comparison when data is returned: the return identification is compared with the comparison identifications in the primary data cache whose VA[11:7] is the same, the return identification is written into the directory of the primary data cache, the valid bit corresponding to any comparison identification with an equal comparison result is set to 0 (marking an alias entry, or an entry that was mispredicted yet matches), and the valid bit corresponding to the return data is set to 1.
The first result selecting module 63 is configured to select a final result for outputting. And selecting the predicted cache data when the predicted identifier and the actual identifier are the same according to the comparison result output by the predicted identifier comparison module, and selecting the returned data when the predicted identifier and the actual identifier are different.
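For the concrete layout of this embodiment (index field VA[12:7], page identification bit VA[12], actual identification PA[49:12], low 12 bits shared between virtual and physical address), field extraction can be sketched as follows; the function name is illustrative:

```python
def extract_fields(va: int, pa: int):
    """Return (index field, page identification bit, actual identification)
    for the bit layout used in this embodiment."""
    # With 4 KB pages, the low 12 bits of VA and PA are identical.
    assert (va & 0xFFF) == (pa & 0xFFF), "low 12 bits of VA and PA must agree"
    index = (va >> 7) & 0x3F            # VA[12:7], 6 bits
    page_bit = (va >> 12) & 0x1         # VA[12], top bit of the index field
    tag = (pa >> 12) & ((1 << 38) - 1)  # PA[49:12], the actual identification
    return index, page_bit, tag
```

Note that the page identification bit VA[12] is the one index bit that lies outside the shared low 12 bits, which is precisely why cross-page aliases of the same physical line can land in two different sets and must be invalidated by the return identification comparison.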
The specific parameters are given only for the convenience of description of the present solution, and the changes of these parameters do not affect the application and effect of the present solution, nor affect the protection scope of the present solution.
In the cache processing system provided by this embodiment, a page identification bit is defined in the virtual address of a cache line to indicate the page on which the cache line is located. When a data loading request from the processor core is received, a request virtual address is obtained and a query is performed in the present level data cache. If there is no hit in the present level data cache, the target data corresponding to the request and the identification field of its physical address are obtained from the next level data cache and returned to the processor core, and the valid bits of the cache lines are updated: according to each bit of the index field of the request virtual address except the page identification bit, identification fields of a plurality of physical addresses are determined from the address array of the present level data cache; the valid bit of any cache line whose physical address identification field is the same as the identification field of the target physical address is updated to a value representing invalidity; and the valid bit of the updated cache line in the present level data cache, into which the target data is written, is updated to a value representing validity. In this scheme, a page identification bit is defined in the virtual address of a cache line, and the valid bit of each cache line in the data cache is updated based on the page identification bit, so that valid bit management and maintenance in the cross-page cache case are realized accurately and rapidly, the index range of the data cache can be widened, the cache capacity is expanded, and the accuracy of data access is ensured.
Example IV
FIG. 8 is a flowchart illustrating another cache processing method according to an embodiment of the present application. As shown in fig. 8, in order to further improve the cache processing performance, another cache processing method provided in this embodiment may include:
S801, obtaining a request virtual address according to a data loading request of a processor core; the index domain of the request virtual address comprises page identification bits, and the values of different page identification bits represent that the cache line corresponding to the request virtual address is positioned on different pages in the data array of the data cache of the level; the data array of the present level data cache comprises a plurality of pages;
S802, inquiring an address conversion cache according to the request virtual address to obtain a request physical address corresponding to the request virtual address; determining a group in the data array of the data cache of the current level according to the index field of the request virtual address; determining a predicted cache line in the set based on a way prediction technique;
S803, if the identification field of the predicted physical address of the predicted cache line is different from the identification field of the request physical address, determining identification fields of a plurality of physical addresses from the address array of the present level data cache according to each bit of the index field of the request virtual address except the page identification bit; and for the cache line whose physical address identification field is the same as the identification field of the request physical address and whose corresponding virtual address has a page identification bit different from the page identification bit of the request virtual address, updating the valid bit of that cache line to a value representing invalidity.
In practical applications, the execution body of this embodiment may be a cache processing device applied to the present level data cache, where the device may be implemented by a computer program, for example, application software, etc.; or may be embodied as a medium storing a related computer program, e.g., a USB flash drive, a cloud disk, etc.; or may also be implemented by a physical device integrated or installed with the relevant computer program, e.g., a chip, a server, etc.
Specifically, the present level data cache is the cache that currently receives the data loading request; it may be a first level data cache, a second level data cache, or the like, depending on the specific data loading process and storage structure. Obtaining the request virtual address according to the data loading request of the processor core includes performing address calculation according to the data loading request, i.e., calculating the request virtual address from an operand of the data loading request. The index field of the request virtual address comprises a page identification bit, and different values of the page identification bit indicate that the cache line corresponding to the request virtual address is located on different pages in the data array of the present level data cache. For example, the request virtual address may comprise a high-order field, an index field (index), and an offset field (offset), and the physical address may be divided into an identification field (tag) and a low-order field. In practical applications, the low-order field of the physical address is identical to the corresponding low-order bits of the virtual address, and a mapping relationship exists between the identification field of the physical address and the address segments of the virtual address other than the offset field, so that the physical address corresponding to a given virtual address can be determined according to this mapping relationship.
Wherein the page identification bits are part of the bits in the index field. In one example, the page identification bit of the requesting virtual address is the most significant bit of the index field of the requesting virtual address. For example, assuming that the data array of the data cache of the present level includes 2 pages, the index field of the request virtual address is VA [12:6], the page identification bit defining the request virtual address is VA [12], VA [12] is 0 to indicate that the corresponding cache line of the request virtual address in the data array of the data cache of the present level is located on page 1, and VA [12] is 1 to indicate that the corresponding cache line of the request virtual address in the data array of the data cache of the present level is located on page 2. It should be noted that the foregoing is merely an example, the page identification bit may also be a plurality of bits, and specifically, the data length of the page identification bit may be determined according to the number of pages. Still in combination with the above example, assuming that the data array of the present level data cache includes 4 pages, the index field of the requested virtual address may be defined as VA [13:6], where VA [13:12] is the page identification bit. 
Specifically, VA [13:12] is 00 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 1, VA [13:12] is 01 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 2, VA [13:12] is 10 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 3, and VA [13:12] is 11 to represent that the corresponding cache line of the request virtual address in the data array of the data cache of the current level is positioned on page 4.
This example facilitates finding aliased items caused by page crossing in the data cache by defining page identification bits for the virtual address, thereby eliminating the aliased items.
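As an illustrative sketch (not part of the claimed method), the bit layout of the 4-page example above can be expressed in code; the field widths below (offset VA[5:0], index VA[13:6], page identification bits VA[13:12]) follow that example and are assumptions for demonstration only.

```python
# Hypothetical field layout from the 4-page example above:
# offset = VA[5:0], index = VA[13:6], page identification bits = VA[13:12].
def decode_virtual_address(va: int):
    offset = va & 0x3F            # VA[5:0]: byte offset within the cache line
    index = (va >> 6) & 0xFF      # VA[13:6]: selects a set in the data array
    page_id = (va >> 12) & 0x3    # VA[13:12]: 0b00..0b11 -> page 1..page 4
    return offset, index, page_id

# VA with bits 12 and 13 set: page identification bits are 0b11 -> page 4
print(decode_virtual_address(0x3040))
```

Any address whose bits VA[13:12] differ refers to a different page of the data array, which is exactly how the page-crossing alias items are told apart.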
When the data array of the present level data cache is accessed, a way prediction method may be adopted. Specifically, a set is determined in the data array of the present level data cache according to the index field of the request virtual address; for example, assuming the index field of the request virtual address is 100000, set 32 of the data array is selected. Based on the way prediction technique, the way in which the cache line to be accessed is located is predicted within the determined set, thereby determining the predicted cache line. In practical applications, way prediction usually uses a most-recently-used policy: based on the access record of the data array, the most recently accessed way in each set is recorded, and after a set is determined according to the index field of the virtual address, the cache line to be accessed is predicted according to the most recently accessed way in that set. For example, assuming that after set 0 is determined in the data array, the most recently accessed way in set 0 is found to be way 1, the cache line corresponding to set 0, way 1 in the data array is taken as the predicted cache line.
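A minimal sketch of this set selection and way prediction, assuming a per-set table that records the most recently accessed way (the table and the 64-set geometry are illustrative, not structures specified by the patent):

```python
NUM_SETS = 64  # illustrative geometry

def predict_cache_line(index_field: int, mru_way: list) -> tuple:
    """Select a set from the index field, then predict the way from
    the most recently accessed way recorded for that set."""
    set_idx = index_field % NUM_SETS   # e.g. index 0b100000 selects set 32
    return set_idx, mru_way[set_idx]

mru_way = [0] * NUM_SETS
mru_way[0] = 1                         # way 1 was most recently hit in set 0
print(predict_cache_line(0b100000, mru_way))
```

The prediction is only a guess; as the text explains next, its tag must still be checked against the request physical address.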
Since way prediction may be wrong, the way prediction result needs to be checked. Specifically, the address mapping storage is queried for the request physical address corresponding to the request virtual address, and the identification field of the predicted physical address corresponding to the predicted cache line is compared with the identification field of the request physical address: if they are the same, the prediction result is correct; if they are different, the prediction result is incorrect.
When the way prediction result is verified to be correct, in order to improve data access efficiency, data can be obtained from the predicted cache line and output as the result.
In one example, the method further comprises:
If the identification field of the predicted physical address corresponding to the predicted cache line is the same as the identification field of the request physical address, predicted cache data is acquired from the predicted cache line according to the offset field of the request virtual address, and the predicted cache data is returned to the processor core.
Specifically, the identification field of the predicted physical address corresponding to the predicted cache line is compared with the identification field of the request physical address; if they are the same, the way prediction result is judged to be correct, and the predicted cache data is acquired from the predicted cache line according to the offset field of the request virtual address. For example, assume the cache line located in set 32, way 3 of the data array is taken as the predicted cache line. If the offset field of the request virtual address is 000000, the byte data starting at byte 0 of the predicted cache line is read as the predicted cache data, and the predicted cache data is returned to the processor core.
In this example, when the result of the way prediction is verified to be correct, the data in the predicted cache line is directly output, so that the data access speed is effectively improved.
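The check-then-read step above can be sketched as follows; the function name and the 8-byte access width are assumptions for illustration only:

```python
def verify_and_read(pred_tag: int, req_tag: int, line: bytes,
                    offset_field: int, width: int = 8):
    """If the predicted tag matches the request tag, return the requested
    bytes from the predicted cache line at the given offset; otherwise
    signal a misprediction by returning None (the reread path then runs)."""
    if pred_tag != req_tag:
        return None
    return line[offset_field:offset_field + width]

line = bytes(range(128))                     # a 128-byte cache line
print(verify_and_read(0x1A, 0x1A, line, 0))  # bytes starting at byte 0
```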
When the prediction result is wrong, in order to avoid the slower data access caused by directly fetching data from the next level data cache, matching can first be performed in the present level data cache to check whether another cache line corresponds to the same physical address as the request physical address. Identification fields of a plurality of physical addresses are screened out from the address array of the present level data cache according to each bit of the index field of the request virtual address except the page identification bit, and each is compared with the identification field of the request physical address. If an identification field is the same, the corresponding physical address is the same as the request physical address; that identification field is taken as the identification field of the reread physical address, and the cache line corresponding to it is taken as the reread cache line. Reread cache data is acquired from the reread cache line according to the offset field of the request virtual address and returned to the processor core. The manner of acquiring the reread cache data from the reread cache line is the same as that of acquiring the predicted cache data from the predicted cache line, and is not described again here.
Since an alias item has the same intra-page offset (i.e., each bit of the index field of the virtual address except the page identification bit is the same) and the same physical address identification field, the reread cache line falls into one of two cases: in the first case, the reread cache line is another cache line in the same set as the predicted cache line; in the second case, the reread cache line is an alias item in another set of the data array. Further, the page identification bit of the virtual address corresponding to the identification field of the reread physical address is compared with the page identification bit of the request virtual address. If they are the same, the cache line corresponding to that identification field is in the same set as the predicted cache line in the data array of the present level data cache; that is, the way prediction result was wrong, but another cache line matching the data loading request exists in the set. If they are different, the cache line corresponding to that identification field is in a different set from the predicted cache line; the cache line corresponding to the data loading request does not exist in the present level data cache, but an alias item of the data requested by the data loading request does.
In the latter case, the reread cache data in the reread cache line is written into an updated cache line of the present level data cache; in practical applications, since the cache line is the minimum unit of cache data processing, the whole cache line data of the reread cache line is written into the updated cache line. The identification field of the reread physical address is written into the corresponding entry in the address array of the present level data cache, and the valid bit corresponding to the updated cache line is updated to a value representing validity, for example, written as 1, so that the present level data cache is updated. To eliminate the alias item, the valid bit corresponding to the reread cache line is updated to a value representing invalidity, for example, set to 0.
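The update-and-invalidate sequence above can be modeled with a toy cache whose data array, address array (tags), and valid bits are plain dictionaries keyed by (set, way); all structure and parameter names here are assumptions for illustration:

```python
def eliminate_alias(data, tags, valid, reread_loc, update_loc, reread_tag):
    """Move the alias item to the updated cache line and invalidate it:
    the whole cache line is the transfer unit."""
    line = data[reread_loc]
    data[update_loc] = line          # write reread line into updated line
    tags[update_loc] = reread_tag    # write tag into the address array entry
    valid[update_loc] = 1            # updated line: value representing validity
    valid[reread_loc] = 0            # alias item eliminated: set invalid
    return line

data = {(33, 0): b"alias line"}
tags = {(33, 0): 0x7F}
valid = {(33, 0): 1}
eliminate_alias(data, tags, valid, (33, 0), (1, 2), 0x7F)
print(valid[(33, 0)], valid[(1, 2)])
```

After the move, a later lookup through set 1 finds a valid line, while the stale entry in set 33 is skipped because its valid bit is 0.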
For example, assume that the data array of the present level data cache adopts an 8-way set-associative structure comprising 64 sets and 8 ways, i.e., 64×8 cache lines in total; the size of the present level data cache is 64 kilobytes and the size of a cache line is 128 bytes; the data length of the request virtual address is 64 bits, i.e., VA[63:0]; the data length of the request physical address corresponding to the request virtual address is 50 bits, i.e., PA[49:0]; the offset field of the request virtual address is VA[6:0], the index field of the request virtual address is VA[12:7], the page identification bit of the request virtual address is VA[12], and the identification field of the request physical address is PA[49:12]. The data array of the present level data cache comprises 2 pages, corresponding to a physical page size of 4 kilobytes: all cache lines in sets 0 to 31 are located on page 1, and all cache lines in sets 32 to 63 are located on page 2; that is, when VA[12] is 0, the corresponding cache line in the data array of the present level data cache is located on page 1, and when VA[12] is 1, the corresponding cache line is located on page 2. If the identification field of the predicted physical address corresponding to the predicted cache line is different from the identification field of the request physical address, i.e., the prediction result is wrong, identification fields of a plurality of physical addresses are determined in, for example, set 1 and set 33 of the data array of the present level data cache according to each bit of the index field of the request virtual address except the page identification bit, namely VA[11:7].
The identification field of the physical address that is the same as the identification field of the request physical address is taken as the identification field of the reread physical address, the cache line corresponding to it is taken as the reread cache line, and the reread cache data in the reread cache line is returned to the processor core. Assuming that the page identification bit VA[12] of the virtual address corresponding to the identification field of the reread physical address is 1, which differs from the page identification bit VA[12] = 0 of the request virtual address, the valid bit corresponding to the reread cache line is updated to 0. Subsequently, when a new data loading request sent by the processor core is received, the cache line is looked up through the index field of the virtual address output by the processor core, and whether the cache line is valid is judged according to its valid bit; if it is invalid, the cache line is skipped.
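The two candidate sets in this worked example, which share VA[11:7] and differ only in the page identification bit VA[12], can be computed as follows; the helper function is illustrative:

```python
def candidate_sets(set_index: int):
    """Given a 6-bit set index (VA[12:7]), return both sets that share
    VA[11:7] and differ only in the page identification bit VA[12]."""
    base = set_index & 0x1F        # keep VA[11:7]
    return [base, base | 0x20]     # VA[12] = 0 and VA[12] = 1

print(candidate_sets(1))    # the example's set 1 and set 33
print(candidate_sets(33))   # the same pair, seen from the other page
```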
Therefore, in this example, when cache data is accessed by way prediction and the way prediction result is wrong, whether data corresponding to the data loading request exists is further searched in the present level data cache by comparing identification fields of physical addresses, avoiding fetching the data directly from the next level data cache and effectively improving data access efficiency. A page identification bit is defined in the virtual address of the cache line; after corresponding data is found in the present level data cache, the cache line data and valid bits in the present level data cache are managed based on the page identification bit, so that management and maintenance of valid bits in the page-crossing cache situation are realized accurately and rapidly while the data of the data cache is updated, and the accuracy of data access is ensured while the cache capacity is expanded.
When determining the physical address corresponding to a virtual address, the address mapping of the virtual address needs to be obtained; in practical applications, the address mappings of virtual addresses can be stored in a unified manner for convenient management, for example in an address mapping storage. Further, to speed up the address mapping lookup process, the address mapping storage may adopt a multi-level structure. Thus, in one example, querying the address translation cache according to the request virtual address to obtain the request physical address corresponding to the request virtual address includes:
inquiring whether the address mapping of the request virtual address exists in the first-level address mapping storage; the address mapping characterizes the corresponding relation between the virtual address and the physical address;
If so, obtaining a request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the first-level address mapping storage;
If it does not exist, the address mapping of the request virtual address is acquired from the next level address mapping storage, the request physical address corresponding to the request virtual address is obtained according to that address mapping, and the address mapping of the request virtual address is added into the first level address mapping storage. Each level of address mapping storage other than the last level records the address mappings of part of the virtual addresses, and the last level address mapping storage stores the address mappings of all virtual addresses.
Specifically, assume that the address mapping storage adopts a two-level storage structure, where the first level address mapping storage stores the address mappings of the most recently accessed virtual addresses and may be a translation lookaside buffer (TLB), and the level below the first level, namely the last level address mapping storage, may be main memory. When the address mapping of the request virtual address is acquired, the first level address mapping storage is queried first. If the address mapping of the request virtual address is found there, i.e., the first level address mapping storage hits, the request physical address corresponding to the request virtual address is obtained according to that address mapping. If it is not found, i.e., the first level address mapping storage misses, the address mapping of the request virtual address is acquired from the next level address mapping storage and written into the first level address mapping storage. In practical applications, to further improve the efficiency of obtaining the request physical address, a storage structure with more levels may be adopted, where each level of address mapping storage stores the most recently accessed address mappings of the corresponding next level; this is not limited here.
This example can effectively improve the efficiency of acquiring address mappings through the cooperation of multi-level address mapping storage.
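A minimal sketch of the two-level lookup described above, with both storages modeled as plain dictionaries mapping virtual page numbers to physical page numbers (an assumption made purely for illustration):

```python
def translate(vpn: int, first_level: dict, last_level: dict) -> int:
    """Query the first level address mapping storage; on a miss, fetch
    the mapping from the last level and fill the first level."""
    if vpn in first_level:          # first level hit
        return first_level[vpn]
    ppn = last_level[vpn]           # last level holds all mappings
    first_level[vpn] = ppn          # add mapping to the first level
    return ppn

tlb = {}                            # first level, initially empty
page_table = {0x12: 0x99}           # last level: complete mapping
print(translate(0x12, tlb, page_table), 0x12 in tlb)
```

After the first miss fills the first level, a repeated lookup for the same virtual page never reaches the last level again, which is the efficiency gain the text describes.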
When the data corresponding to the data loading request does not exist in the data cache of the present level, the data can be acquired from the data cache of the next level. In one example, the method may further comprise:
if the identification field of the request physical address is different from each of the identification fields of the plurality of physical addresses, sending the data loading request to a next level data cache, so that the next level data cache returns target data according to the data loading request;
returning the target data to the processor core; and writing the target data into an updated cache line in the data cache of the current level, and updating the valid bit of the updated cache line to a value representing validity.
Specifically, when the identification field of the request physical address differs from each of the identification fields of the plurality of physical addresses, it is determined that the data to be accessed by the data loading request does not exist in the present level data cache, and the data loading request is sent to the next level data cache. For example, if the present level data cache is a first level data cache, the next level data cache is a second level data cache. Assuming that the next level data cache is a physically indexed, physically tagged (PIPT) cache, the data loading request sent to the next level data cache further comprises the request physical address corresponding to the request virtual address, and the next level data cache acquires the target data according to the request physical address and returns it to the present level data cache. Since the cache line is the minimum unit of cache data transmission, in practical applications the next level data cache returns the whole cache line data in which the target data is located. The target data is returned to the processor core, the cache line data in which the target data is located is written into an updated cache line of the present level data cache, the valid bit of the updated cache line is updated to a value representing validity, for example set to 1, and the address entry corresponding to the updated cache line in the address array is updated to the identification field of the request physical address.
In this example, when the data loading request lacks data requesting access in the data cache of the present level, the target data is acquired from the data cache of the next level, the data cache of the present level is updated in time based on the target data, and validity of the updated data is maintained.
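The miss path above can be sketched with the same kind of toy structures; `fetch_from_next_level` stands in for the next level data cache and, like the other names here, is purely illustrative:

```python
def handle_miss(data, tags, valid, set_idx, victim_way, req_tag,
                fetch_from_next_level):
    """Fetch the whole cache line from the next level, install it in the
    updated cache line, record the tag in the address array, and set the
    valid bit to a value representing validity."""
    line = fetch_from_next_level(req_tag, set_idx)   # full line returned
    loc = (set_idx, victim_way)
    data[loc] = line
    tags[loc] = req_tag
    valid[loc] = 1
    return line

data, tags, valid = {}, {}, {}
handle_miss(data, tags, valid, 31, 2, 0x3A,
            lambda tag, s: bytes(128))   # stand-in next level cache
print(valid[(31, 2)], hex(tags[(31, 2)]))
```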
In practical applications, the updating of cached data also needs to be considered. To determine which cache line in the present level data cache the reread cache data or returned data is to be written into, in one example, before writing the target data into the updated cache line of the present level data cache, the method further includes:
And determining the updated cache line from one group corresponding to the index domain of the request virtual address in the data array of the data cache of the current level based on a least recently used algorithm, and clearing the current data in the updated cache line.
Specifically, a set is determined in the data array of the present level data cache based on the index field of the request virtual address, in the manner described above. For example, assuming the index field of the request virtual address is VA[12:7] and its value is 011111, set 31 is selected. Based on the least recently used algorithm, the cache line that has not been used for the longest time in the determined set is taken as the updated cache line to be written with the target data, and the current data in the updated cache line is cleared.
In practical applications, there are various implementations of the least recently used algorithm; the matrix method is illustrated here as an example. First, a matrix is defined whose number of rows and columns equals the number of cache lines concerned; assuming 4 cache lines, a 4×4 matrix is defined. When a cache line is accessed, all bits in the matrix row corresponding to that line are set to 1, and then all bits in the matrix column corresponding to that line are set to 0. The cache line whose matrix row contains the fewest 1s is the least recently used and is taken as the updated cache line.
Determining the cache line to be replaced through the least recently used algorithm effectively exploits the locality principle, so that the cache data stored in the present level data cache is the most commonly accessed data or data from nearby memory regions, reducing the number of times the processor core accesses main memory.
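The matrix method described above can be transcribed directly for 4 ways (set the accessed line's row to all 1s, then clear its column; the row with the fewest 1s marks the least recently used line); the class is a sketch, not the patent's implementation:

```python
class MatrixLRU:
    def __init__(self, ways: int = 4):
        self.ways = ways
        self.m = [[0] * ways for _ in range(ways)]

    def access(self, i: int):
        self.m[i] = [1] * self.ways   # set row i to all 1s
        for row in self.m:            # then clear column i
            row[i] = 0

    def victim(self) -> int:
        # least recently used way = row with the fewest 1s (all zeros)
        return min(range(self.ways), key=lambda r: sum(self.m[r]))

lru = MatrixLRU()
for way in (0, 1, 2, 3):
    lru.access(way)
print(lru.victim())   # way 0 has gone untouched the longest
```

Because clearing column i also zeroes the diagonal entry, each row's count of 1s equals the number of lines accessed less recently than it, which is why the minimum-sum row is the LRU line.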
According to the cache processing method provided by this embodiment, a page identification bit is defined in the virtual address of a cache line to represent the page where the cache line is located; a request virtual address is obtained based on a data loading request of the processor core, and the request physical address corresponding to the request virtual address is queried in the address mapping storage; a predicted cache line is determined in the present level data cache based on a way prediction technique. When the identification field of the request physical address differs from that of the predicted physical address corresponding to the predicted cache line, the identification field of a reread physical address that is the same as the identification field of the request physical address is determined from the present level data cache according to each bit of the index field of the request virtual address except the page identification bit, and the cache data of the cache line corresponding to the reread physical address is returned to the processor core; the page identification bit of the virtual address corresponding to the identification field of the reread physical address is compared with the page identification bit of the request virtual address, and if they are different, the cache is updated and the valid bit of the cache line corresponding to the identification field of the reread physical address is updated to a value representing invalidity.
In this scheme, when cache data is accessed by means of way prediction and the way prediction result is wrong, whether corresponding data exists is further searched in the present level data cache by comparing identification fields of physical addresses, avoiding fetching the data directly from the next level data cache and effectively improving data access efficiency. A page identification bit is defined in the virtual address of the cache line; after the data corresponding to the data loading request is found in the present level data cache, the cache line data and valid bits in the present level data cache are managed based on the page identification bit, so that management and maintenance of valid bits in the page-crossing cache situation are realized accurately and rapidly while the data cache is updated, and the accuracy of data access is ensured while the cache capacity is expanded.
Example five
Fig. 9 is a schematic structural diagram of another cache processing apparatus according to an embodiment of the present application. As shown in fig. 9, another cache processing apparatus 900 provided in this embodiment may include:
A second address calculating unit 91, configured to obtain a request virtual address according to a data loading request of the processor core; the index field of the request virtual address comprises a page identification bit, and different values of the page identification bit indicate that the cache line corresponding to the request virtual address is located on different pages in the data array of the present level data cache; the data array of the present level data cache comprises a plurality of pages;
the second mapping query module 92 is configured to query an address translation cache according to the request virtual address, and obtain a request physical address corresponding to the request virtual address;
A second cache query module 93, configured to determine a group in the data array of the present level data cache according to the index field of the requested virtual address; determining a predicted cache line in the set based on a way prediction technique;
A rereading module 94, configured to, if the identification field of the predicted physical address corresponding to the predicted cache line is different from the identification field of the request physical address, determine identification fields of a plurality of physical addresses from the address array of the present level data cache according to each bit of the index field of the request virtual address except the page identification bit; determine, among the identification fields of the plurality of physical addresses, the identification field of the reread physical address that is the same as the identification field of the request physical address; take the cache line corresponding to the identification field of the reread physical address as the reread cache line, and acquire reread cache data from the reread cache line;
a second result selection module 95, configured to return the reread cache data to the processor core;
A second cache update module 96, configured to write the reread cache data into an update cache line in the present level data cache if a page identification bit of a virtual address corresponding to the identification field of the reread physical address is different from a page identification bit of the requested virtual address;
an identifier comparing module 97, configured to update the valid bit of the updated cache line to a value that characterizes validity; and updating the valid bit of the reread cache line to a value representing invalidation.
In practical applications, the cache processing device may be implemented by a computer program, for example, application software, etc.; or may be embodied as a medium storing a related computer program, e.g., a USB flash drive, a cloud disk, etc.; or may also be implemented by a physical device integrated or installed with the relevant computer program, e.g., a chip, a server, etc.
Specifically, the present level data cache is the cache that currently receives the data loading request; it may be a first level data cache, a second level data cache, or the like, depending on the specific data loading process and storage structure. Obtaining the request virtual address according to the data loading request of the processor core includes performing address calculation according to the data loading request, i.e., calculating the request virtual address from an operand of the data loading request. The index field of the request virtual address comprises a page identification bit, and different values of the page identification bit indicate that the cache line corresponding to the request virtual address is located on different pages in the data array of the present level data cache. For example, the request virtual address may comprise a high-order field, an index field (index), and an offset field (offset), and the physical address may be divided into an identification field (tag) and a low-order field. In practical applications, the low-order field of the physical address is identical to the corresponding low-order bits of the virtual address, and a mapping relationship exists between the identification field of the physical address and the address segments of the virtual address other than the offset field, so that the physical address corresponding to a given virtual address can be determined according to this mapping relationship.
Specifically, the page identification bits are part of the bits in the index field. In one example, the page identification bit of the request virtual address is the most significant bit of its index field. For example, assuming the data array of the present-level data cache includes 2 pages and the index field of the request virtual address is VA[12:6], the page identification bit of the request virtual address is defined as VA[12]; VA[12] = 0 indicates that the cache line corresponding to the request virtual address in the data array of the present-level data cache is located on page 1, and VA[12] = 1 indicates that it is located on page 2. It should be noted that the foregoing is merely an example; the page identification bits may also comprise a plurality of bits, and the number of page identification bits may be determined according to the number of pages. Continuing the above example, assuming the data array of the present-level data cache includes 4 pages, the index field of the request virtual address may be defined as VA[13:6], where VA[13:12] are the page identification bits.
Specifically, VA[13:12] = 00 indicates that the cache line corresponding to the request virtual address in the data array of the present-level data cache is located on page 1, VA[13:12] = 01 indicates page 2, VA[13:12] = 10 indicates page 3, and VA[13:12] = 11 indicates page 4. By defining page identification bits in the virtual address, this example makes it easy to find, and thereby eliminate, alias items caused by page crossing in the data cache.
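To make the bit layout concrete, the sketch below extracts the page identification bits and the remaining index bits from a request virtual address, assuming the 4-page example above (index field VA[13:6], page identification bits VA[13:12]); the exact bit positions and function names are illustrative, not fixed by the scheme.

```python
def field(va: int, hi: int, lo: int) -> int:
    """Extract the bit field va[hi:lo] (both ends inclusive)."""
    return (va >> lo) & ((1 << (hi - lo + 1)) - 1)

def page_of(va: int) -> int:
    # Page identification bits VA[13:12]: 00 -> page 1, ..., 11 -> page 4
    return field(va, 13, 12) + 1

def set_within_page(va: int) -> int:
    # Index bits below the page identification bits, VA[11:6]; alias items
    # share these bits while differing only in VA[13:12]
    return field(va, 11, 6)
```

Two virtual addresses that agree in `set_within_page` but differ in `page_of` are exactly the page-crossing alias candidates the scheme searches for.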
The present-level data cache comprises a data array, and a valid bit array and an address array corresponding to the data array. The data array may be regarded as storing data in the form of a matrix: a row of the matrix represents a set, a column of the matrix represents a way, each element is a cache line (cache line), and the location of each cache line in the data array is uniquely determined by the set and way in which it resides. Each cache line in the data array has a corresponding valid bit item in the valid bit array and a corresponding address item in the address array, which respectively store the valid bit of the cache line and the identification field of its physical address. When the present-level data cache accesses data, a way prediction method may be adopted. Specifically, a set is determined in the data array of the present-level data cache according to the index field of the request virtual address; for example, assuming the index field of the request virtual address is 100000, the 32nd set of the data array is selected. Based on the way prediction technique, the way in which the cache line to be accessed resides is predicted within the determined set, thereby determining the predicted cache line. In practical application, way prediction usually adopts a most recently used algorithm: based on the access records of the data array of the data cache, the most frequently accessed way in each set is determined, and after a set is determined according to the index field of the virtual address, the cache line to be accessed is predicted according to the most frequently accessed way in that set.
For example, assuming that after set 0 is determined in the data array, the most recently used algorithm of the way prediction technique indicates that way 1 was most recently accessed in set 0, the cache line corresponding to set 0, way 1 in the data array is taken as the predicted cache line.
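A minimal model of the most-recently-used way predictor described above might look as follows (per-set state only; the name `WayPredictor` and the default prediction of way 0 are assumptions for illustration):

```python
class WayPredictor:
    """Predicts the way of the next access to a set as the last way accessed."""

    def __init__(self, num_sets: int):
        self.last_way = [0] * num_sets  # default prediction: way 0

    def predict(self, set_index: int) -> int:
        return self.last_way[set_index]

    def update(self, set_index: int, way: int) -> None:
        # Record the way actually hit so the next access predicts it
        self.last_way[set_index] = way
```

In hardware this state is read in parallel with the set selection, so only the predicted way's data needs to be driven out.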
Since way prediction may be wrong, the way prediction result needs to be checked. Specifically, the address mapping storage is queried for the request physical address corresponding to the request virtual address. The identification field of the predicted physical address corresponding to the predicted cache line is compared with the identification field of the request physical address: if they are the same, the prediction result is correct; if they are different, the prediction result is incorrect.
When the way prediction result is verified to be correct, in order to improve the data access efficiency, data can be obtained from the prediction cache line to be output as a result.
In one example, the apparatus further comprises a data reading module operable to:
If the identification field of the predicted physical address corresponding to the predicted cache line is the same as the identification field of the request physical address, acquiring predicted cache data from the predicted cache line according to the offset field of the request virtual address;
The second result selection module 95 is further configured to return the prediction cache data to the processor core.
Specifically, the identification field of the predicted physical address corresponding to the predicted cache line is compared with the identification field of the request physical address; if they are the same, the way prediction result is judged to be correct, and the predicted cache data is obtained from the predicted cache line according to the offset field of the request virtual address. For example, assume the cache line located in set 32, way 3 of the data array is taken as the predicted cache line. If the offset field of the request virtual address is 000000, the byte at offset 0 in the predicted cache line is read as the predicted cache data, and the predicted cache data is returned to the processor core.
In this example, when the result of the way prediction is verified to be correct, the data in the predicted cache line is directly output, so that the data access speed is effectively improved.
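The verification-and-read step can be sketched as below, assuming the tag is the physical address above bit 12 (matching the PA[49:12] example later in the text) and 64-byte cache lines; on a tag match the byte selected by the offset field is returned, otherwise `None` signals that the reread path must be taken. The names are illustrative.

```python
TAG_SHIFT = 12  # tag = identification field PA[49:12] in the running example

def tag_of(pa: int) -> int:
    return pa >> TAG_SHIFT

def read_predicted(line_data: bytes, line_tag: int, request_pa: int, offset: int):
    """Return the predicted cache data, or None if the way prediction was wrong."""
    if line_tag != tag_of(request_pa):
        return None  # identification fields differ: fall back to reread
    return line_data[offset]  # offset field selects the byte within the line
```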
When the prediction result is wrong, in order to avoid the slower data access that would result from going directly to the next-level data cache, matching can first be performed in the present-level data cache to check whether another cache line corresponds to the same physical address as the request. The identification fields of a plurality of physical addresses are screened out of the address array of the present-level data cache according to the bits of the index field of the request virtual address other than the page identification bits, and each is compared with the identification field of the request physical address. If one is identical, the corresponding physical address is the same as the request physical address; that identification field is taken as the identification field of the reread physical address, and the cache line corresponding to it is taken as the reread cache line. The reread cache data is then acquired from the reread cache line according to the offset field of the request virtual address and returned to the processor core. The manner of obtaining the reread cache data from the reread cache line is the same as that of obtaining the predicted cache data from the predicted cache line, and is not repeated here. There may be multiple reread physical address identification fields and reread cache lines, in which case the reread cache data may be obtained from any reread cache line and returned.
Since the intra-page offset of an alias item is the same (i.e., every bit of the index field of the virtual address other than the page identification bits is the same) and the identification field of the physical address is the same, the reread cache line falls into one of two cases: in the first case, the reread cache line is another cache line in the same set as the predicted cache line; in the second case, the reread cache line is an alias item located in another set of the data array. Further, the page identification bits of the virtual address corresponding to the identification field of the reread physical address are compared with the page identification bits of the request virtual address. If they are the same, the cache line corresponding to that identification field is in the same set as the predicted cache line in the data array of the present-level data cache; that is, the way prediction result was wrong, but another cache line matching the data load request exists in the set. If they are different, the cache line corresponding to that identification field is in a different set from the predicted cache line; that is, the way prediction result was wrong, the cache line corresponding to the data load request does not exist in the present-level data cache, but an alias item of the data requested by the data load request does.
The reread cache data in the reread cache line is then written into an update cache line of the present-level data cache. Since the cache line is the minimum unit of cache data processing, in practical application the entire cache line data of the reread cache line is written into the update cache line. The identification field of the reread physical address is written into the corresponding item in the address array of the present-level data cache, and the valid bit corresponding to the update cache line is updated to a value representing validity, for example written as 1, so that the present-level data cache is updated. To eliminate the alias item, the valid bit corresponding to the reread cache line is updated to a value representing invalidity, for example set to 0.
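The reread-and-alias-elimination step can be sketched as follows. The model assumes a data array organised as `num_pages` pages of `sets_per_page` sets, each set a list of lines; when the matching line sits in a different page than the request (an alias item), its data is copied into the update line, the update line's valid bit is set, and the alias's valid bit is cleared. All structure and parameter names are illustrative, and pages are numbered from 0 here.

```python
class Line:
    def __init__(self, tag=None, data=b"", valid=False):
        self.tag, self.data, self.valid = tag, data, valid

def reread(sets, base_index, num_pages, sets_per_page, req_tag, req_page, update_line):
    # Candidate sets share the index bits below the page identification bits;
    # only the page identification bits vary between them.
    for page in range(num_pages):
        for line in sets[page * sets_per_page + base_index]:
            if line.valid and line.tag == req_tag:
                if page != req_page:              # alias item in another page
                    update_line.tag = req_tag
                    update_line.data = line.data  # write data into update line
                    update_line.valid = True      # mark update line valid
                    line.valid = False            # eliminate the alias item
                return line.data
    return None  # no match: request must go to the next-level cache
```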
Thus, in this example, when cache data is accessed by way prediction and the way prediction result is wrong, the present-level data cache is further searched for data corresponding to the data load request by comparing identification fields of physical addresses, which avoids fetching the data directly from the next-level data cache and effectively improves data access efficiency. By defining page identification bits in the virtual addresses of cache lines, after the corresponding data is found in the present-level data cache, the cache line data and valid bits in the present-level data cache are managed based on the page identification bits; management and maintenance of valid bits in the presence of page-crossing caching is thus realized accurately and rapidly while the cache data is updated, guaranteeing the accuracy of data access while expanding the cache capacity.
When determining the physical address corresponding to a virtual address, the address mapping of the virtual address needs to be obtained; in practical application, address mappings of virtual addresses can be stored uniformly for convenient management. As an example, they may be stored in an address mapping storage. Furthermore, to speed up the address mapping lookup, the address mapping storage may adopt a multi-level structure. Thus, in one example, the second mapping query module 92 may specifically be configured to:
inquiring whether the address mapping of the request virtual address exists in the first-level address mapping storage; the address mapping characterizes the corresponding relation between the virtual address and the physical address;
If so, obtaining a request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the first-level address mapping storage;
If it does not exist, obtain the address mapping of the request virtual address from the next-level address mapping storage, obtain the request physical address corresponding to the request virtual address according to that address mapping, and add the address mapping of the request virtual address into the first-level address mapping storage; each address mapping storage other than the last level records the address mappings of only part of the virtual addresses, while the last-level address mapping storage records the address mappings of all virtual addresses.
Specifically, assume the address mapping storage comprises a two-level storage structure: the first-level address mapping storage stores the address mappings of the most recently accessed virtual addresses and may be an address translation fast table, i.e., a translation lookaside buffer (TLB); the next level corresponding to it, i.e., the last-level address mapping storage, may be the main storage. When obtaining the address mapping of the request virtual address, the first-level address mapping storage is queried first. If the address mapping of the request virtual address is found there, i.e., the first-level address mapping storage hits, the request physical address corresponding to the request virtual address is obtained according to that address mapping. If it is not found, i.e., the first-level address mapping storage misses, the address mapping of the request virtual address is acquired from the next-level address mapping storage and written into the first-level address mapping storage. In practical application, to further improve the efficiency of obtaining the request physical address, a storage structure with more levels may be adopted, where each level of address mapping storage stores the most recently accessed address mappings of the corresponding next level; this is not limited here.
Through the cooperation of multi-level address mapping storage, the efficiency of acquiring address mappings can be effectively improved.
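A two-level version of this lookup can be sketched with plain dictionaries standing in for the fast table and the last-level storage (purely a model; a real TLB is a hardware structure, and the names here are assumptions):

```python
def translate(va_page: int, fast_table: dict, last_level: dict) -> int:
    """Look up the physical page for va_page, refilling the fast table on a miss."""
    if va_page in fast_table:        # first-level address mapping storage hits
        return fast_table[va_page]
    pa_page = last_level[va_page]    # last level records all address mappings
    fast_table[va_page] = pa_page    # add the mapping to the first level
    return pa_page
```

After the first miss, subsequent lookups of the same virtual page hit in the first level, which is the efficiency gain described above.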
When the data corresponding to the data loading request does not exist in the data cache of the present level, the data can be acquired from the data cache of the next level. In one example, the apparatus may further include:
The second miss queue is used for sending the data load request to the next-level data cache if the identification field of the request physical address differs from the identification fields of all of the plurality of physical addresses, so that the next-level data cache returns target data according to the data load request;
The second result selection module 95 is further configured to return the target data to the processor core;
the second cache update module 96 is further configured to write the target data into an updated cache line in the current level data cache;
The second miss queue is further configured to update the valid bit of the update cache line to a value that characterizes validity.
Specifically, when the identification field of the request physical address differs from the identification fields of all of the plurality of physical addresses, it is determined that the data to be accessed by the data load request does not exist in the present-level data cache, and the data load request is sent to the next-level data cache. For example, if the present-level data cache is a first-level data cache, the next-level data cache is a second-level data cache. Assuming the next-level data cache is physically indexed and physically tagged (PIPT), the data load request sent to it further comprises the request physical address corresponding to the request virtual address; the next-level data cache acquires the target data according to the request physical address and returns it to the present-level data cache. Since the cache line is the minimum unit of cache data transmission, in practical application the next-level data cache returns the entire cache line containing the target data. The target data is returned to the processor core, the cache line data containing the target data is written into an update cache line in the present-level data cache, the valid bit of the update cache line is updated to a value representing validity, for example set to 1, and the address item corresponding to the update cache line in the address array is updated to the identification field of the request physical address.
In this example, when the data to be accessed by the data load request is missing from the present-level data cache, the target data is acquired from the next-level data cache, the present-level data cache is updated in time based on the target data, and the validity of the updated data is maintained.
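The miss path just described can be sketched as below, with the next-level cache modelled as a dictionary keyed by the physical address of the cache line and the update line as a small record; the 12-bit tag shift follows the running example and is an assumption.

```python
def handle_miss(next_level: dict, req_pa_line: int, update_line: dict) -> bytes:
    data = next_level[req_pa_line]          # next level returns the whole cache line
    update_line["data"] = data              # write returned line into update line
    update_line["tag"] = req_pa_line >> 12  # record the identification field
    update_line["valid"] = True             # valid bit set to the value for validity
    return data                             # target data is also sent to the core
```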
In practical applications, updating of the cache data also needs to be considered. For example, when the present-level data cache does not hold the data required by the data load request, after the target data returned by the next-level data cache is obtained, the target data needs to be written into the present-level data cache to update the cached data. Further, to determine which cache line in the present-level data cache the target data should be written to, in one example, the second cache update module 96 may be further configured to:
determining the update cache line, based on a least recently used algorithm, from the set corresponding to the index field of the request virtual address in the data array of the present-level data cache, and clearing the current data in the update cache line.
Specifically, a set is determined in the data array of the present-level data cache based on the index field of the request virtual address, in the manner described above. For example, assuming the index field of the request virtual address is VA[12:7] and its value is 011111, set 31 is selected. Based on the least recently used algorithm, the cache line that has gone unused for the longest time in the determined set is selected as the update cache line into which the target data is to be written, and the current data in the update cache line is cleared.
In practical applications, there are various implementations of the least recently used algorithm; the matrix method is illustrated here as an example. First, a matrix with as many rows and columns as there are cache lines is defined; assuming there are 4 cache lines, a 4×4 matrix is defined. When a cache line is accessed, all bits in the matrix row corresponding to that line are set to 1, and then all bits in the matrix column corresponding to it are set to 0; the cache line whose row contains the fewest 1s is taken as the update cache line.
Determining the cache line to be replaced through the least recently used algorithm effectively exploits the locality principle, so that the data stored in the present-level data cache is the most commonly accessed data or data from the nearby memory region, reducing the number of times the processor core accesses main memory.
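The matrix method just described can be sketched as follows; `victim()` returns the way whose matrix row holds the fewest 1s, i.e., the least recently used way (the class and method names are assumptions):

```python
class MatrixLRU:
    """Matrix method for least-recently-used replacement."""

    def __init__(self, ways: int):
        self.ways = ways
        self.m = [[0] * ways for _ in range(ways)]

    def access(self, i: int) -> None:
        for j in range(self.ways):
            self.m[i][j] = 1   # set all bits of row i to 1
        for r in range(self.ways):
            self.m[r][i] = 0   # then set all bits of column i to 0

    def victim(self) -> int:
        # The row with the fewest 1s belongs to the least recently used way
        return min(range(self.ways), key=lambda r: sum(self.m[r]))
```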
In the cache processing device provided by this embodiment, a page identification bit is defined in the virtual address of a cache line to characterize the page where the cache line is located; the request virtual address is obtained based on a data load request of the processor core, and the request physical address corresponding to the request virtual address is queried in the address mapping storage; a predicted cache line is determined in the present-level data cache based on the way prediction technique; when the request physical address differs from the predicted physical address corresponding to the predicted cache line, the identification field of a reread physical address identical to the identification field of the request physical address is determined from the data cache according to the bits of the index field of the request virtual address other than the page identification bits, and the cache line corresponding to the reread physical address is returned to the processor core; the page identification bits of the virtual address corresponding to the identification field of the reread physical address are compared with the page identification bits of the request virtual address, and if they differ, the cache is updated and the valid bit of the cache line corresponding to the identification field of the reread physical address is updated to a value representing invalidity.
In this scheme, when cache data is accessed by way prediction and the way prediction result is wrong, the present-level data cache is further searched for the corresponding data by comparing identification fields of physical addresses, which avoids fetching the data directly from the next-level data cache and effectively improves data access efficiency; by defining page identification bits in the virtual addresses of cache lines, after the data corresponding to a data load request is found in the present-level data cache, the cache line data and valid bits in the present-level data cache are managed based on the page identification bits, so that management and maintenance of valid bits in the presence of page-crossing caching is realized accurately and rapidly while the cache data is updated, guaranteeing the accuracy of data access while expanding the cache capacity.
Example six
Fig. 10 is a schematic structural diagram of another cache processing system according to an embodiment of the present application, as shown in fig. 10, where the cache processing system includes:
The second address calculation unit 91 is configured to receive an input data load instruction, calculate the request virtual address from an operand of the data load instruction through address calculation, and output the request virtual address to the second way prediction module for the subsequent access to the memory space. Assume the request virtual address is VA[93:0], the index field of the request virtual address is VA[12:7], and the page identification bit of the request virtual address is VA[12].
The second way prediction module is used for predicting in which way of a set in the data array of the present-level data cache the cache line to be accessed resides. Adopting way prediction avoids comparing the cache line identifications of all ways in a set, thereby improving speed and allowing the cache line to be read out to be determined as early as possible. Since way prediction may be wrong, the prediction result needs to be confirmed and any error remedied.
The second way prediction module determines a set in the data array of the first-level data cache according to the index field VA[12:7] of the request virtual address, and determines the predicted cache line in the set according to the way prediction result. The identification field of the physical address corresponding to the predicted cache line is acquired from the directory of the first-level data cache as the predicted identification.
The address translation fast table is a cache of address mappings, in which recently used address translation entries are stored for fast translation of virtual addresses to physical addresses.
The second mapping query module 92 is configured to look up, according to the request virtual address, the request physical address corresponding to the request virtual address in the address translation fast table. Because the low 12 bits of the request virtual address are the same as the low 12 bits of the request physical address, the corresponding mapping relationship can be looked up according to VA[93:12] to obtain the identification field of the request physical address; if the required mapping is not in the address translation fast table, it needs to be fetched into the fast table from the next-level storage. The identification field of the request physical address is taken as the actual identification; assuming the request physical address is PA[49:0], the actual identification is PA[49:12].
The identification comparison module 97 is configured to compare the predicted identification with the actual identification. If they are the same, i.e., the predicted-location comparison succeeds, the way prediction result is judged to be correct, and the predicted cache data read from the predicted cache line is output to the second result selection module 95; if they are different, the way prediction result is judged to be wrong, and a reread is performed in the first-level data cache.
The reread module 94 is configured to determine, according to the bits of the index field of the request virtual address other than the page identification bit, the identification fields of a plurality of physical addresses in the directory of the first-level data cache as comparison identifications.
The identification comparison module 97 is further configured to compare the plurality of comparison identifications with the actual identification. If a comparison identification identical to the actual identification exists, i.e., a non-predicted-location comparison succeeds, that comparison identification is taken as the reread identification, the cache line corresponding to the reread identification is taken as the reread cache line, the reread cache data in the reread cache line is sent to the second result selection module 95, and the page identification bit of the virtual address corresponding to the reread identification is compared with the page identification bit of the request virtual address.
The second cache update module 96 is configured to write the reread cache data into an update cache line in the first-level data cache if the page identification bit of the virtual address corresponding to the reread identification differs from the page identification bit of the request virtual address.
The identification comparison module 97 is further configured, when the page identification bit of the virtual address corresponding to the reread identification differs from the page identification bit of the request virtual address, to update, in the directory of the first-level data cache, the valid bit of the update cache line into which the reread cache data was written to a value representing validity, and to update the valid bit corresponding to the reread cache line to a value representing invalidity, so as to eliminate the alias item and update the way prediction.
If no comparison identification identical to the actual identification exists, the load request, the index field VA[12:7] of the request virtual address, and the actual identification enter the second miss queue to wait for data to be acquired from the second-level cache.
After receiving the data load request, the second-level cache sends the returned data and the identification field of the corresponding physical address (i.e., the return identification) to the return control module based on the index field VA[12:7] of the request virtual address and the actual identification. The return control module sends the return data to the second result selection module 95 so that the result can be output as early as possible, and also sends the return data to the first-level data cache for updating the cache. In addition, the return identification is sent to the second miss queue for the subsequent cache directory update.
The second miss queue stores one way of a cache set in the data array of the first-level data cache, determined by the least recently used algorithm, where the cache set is determined according to the index field of the request virtual address. After the second-level cache returns the data, or the non-predicted-location comparison succeeds, the second cache update module 96 uses this information to write the data into the corresponding cache line in the first-level data cache. In addition, the second miss queue sends the return identification to the directory of the first-level data cache and, after the second-level cache returns the data, updates the valid bit of the update cache line into which the return data was written.
The second result selection module 95 is configured to select the final result for output. According to the comparison result output by the identification comparison module 97, the predicted cache data is selected when the predicted-location comparison succeeds, the reread cache data is selected when a non-predicted-location comparison succeeds, and the return data is selected when the comparison fails. The specific parameters above are given only for convenience of description; changes to these parameters affect neither the application and effect of this solution nor its protection scope.
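The selection logic of the second result selection module can be reduced to a simple priority chain (a sketch; the signal names are illustrative):

```python
def select_result(pred_hit: bool, reread_hit: bool, pred_data, reread_data, return_data):
    if pred_hit:        # predicted-location comparison succeeded
        return pred_data
    if reread_hit:      # a non-predicted location matched
        return reread_data
    return return_data  # both failed: use data returned by the next-level cache
```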
In the cache processing system provided by this embodiment, a page identification bit is defined in the virtual address of a cache line to characterize the page where the cache line is located; the request virtual address is obtained based on a data load request of the processor core, and the request physical address corresponding to the request virtual address is queried in the address mapping storage; a predicted cache line is determined in the present-level data cache based on the way prediction technique; when the request physical address differs from the predicted physical address corresponding to the predicted cache line, the identification field of a reread physical address identical to the identification field of the request physical address is determined from the data cache according to the bits of the index field of the request virtual address other than the page identification bits, and the cache line corresponding to the reread physical address is returned to the processor core; the page identification bits of the virtual address corresponding to the identification field of the reread physical address are compared with the page identification bits of the request virtual address, and if they differ, the cache is updated and the valid bit of the cache line corresponding to the identification field of the reread physical address is updated to a value representing invalidity.
In this scheme, when cache data is accessed by way prediction and the way prediction result is wrong, the present-level data cache is further searched for the corresponding data by comparing identification fields of physical addresses, which avoids fetching the data directly from the next-level data cache and effectively improves data access efficiency; by defining page identification bits in the virtual addresses of cache lines, after the data corresponding to a data load request is found in the present-level data cache, the cache line data and valid bits in the present-level data cache are managed based on the page identification bits, so that management and maintenance of valid bits in the presence of page-crossing caching is realized accurately and rapidly while the cache data is updated, guaranteeing the accuracy of data access while expanding the cache capacity.
Example seven
Fig. 11 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure. As shown in fig. 11, the electronic device includes:
a processor 291 and a memory 292; a communication interface (Communication Interface) 293 and a bus 294 may also be included. The processor 291, the memory 292, and the communication interface 293 may communicate with each other via the bus 294. The communication interface 293 may be used for information transfer. The processor 291 may call logic instructions in the memory 292 to perform the methods of the above-described embodiments.
Furthermore, the above logic instructions in the memory 292 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium.
The memory 292 is a computer-readable storage medium that can be used to store software programs, computer-executable programs, and the program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 291 executes functional applications and performs data processing by running the software programs, instructions, and modules stored in the memory 292, thereby implementing the methods of the above method embodiments.
The memory 292 may include a program storage area and a data storage area: the program storage area may store an operating system and at least one application program required for a function, and the data storage area may store data created according to the use of the terminal device, and the like. In addition, the memory 292 may include high-speed random access memory and may also include non-volatile memory.
The embodiments of the present disclosure provide a non-transitory computer-readable storage medium in which computer-executable instructions are stored; when executed by a processor, the instructions implement the methods of the foregoing embodiments.
Example eight
The disclosed embodiments provide a computer program product comprising a computer program which, when executed by a processor, implements the method provided by any of the embodiments of the disclosure described above.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
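The hierarchical address mapping lookup used throughout the embodiments (query a first-level address mapping storage, fall back to the next level on a miss, and refill the first level with the fetched mapping) can be sketched as follows. This Python model is illustrative only; the class and attribute names are assumptions, not identifiers from the patent.

```python
# Illustrative sketch of a two-level address mapping storage: the first
# level records the mappings of only part of the virtual addresses, while
# the last level records the mappings of all virtual addresses.

class AddressMappingStorage:
    def __init__(self, full_map):
        self.l1 = {}            # first-level storage: partial mappings
        self.last = full_map    # last-level storage: all mappings

    def translate(self, request_vaddr):
        # Query whether the address mapping of the request virtual address
        # exists in the first-level address mapping storage.
        if request_vaddr in self.l1:
            return self.l1[request_vaddr]
        # Miss: obtain the mapping from the next (here, last) level,
        # then add it to the first level for subsequent lookups.
        paddr = self.last[request_vaddr]
        self.l1[request_vaddr] = paddr
        return paddr
```

After the first translation of an address, the mapping is present in the first level, so repeated lookups no longer reach the last level.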

Claims (11)

1. A cache processing method, characterized by comprising:
obtaining a request virtual address according to a data loading request of a processor core; wherein the index domain of the request virtual address comprises a page identification bit, different values of the page identification bit represent that the cache line corresponding to the request virtual address is located on different pages in the data array of the present-level data cache, and the data array of the present-level data cache comprises a plurality of pages;
if the data loading request misses in the present-level data cache, saving the index domain of the request virtual address, and sending the data loading request to the next-level data cache, so that the next-level data cache returns target data and the identification domain of the target physical address of the target data according to the data loading request;
returning the target data to the processor core, and determining the identification domains of a plurality of physical addresses from the address array of the present-level data cache according to each bit of the index domain of the request virtual address except the page identification bit;
among the identification domains of the plurality of physical addresses, updating the valid bit of the cache line corresponding to each identification domain that is the same as the identification domain of the target physical address to a value representing invalidity; and writing the target data into an updated cache line in the present-level data cache, and updating the valid bit of the updated cache line to a value representing validity.
2. The method according to claim 1, wherein the method further comprises:
determining a set in the data array of the present-level data cache according to the index domain of the request virtual address, and determining a predicted cache line in the set based on a way prediction technique;
querying an address mapping storage according to the request virtual address to obtain a request physical address corresponding to the request virtual address;
if the identification domain of the request physical address is the same as the identification domain of the physical address of the predicted cache line and the valid bit corresponding to the predicted cache line is a value representing validity, determining that the data loading request hits in the present-level data cache; otherwise, determining that the data loading request misses in the present-level data cache.
3. The method according to claim 2, wherein the method further comprises:
If the data loading request hits in the data cache of the present level, according to the offset domain of the request virtual address, obtaining predicted cache data from the predicted cache line, and returning the predicted cache data to the processor core.
4. The method according to claim 1, wherein the method further comprises:
determining a plurality of first cache lines in the data array of the present-level data cache according to the index domain of the request virtual address, wherein the index domain of the virtual address corresponding to each first cache line is the same as the index domain of the request virtual address, and the corresponding valid bit is a value representing validity;
querying an address mapping storage according to the request virtual address to obtain a request physical address corresponding to the request virtual address;
if the identification domain of the request physical address is the same as the identification domain of the physical address of any first cache line, determining that the data loading request hits in the present-level data cache; otherwise, determining that the data loading request misses in the present-level data cache.
5. The method according to claim 4, wherein the method further comprises:
if the data loading request hits in the present-level data cache, taking, among the plurality of first cache lines, the first cache line whose physical-address identification domain is the same as the identification domain of the request physical address as a hit cache line;
and obtaining hit data from the hit cache line according to the offset domain of the request virtual address, and returning the hit data to the processor core.
6. The method according to any one of claims 2-5, wherein the querying an address mapping storage according to the request virtual address to obtain a request physical address corresponding to the request virtual address comprises:
querying whether an address mapping of the request virtual address exists in a first-level address mapping storage, the address mapping characterizing the correspondence between virtual addresses and physical addresses;
if it exists, obtaining the request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the first-level address mapping storage;
if it does not exist, obtaining the address mapping of the request virtual address from the next-level address mapping storage, obtaining the request physical address corresponding to the request virtual address according to the address mapping of the request virtual address in the next-level address mapping storage, and adding the address mapping of the request virtual address to the first-level address mapping storage; wherein each address mapping storage other than the last level records the address mappings of only part of the virtual addresses, and the last-level address mapping storage stores the address mappings of all the virtual addresses.
7. The method according to any one of claims 1-5, wherein before writing the target data into the updated cache line in the present-level data cache, the method further comprises:
determining the updated cache line, based on a least recently used algorithm, from the set corresponding to the index domain of the request virtual address in the data array of the present-level data cache, and clearing the current data in the updated cache line.
8. The method according to any one of claims 1-5, wherein the page identification bit of the request virtual address is the most significant bit of the index domain of the request virtual address.
9. A cache processing apparatus, comprising:
a first address calculation component, configured to obtain a request virtual address according to a data loading request of a processor core; wherein the index domain of the request virtual address comprises a page identification bit, different values of the page identification bit represent that the cache line corresponding to the request virtual address is located on different pages in the data array of the present-level data cache, and the data array of the present-level data cache comprises a plurality of pages;
a first miss queue, configured to, if the data loading request misses in the present-level data cache, save the index domain of the request virtual address and send the data loading request to the next-level data cache, so that the next-level data cache returns target data and the identification domain of the target physical address of the target data according to the data loading request;
a first result selection module, configured to return the target data to the processor core;
a return identification comparison module, configured to determine the identification domains of a plurality of physical addresses from the address array of the present-level data cache according to each bit of the index domain of the request virtual address except the page identification bit, and, among the identification domains of the plurality of physical addresses, update the valid bit of the cache line corresponding to each identification domain that is the same as the identification domain of the target physical address to a value representing invalidity;
a first cache updating module, configured to write the target data into an updated cache line in the present-level data cache;
the return identification comparison module is further configured to update the valid bit of the updated cache line to a value representing validity.
10. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory to implement the method of any one of claims 1-8.
11. A computer readable storage medium having stored therein computer executable instructions which when executed by a processor are adapted to carry out the method of any one of claims 1-8.
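The least recently used replacement step recited in claim 7 (choosing the updated cache line from the set indexed by the request virtual address and clearing its current data) can be sketched as follows. The class and method names are assumptions for this illustration, not part of the claimed apparatus.

```python
# Illustrative LRU victim selection for a single cache set. An OrderedDict
# keeps the ways ordered from least recently used to most recently used.

from collections import OrderedDict

class LRUSet:
    def __init__(self, ways=4):
        # Maps way id -> cached data, ordered from LRU to MRU.
        self.lines = OrderedDict((w, None) for w in range(ways))

    def touch(self, way):
        """Mark a way as most recently used, e.g. on a hit."""
        self.lines.move_to_end(way)

    def pick_victim(self):
        """Choose the least recently used way as the updated cache line
        and clear its current data before the refill."""
        victim = next(iter(self.lines))
        self.lines[victim] = None       # clear the current data
        self.lines.move_to_end(victim)  # the refilled line becomes MRU
        return victim
```

With two ways, touching way 0 makes way 1 the least recently used, so way 1 is chosen as the victim on the next replacement.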
CN202311316175.9A 2023-10-11 2023-10-11 Cache processing method, device, electronic equipment and medium Active CN117331853B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311316175.9A CN117331853B (en) 2023-10-11 2023-10-11 Cache processing method, device, electronic equipment and medium

Publications (2)

Publication Number Publication Date
CN117331853A (en) 2024-01-02
CN117331853B (en) 2024-04-16

Family

ID=89292755

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311316175.9A Active CN117331853B (en) 2023-10-11 2023-10-11 Cache processing method, device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN117331853B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0919927A2 (en) * 1997-11-26 1999-06-02 Digital Equipment Corporation Dynamic memory allocation technique for maintaining an even distribution of cache page addresses within an address space
US7015921B1 (en) * 2001-12-31 2006-03-21 Apple Computer, Inc. Method and apparatus for memory access
CN108463811A (en) * 2016-01-20 2018-08-28 Arm有限公司 Record group indicator
CN109952565A (en) * 2016-11-16 2019-06-28 华为技术有限公司 Internal storage access technology
CN115774683A (en) * 2022-11-02 2023-03-10 平头哥(上海)半导体技术有限公司 Method for acquiring physical address in super user mode and corresponding processor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102151180B1 (en) * 2017-11-20 2020-09-02 삼성전자주식회사 System and methods for efficient virtually-tagged cache implementation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A New Virtual-Address-Mapping Mechanism for Low-Energy I-Cache; QuanSheng Yang et al.; 2010 International Conference on Computational Intelligence and Software Engineering; 20101231; 1-4 *
On-chip cache two-dimensional positioning data prefetching technology (in Chinese); Cui Kaichao; China Master's Theses Full-text Database; 20221031; I137-53 *


Similar Documents

Publication Publication Date Title
US11314647B2 (en) Methods and systems for managing synonyms in virtually indexed physically tagged caches
US10083126B2 (en) Apparatus and method for avoiding conflicting entries in a storage structure
US10810134B2 (en) Sharing virtual and real translations in a virtual cache
US11775445B2 (en) Translation support for a virtual cache
US11403222B2 (en) Cache structure using a logical directory
CN112540939A (en) Storage management device, storage management method, processor and computer system
CN111858404A (en) Cache data positioning system
CN115617709A (en) Cache management method and device, cache device, electronic device and medium
CN109074313B (en) Caching and method
US6990551B2 (en) System and method for employing a process identifier to minimize aliasing in a linear-addressed cache
US8051271B2 (en) Translation of virtual to physical addresses
CN117331853B (en) Cache processing method, device, electronic equipment and medium
CN117331854B (en) Cache processing method, device, electronic equipment and medium
CN109992535B (en) Storage control method, device and system
US9158682B2 (en) Cache memory garbage collector
CN117331854A (en) Cache processing method, device, electronic equipment and medium
US20110283041A1 (en) Cache memory and control method thereof
US8271733B2 (en) Line allocation in multi-level hierarchical data stores
CN114090080A (en) Instruction cache, instruction reading method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant