CN110297783B - Distributed cache structure based on real-time dynamic migration mechanism - Google Patents

Info

Publication number
CN110297783B
CN110297783B (application CN201910595908.4A)
Authority
CN
China
Prior art keywords
migration
data
unit
access
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910595908.4A
Other languages
Chinese (zh)
Other versions
CN110297783A (en)
Inventor
山蕊
刘阳
朱筠
蒋林
冯雅妮
Current Assignee
Xian University of Posts and Telecommunications
Original Assignee
Xian University of Posts and Telecommunications
Priority date
Filing date
Publication date
Application filed by Xian University of Posts and Telecommunications filed Critical Xian University of Posts and Telecommunications
Priority to CN201910595908.4A
Publication of CN110297783A
Application granted
Publication of CN110297783B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0868Data transfer between cache memory and other subsystems, e.g. storage devices or host systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a distributed cache structure based on a real-time dynamic migration mechanism. It is motivated by the fact that the structural units of prior-art reconfigurable array processors are simple and densely arranged, and it targets the pronounced locality and high parallelism of their memory-access data.

Description

Distributed cache structure based on real-time dynamic migration mechanism
Technical Field
The invention belongs to the technical field of integrated circuit design, and particularly relates to a distributed cache structure and a reconfigurable array processor based on a real-time dynamic migration mechanism.
Background
With the proliferation of computation-intensive and storage-intensive applications, reconfigurable array processors have emerged to balance computational efficiency with programming flexibility. The storage unit is a core component of such a processor, yet current designs suffer from severely limited memory bandwidth and high memory-access overhead.
To meet the high-bandwidth, low-latency storage requirements of reconfigurable computing, a distributed cache structure is needed that improves access parallelism and alleviates the severe shortage of memory bandwidth, thereby increasing access speed and reducing access power consumption.
Disclosure of Invention
To address these problems, the invention provides a distributed cache structure based on a real-time dynamic migration mechanism, together with a reconfigurable array processor. The structure is equipped with the real-time dynamic migration mechanism, which alleviates the high cache miss rate and long access latency caused by frequent external-memory accesses in existing distributed cache designs for reconfigurable array processors.
To achieve this purpose, the invention adopts the following main technical scheme:
In a first aspect, the present invention provides a distributed cache structure based on a real-time dynamic migration mechanism, including:
the system comprises a searching and comparing unit, an access recording unit, a migration output unit, a migration interconnection unit, a control unit and a data storage unit;
the search comparison unit, the access recording unit, the migration output unit, the migration interconnection unit and the data storage unit are all connected with the control unit;
the access recording unit is used for recording frequency information of accesses to the data stored in the local cache within the data storage unit, and for determining from the frequency information whether data in the local cache is in a high-frequency access state; the frequency information is the frequency with which each PE (processing element) accesses the data;
a migration output unit for receiving the information of the access recording unit and the data storage unit and preparing data when migrating the data in a high-frequency access state;
a migration interconnection unit configured to locate a migration destination of the data in the high-frequency access state according to data preparation by the migration output unit, migrate the data in the high-frequency access state to the migration destination, and set the data whose migration is cancelled to be invalid according to a migration cancellation signal;
the searching and comparing unit is used for tracking and marking the data migrated by the migration interconnection unit so as to obtain the data on the shortest path when each PE accesses the data;
and the data storage unit is used for storing the data in the local cache.
Optionally, the search comparing unit is further configured to perform state recording of 0 or 1 in an internal migration lookup table according to the data and the flag information provided by the migration interconnecting unit, and if the data is migrated and the migration is not cancelled, set the data state corresponding to the address to be accessed by the remote PE to 1, and record the original location information of the data corresponding to the address to be accessed by the remote PE through the flag bit; when the migration cancellation enable of any data is received to be high, the state of the migration cancelled data is recorded as 0, so that the migration cancelled data is searched in situ.
Optionally, the search comparison unit is specifically configured to
1) Migrated-state lookup: after receiving a read-write access request from a PE, searching the migration lookup table for valid-state data; if the found data's state bit is 1, the found data is valid; if the found data's state bit is 0, the found data is invalid;
2) Migrated-state comparison: collecting the flag information of all valid-state entries and comparing the read-write address information in the PE access request with the collected flag bits; if a flag bit equals the PE's read-write address information, the data accessed by the PE has been migrated and is accessed directly, without requesting the original static-mapping position of the valid data, and the read-write operation in the PE access request ends; if no flag bit equals the PE's read-write address information, the access request is passed to the four-level access full-interconnection structure, and the statically mapped data in the cache is accessed under the real-time dynamic data migration mechanism;
3) Migration-data update: the local cache records the data access frequency in the data storage unit through the access recording unit; if the access recording unit issues a migration-data enable at this moment, the search comparison unit updates the flag bit of the migration data according to the address information carried by the migration enable and the accessing PE identifier, and sets the migration data as valid;
4) Migration-cancel update: the local cache records the data access frequency in the data storage unit through the access recording unit; if the access recording unit issues a migration-cancel enable at this moment, the search comparison unit sets the data named by the address information carried by the migration-cancel enable as invalid.
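The four lookup-table operations above can be sketched behaviorally. The following is an illustrative Python model under assumed names (`MigrationLookupTable`, dict-based storage, the accessing PE identifier standing in for the flag information), not the patent's hardware design:

```python
class MigrationLookupTable:
    """Behavioral sketch of the search/comparison unit's migration table.

    An entry's state bit is 1 while its datum is migrated and the migration
    has not been cancelled, 0 otherwise. All names are illustrative.
    """

    def __init__(self):
        self.table = {}  # address -> {"state": 0 or 1, "pe": accessing PE id}

    def update_migration(self, addr, pe_id):
        # 3) Migration update: mark the migrated datum valid and record
        #    the identifier of the PE whose local cache now holds it.
        self.table[addr] = {"state": 1, "pe": pe_id}

    def cancel_migration(self, addr):
        # 4) Cancel update: a high cancel-enable sets the state to 0 so
        #    the datum is again found at its original (static) location.
        if addr in self.table:
            self.table[addr]["state"] = 0

    def resolve(self, addr):
        # 1)-2) State lookup and comparison: a valid matching entry means
        #    the access is served without visiting the static mapping;
        #    otherwise the request goes on to the four-level interconnect.
        entry = self.table.get(addr)
        if entry is not None and entry["state"] == 1:
            return ("local", entry["pe"])
        return ("interconnect", None)
```

For example, after `update_migration(0x40, pe_id=3)`, `resolve(0x40)` is served locally; after `cancel_migration(0x40)` the same request falls back to the interconnect.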
Optionally, the access recording unit is specifically configured to:
counting, inside each cache, each PE's accesses to the data in that cache, performing the access count with a built-in counter; during counting, comparing the counter value with the preset parameter B, and if the value is greater than or equal to B, sending a migration control signal to the migration output unit and the data storage unit; if the value is less than B, sending a migration cancel control signal to the migration output unit;
when the migration control signal is sent to the migration output unit, it must also be sent to the data storage unit, which is controlled to return the corresponding migrated data to the migration output unit according to the cache address information carried by the migration control signal.
Optionally, the access recording unit is further configured to:
set an internal counter for each datum in the data storage unit; receive the PE identification information and read-write address information of each read-write access inside the cache; and operate the counter according to the access information: if the same PE accesses the same address on consecutive accesses, the counter is incremented by 1; if the accesses are not consecutive, the counter is cleared to 0. Read and write accesses are treated identically.
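The per-datum counter rule can be sketched as follows. `AccessRecord` and `threshold_b` are assumed names, and one Python object models the hardware counter attached to one datum; the sketch only illustrates the increment/clear and threshold behavior:

```python
class AccessRecord:
    """Sketch of the access-recording counter for one datum.

    The counter increments when the same PE accesses the same address on
    consecutive accesses and clears to 0 otherwise; reads and writes are
    treated identically. Reaching the preset parameter B triggers a
    migration signal; staying below B yields a cancel signal.
    """

    def __init__(self, threshold_b):
        self.threshold_b = threshold_b
        self.count = 0
        self.last = None  # (pe_id, addr) of the previous access

    def access(self, pe_id, addr):
        key = (pe_id, addr)
        if key == self.last:
            self.count += 1      # consecutive access by the same PE
        else:
            self.count = 0       # non-consecutive access clears the counter
        self.last = key
        # migrate when count >= B, otherwise cancel any established link
        return "migrate" if self.count >= self.threshold_b else "cancel"
```

With `threshold_b=2`, three consecutive accesses by the same PE to the same address raise the count to 2 and fire the migration signal; an access by a different PE clears the count and fires a cancel.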
Optionally, the migration output unit is specifically configured to:
processing the migration cancel read-write control signal and the migration read-write control signal sent by the access recording unit; when a migration read-write control signal arrives, detecting whether the data feedback sent by the data storage unit has arrived, and after the feedback is received, the migration output unit sends the migration control information and the migration data to the migration interconnection unit; and when a migration cancel read-write control signal is received, merging the read and write information and outputting a group of migration cancel signals.
Optionally, the migration interconnect unit is specifically configured to:
receiving the migration control signal and the migration cancel control signal from the migration output unit; first locking an output position in the migration interconnection unit according to the PE (processing element) identifier carried by the migration control signal, and finally inputting the migration data signal to the search comparison unit;
if a conflict exists in the migration interconnection unit, a farthest-first priority arbitration mechanism lets the migration data from the farthest cache migrate successfully.
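The farthest-first arbitration can be sketched as a pure selection function. The patent does not specify how distance is measured; Manhattan distance over an assumed 2D array layout is used here purely for illustration, and all names are assumptions:

```python
def farthest_first_arbitrate(requests, dest):
    """Grant the conflicting migration request whose source cache is
    farthest from the destination.

    requests: list of dicts with a "src" (x, y) coordinate.
    dest: (x, y) coordinate of the destination cache.
    Manhattan distance on a 2D array is an illustrative stand-in for the
    patent's notion of 'farthest'.
    """
    def distance(req):
        (sx, sy), (dx, dy) = req["src"], dest
        return abs(sx - dx) + abs(sy - dy)

    return max(requests, key=distance)
```

When two migrations contend for the same output, the request originating farther from the destination wins; the loser would retry in a later cycle under this sketch.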
In a second aspect, the present invention provides a reconfigurable array processor comprising:
the global controller is used for realizing the control and management of computing resources in the reconfigurable array processor;
a data input memory block DIM for storing the originally input data;
the data output storage block DOM is used for storing output result data;
each PE includes a local real-time dynamic-migration distributed cache; each PE accesses the cache of a remote area and the cache of the local area through a four-level access full-interconnection structure with in-area priority;
each PE has a distributed cache structure, the distributed cache structure is used for recording frequency information of data accessed by the PE, and if the frequency in the frequency information is greater than a preset threshold value, the data is migrated according to the PE identifier recorded in the frequency information so as to be migrated to a local cache to which the PE identifier belongs;
wherein, the cache structure is a distributed cache structure based on the real-time dynamic migration mechanism in any one of the above first aspect.
Advantageous effects:
The invention provides a physically distributed, logically unified cache structure tailored to the characteristics of the reconfigurable array processor: a large volume of memory-access data, a high demand for data parallelism, little global data reuse, and pronounced locality.
The reconfigurable array processor has read-write access to both the local cache and remote caches, with the local cache taking high priority; a remote cache accesses the local cache through an efficient row-column crossbar switch for destination indexing. For interaction with external memory, a polling (round-robin) arbitration mechanism selects one signal path, after which data is transferred through a router. Inside a cache, the control unit receives the read-write request that a PE sends over the interconnection mechanism and forwards it to the cache's state register module and tag register module; whether the read-write access hits is judged from the data fed back by these two register modules. According to the hit result and the state register information, the write-replacement policy module in the cache decides whether to replace a data block under a least-recently-used (LRU) policy, and dirty-bit control information is sent to the data storage unit following the write-back policy. Finally, under the write-replacement control information, the data storage unit moves data blocks between the cache's internal storage and external memory using a built-in 4-bit counter.
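The hit check, LRU replacement, and write-back flow described above can be condensed into a minimal behavioral sketch. This is an illustrative model, not the patent's hardware: the capacity, the `OrderedDict`-based line store, and the return convention are assumptions, and the 4-bit-counter block transfer is not modeled.

```python
from collections import OrderedDict

class WriteBackCache:
    """Minimal sketch of hit check + LRU replacement + write-back policy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # addr -> {"data": ..., "dirty": bool}

    def access(self, addr, write=False, data=None):
        evicted = None
        if addr in self.lines:
            self.lines.move_to_end(addr)         # hit: refresh LRU order
        else:
            if len(self.lines) >= self.capacity:
                # miss with full set: evict the least recently used line;
                # under the write-back policy a dirty victim must be
                # written back to external memory.
                victim, line = self.lines.popitem(last=False)
                if line["dirty"]:
                    evicted = (victim, line["data"])
            self.lines[addr] = {"data": None, "dirty": False}
        if write:
            self.lines[addr]["data"] = data
            self.lines[addr]["dirty"] = True     # mark line dirty on write
        return evicted  # (addr, data) written back, or None
```

In this sketch, writing two lines into a 2-way store, touching the first, and then writing a third causes the untouched dirty line to be evicted and handed back as write-back traffic.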
Meanwhile, because the coherent access structure of the reconfigurable array processor suffers from long delays, a distributed cache structure supporting a real-time dynamic migration mechanism (RDMM) is proposed. The RDMM can migrate data globally within the caches according to access-frequency information: exploiting the processor's real-time data-access behavior, each cache block records the number of accesses from each processing element, and according to this record each cache classifies the data corresponding to the address a remote PE intends to access as migratable or non-migratable. When the number of independent accesses by the same processing element reaches a critical point, the data can be migrated into the requesting reconfigurable processing element (PE). After a real-time migration, the RDMM marks the data corresponding to the address to be accessed by the remote PE, so that when the remote PE accesses the data again it can be fetched precisely over the shortest path by following the tracking mark, greatly reducing access delay.
In conclusion, the RDMM can effectively reduce the waiting time of the PE based on the real-time dynamic migration characteristic of the access frequency, thereby reducing the access delay of the whole system and improving the access efficiency.
Drawings
FIG. 1 is a schematic diagram of a distributed cache real-time dynamic migration mechanism of a reconfigurable array processor according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating a distributed cache hardware structure according to an embodiment of the present invention;
FIG. 3 is a diagram of a lookup comparison unit according to an embodiment of the present invention;
FIG. 4 is a diagram of an access recording unit structure according to an embodiment of the present invention;
FIG. 5 is a diagram of a migration output unit according to an embodiment of the present invention;
FIG. 6 is a block diagram of a migration interconnect unit provided by an embodiment of the present invention;
FIG. 7 is a diagram illustrating the distributed cache real-time dynamic migration mechanism according to the present invention.
Detailed Description
For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.
Referring to fig. 1, the left-side overall architecture of fig. 1 may be a reconfigurable array processor architecture of the present application, comprising: a Global controller (Global controller), a data input storage block DIM, a data output storage block DOM, a plurality of PEs and a distributed cache of each PE;
the global controller is mainly used for controlling and managing array computing resources, and comprises broadcasting of operation instructions and distribution of calling instructions;
DIM is the raw data input memory block and DOM is the result data output memory block.
A distributed cache structure based on the RDMM is integrated into the reconfigurable array processor, where each PE has a local real-time dynamic-migration distributed cache.
As shown on the right side of fig. 1, the dynamic migration structure based on the distributed cache uses the local cache to record the frequency with which the data (the data corresponding to the address to be accessed by a remote PE) is accessed by each PE; if that data falls into high-frequency access, the high-frequency-accessed data is migrated according to the recorded PE number (i.e., the PE identifier) into the local cache of that PE number.
As shown in fig. 7, each PE may access the cache of a remote area and the cache of the local area through a four-level access full-interconnection structure with in-area priority. Because of the migration mechanism, a dedicated search mechanism is needed to locate migrated data accurately, so a search-comparison structure is designed in front of the point where the PE access request enters the four-level full-interconnection structure. When the data (i.e., the data corresponding to the address to be accessed by the remote PE) is accessed again, the migrated data is found quickly according to the characteristics of the migration mechanism, without accessing the original fixed-mapped remote area.
The dedicated search mechanism works as follows: when a PE accesses remote data, it first looks the target data up in the migration lookup table; if the table does not contain the data, the PE accesses the corresponding remote cache according to the access address and finds the data (i.e., the data corresponding to the address to be accessed by the remote PE) there.
Fig. 7 also shows the existing modules inside the cache, such as the state register module, the flag register module, and the write-replacement policy module. This application does not change the functions or structures of these existing modules; the units described above are added to the existing structure.
Referring to fig. 2, the distributed cache hardware structure based on the real-time dynamic migration mechanism mainly consists of a lookup comparison unit (Tag_find), an access recording unit (record), a migration output unit (Mig_out), a migration interconnection unit (Mig_con), a control unit (ctrl) and a cache data storage unit (Data_cache).
First, the access recording unit records whether data in the local cache is frequently accessed by each PE (processing element), i.e., whether the access frequency is greater than or equal to the preset parameter B. If the frequency does not reach B, the data (i.e., the data corresponding to the address to be accessed by the remote PE) is in a non-high-frequency access state and any established dynamic migration link is cancelled; if the frequency is greater than or equal to B, the data is in a high-frequency access state and must be migrated accordingly. The migration output unit then receives the information from the access recording unit and the data storage unit and prepares the data inside the unit; finally, the migration interconnection unit quickly locates the destination cache according to the data and index information from the migration output unit and places the migrated data into that cache.
Referring to fig. 3, the lookup comparison unit is mainly used to track and mark migrated data, so as to ensure that the data can be obtained over the shortest path when it is accessed again. Specifically, the lookup comparison unit records a state of 0 or 1 in the migration lookup table according to the data and flag information provided by the migration interconnection unit: if a datum has been migrated and the migration has not been cancelled, its state is 1 and the flag bits record the datum's original position. That is, when the data corresponding to the address to be accessed by the remote PE has been migrated and the migration has not been cancelled, the state of that data is set to 1.
If the lookup comparison unit receives a high migration-cancel enable for the data (i.e., the data corresponding to the address to be accessed by the remote PE), it merely marks the data's state as 0, indicating that the data is no longer migrated and can be found at its original location.
The main functions are defined as follows:
1) Migrated-state lookup: after receiving a read-write access request from a PE, valid-state data is searched for in the lookup table; a data state bit of 1 means the data is valid, and a state bit of 0 means the data is invalid. That is, if the state bit of the data corresponding to the address to be accessed by the remote PE is 1, the data accessed this time is valid; if that state bit is 0, the data accessed this time is invalid.
2) Migrated-state comparison: the flag information of all valid-state entries is collected and the PE's read-write address information is compared with those flag bits. If a flag equals the PE's address information, the data accessed by the PE has been migrated; it is then accessed directly, without requesting the original statically mapped position of the data (i.e., the data corresponding to the address to be accessed by the remote PE), and the read-write operation ends. If no flag equals the address information, the request is passed through to the four-level access full-interconnection structure, and the statically mapped data in the cache is accessed through that mechanism.
3) Updating migration data: the local cache records the access frequency of the data (namely the data corresponding to the address to be accessed by the remote PE) through the access recording unit, if the access recording unit sends migration data enable at the moment, the search comparing unit updates the flag bit of the data according to the address information carried by the migration enable and the accessed PE number, and validates the data, namely the flag bit of the data corresponding to the address to be accessed by the remote PE is updated, and the access data is set to be valid.
4) And (3) updating the migration cancellation: the local cache records the access frequency of the data (namely the data corresponding to the address to be accessed by the remote PE) through the access recording unit, and if the access recording unit sends the migration cancellation enable at the moment, the search comparing unit sets the data as invalid according to the address information carried by the migration cancellation enable.
Referring to fig. 4, the access recording unit mainly counts, inside each cache, each PE's accesses to the data in that cache, performing the count with a built-in counter. During counting, the counter value is compared with the preset parameter B: if the value is greater than or equal to B, a migration control signal is sent to the migration output unit and the cache data storage unit; if the value is less than B, a migration cancel control signal is sent to the migration output unit. When the migration control signal is sent to the migration output unit, it must also be sent to the cache data storage unit, which is controlled to return the migrated data to the migration output unit according to the cache address information carried by the migration control signal.
The main functions of accessing the recording unit are defined as follows:
1) Read-write recording: an internal counter is set for each datum; the PE number information and read-write address information of each read-write access to the cache are received, and the counter is operated according to the access information: if the same PE accesses the same address on consecutive accesses, the counter is incremented by 1; if the accesses are not consecutive, the counter is cleared to 0. Read and write accesses are treated identically.
2) Parameter-B comparison: the counter of each datum is compared with parameter B; if the counter value of the datum (i.e., the data corresponding to the address to be accessed by the remote PE) is greater than or equal to B, the datum is migrated and a migration control signal is sent; if the value is less than B, a migration cancel control signal is sent.
Fig. 5 is a structural diagram of the migration output unit, which mainly processes the migration cancel read-write control signals and migration read-write control signals issued by the access recording unit. When a migration read-write control signal arrives, the unit checks whether the data feedback sent by the data storage unit has arrived; once the feedback is received, the migration output unit sends the migration control information and the migration data to the migration interconnection unit. When a migration cancel read-write control signal is received, the read and write information is merged and a group of migration cancel signals is output.
Referring to fig. 6, the migration interconnection unit mainly receives the migration control signal and the migration cancel control signal from the migration output unit. It first locks an output position inside the migration interconnection unit according to the PE number carried by the migration control information, and finally inputs the migration data signal to the lookup comparison unit. If a conflict exists inside the migration interconnection unit, a farthest-first priority arbitration mechanism lets the migration data from the farthest cache migrate successfully; the migration cancel control information is handled similarly.
In order to better understand the meaning of each interface in fig. 1 to 6 of the present application in the drawings, each interface is explained below. As shown in connection with fig. 3, fig. 3 shows the respective interfaces of the lookup comparison unit.
The interface signal definition and the functional description of the lookup comparison unit are shown in table 1.
Table 1:
[Table rendered as an image in the original publication; contents not reproduced.]
Fig. 4 shows the interface of the access recording unit, whose definitions and functional descriptions are given in table 2.
Table 2:
[Table rendered as an image in the original publication; contents not reproduced.]
Fig. 5 shows the interface of the migration output unit, whose definitions and functional descriptions are given in table 3.
Table 3:
[Table rendered as an image in the original publication; contents not reproduced.]
Fig. 6 shows the interface of the migration interconnect unit, whose definitions and functional descriptions are given in table 4.
Table 4:
[Table rendered as an image in the original publication; contents not reproduced.]
Fig. 2 shows the overall interface, whose signal definitions and functional descriptions are shown in table 5.
Table 5:
[Table rendered as an image in the original publication; contents not reproduced.]
It should be understood that the above description of specific embodiments is intended only to illustrate the technical solutions and features of the present invention, so that those skilled in the art can understand and implement it; the invention is not limited to these specific embodiments. All changes and modifications that fall within the scope of the appended claims are intended to be embraced therein.

Claims (8)

1. A distributed cache structure based on a real-time dynamic migration mechanism is characterized by comprising the following components:
a lookup comparison unit, an access recording unit, a migration output unit, a migration interconnection unit, a control unit and a data storage unit;
the lookup comparison unit, the access recording unit, the migration output unit, the migration interconnection unit and the data storage unit are all connected to the control unit;
the access recording unit is configured to record frequency information of accesses to the data stored in the local cache in the data storage unit, and to determine from the frequency information whether data in the local cache is in a high-frequency access state; the frequency information is the frequency with which each PE accesses the data;
the migration output unit is configured to receive information from the access recording unit and the data storage unit, and to prepare the data when data in the high-frequency access state is migrated;
the migration interconnection unit is configured to locate a migration destination for the data in the high-frequency access state according to the data prepared by the migration output unit, to migrate that data to the migration destination, and to set migration-cancelled data as invalid according to a migration cancellation signal;
the lookup comparison unit is configured to track and mark the data migrated by the migration interconnection unit, so that each PE obtains the data over the shortest path when accessing it;
and the data storage unit is used for storing the data in the local cache.
2. The structure of claim 1, wherein:
the lookup comparison unit is further configured to record a state of 0 or 1 in an internal migration lookup table according to the data and flag information provided by the migration interconnection unit; if the data has been migrated and the migration has not been cancelled, the data state corresponding to the address to be accessed by the remote PE is set to 1, and the original location of that data is recorded through the flag bits; when the migration cancellation enable for any data is received high, the state of the migration-cancelled data is recorded as 0, so that the migration-cancelled data is looked up in its original location.
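The state recording described in claim 2 can be sketched in software. The following is an illustrative behavioral model only, not the patented hardware; the class and field names (`MigrationLookupTable`, `entries`, `origin`) are hypothetical.

```python
class MigrationLookupTable:
    """Illustrative model of the migration lookup table of claim 2.

    Each entry maps an address that a remote PE would access to a state
    bit (1 = migrated and not cancelled, 0 = migration cancelled) and a
    flag recording the data's original location.
    """

    def __init__(self):
        self.entries = {}  # addr -> {"state": 0 or 1, "origin": location}

    def record_migration(self, addr, origin):
        # Data migrated and not cancelled: state bit set to 1, original
        # location kept in the flag bits.
        self.entries[addr] = {"state": 1, "origin": origin}

    def cancel_migration(self, addr):
        # Migration-cancel enable high: state recorded as 0 so the data
        # is subsequently looked up at its original (in-situ) location.
        if addr in self.entries:
            self.entries[addr]["state"] = 0

    def is_migrated(self, addr):
        entry = self.entries.get(addr)
        return entry is not None and entry["state"] == 1
```

A migration followed by a cancellation thus leaves the entry present but invalid, so subsequent lookups fall back to the static mapping.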
3. The structure of claim 2, wherein the lookup comparison unit is specifically configured to perform:
1) migrated-state lookup: after a read or write access request from a PE is received, valid-state data is looked up in the migration lookup table; if the state bit of the looked-up data is 1, the data is valid; if the state bit is 0, the data is invalid;
2) migrated-state comparison: the flag information of all valid entries is collected, and the read/write address information in the PE access request is compared with the collected flag bits; if a flag bit equals the PE's read/write address information, the data accessed by the PE has been migrated, the data is accessed directly without requesting the original statically mapped location, and the read/write operation of the PE access request ends; if no flag bit equals the PE's read/write address information, the access request is forwarded to the four-level fully interconnected access structure, and the data statically mapped in the cache is accessed through the real-time dynamic data migration mechanism;
3) migration-data update: the local cache records the data access frequency in the data storage unit through the access recording unit; if the access recording unit issues a migration-data enable, the lookup comparison unit updates the flag bits of the migrated data according to the address information and the accessing-PE identifier carried by the migration enable, and sets the migrated data as valid;
4) migration-cancellation update: the local cache records the data access frequency in the data storage unit through the access recording unit; if the access recording unit issues a migration cancellation enable, the lookup comparison unit sets the data identified by the address information carried by the migration cancellation enable as invalid.
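Steps 1) and 2) of claim 3 amount to a hit/miss decision on migrated data. The following sketch illustrates that decision under stated assumptions: the table representation (entry key mapped to a `(state_bit, flag)` pair) and the function name are hypothetical, not the patented logic.

```python
def handle_pe_access(table, rw_addr):
    """Illustrative flow of claim 3 steps 1) and 2): decide whether a PE
    read/write request hits migrated data or is forwarded to the
    statically mapped location. `table` maps entry keys to
    (state_bit, flag) pairs, where the flag holds the address the
    remote PE would use."""
    # 1) Migrated-state lookup: keep only entries whose state bit is 1.
    valid_flags = {flag for (_key, (state, flag)) in table.items()
                   if state == 1}
    # 2) Migrated-state comparison: compare the request address with the
    #    flag bits of all valid entries.
    if rw_addr in valid_flags:
        # Data was migrated here: access it directly, request finishes.
        return "access_migrated_copy"
    # No match: forward to the four-level fully interconnected structure
    # and access the original statically mapped location.
    return "forward_to_interconnect"
```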
4. The structure of claim 3, wherein the access recording unit is specifically configured to:
count, inside each cache, the accesses by each PE to the data in that cache, using a built-in counter; during counting, the counter value is compared with a preset parameter B: if the value is greater than or equal to B, a migration control signal is sent to the migration output unit and the data storage unit; if the value is less than B, a migration cancellation control signal is sent to the migration output unit;
when the migration control signal is sent to the migration output unit it is also sent to the data storage unit, which is thereby controlled to return the corresponding data to be migrated to the migration output unit according to the cache address information carried by the migration control signal.
5. The structure of claim 4, wherein the access recording unit is further configured to:
set an internal counter for each datum in the data storage unit, and receive the PE identifier and read/write address information of read and write accesses inside the cache; the counter is operated according to the access address information: if the same PE identifier and read/write address are accessed consecutively, the counter is incremented by 1; otherwise the counter is cleared to 0; read and write accesses are handled identically.
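The counter behavior of claims 4 and 5 can be modeled as follows. This is a minimal sketch, not the patented circuit; the class name, the returned signal strings, and the example threshold B = 4 are all assumptions made for illustration.

```python
class AccessCounter:
    """Illustrative model of the per-datum access counter of claims 4-5.

    The counter increments while the same PE keeps accessing the same
    address on consecutive accesses and clears to 0 otherwise; reads and
    writes are treated identically. When the count reaches the preset
    parameter B a migration signal is raised, otherwise a cancellation
    signal. The default B = 4 is an arbitrary example value.
    """

    def __init__(self, threshold_b=4):
        self.threshold_b = threshold_b
        self.count = 0
        self.last = None  # (pe_id, addr) of the previous access

    def access(self, pe_id, addr):
        key = (pe_id, addr)
        if key == self.last:
            self.count += 1   # consecutive access by the same PE: +1
        else:
            self.count = 0    # access pattern broken: clear to 0
        self.last = key
        # >= B: signal migration; < B: signal migration cancellation.
        return "migrate" if self.count >= self.threshold_b else "cancel"
```

Any break in the run of accesses, by a different PE or to a different address, resets the count, so only sustained repeated access triggers migration.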
6. The structure of claim 5, wherein the migration output unit is specifically configured to:
process the migration cancellation read/write control signal and the migration read/write control signal sent by the access recording unit; when the migration read/write control signal arrives, detect whether the data feedback sent by the data storage unit has arrived, and after the feedback is received send the migration control information and the migration data to the migration interconnection unit; when the migration cancellation read/write control signal is received, merge the read and write information and output one set of migration cancellation signals.
7. The structure of claim 6, wherein the migration interconnection unit is specifically configured to:
receive the migration control signal and the migration cancellation control signal from the migration output unit, first lock the output position within the migration interconnection unit according to the PE (processing element) identifier carried by the migration control signal, and then deliver the migration data signal to the lookup comparison unit;
if a conflict occurs within the migration interconnection unit, the migration data from the farthest cache is migrated successfully under a farthest-first priority arbitration mechanism.
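The farthest-first arbitration of claim 7 selects, among conflicting migration requests, the one originating from the most distant cache. The sketch below illustrates only the selection rule; the `(distance, payload)` request representation is a hypothetical simplification of the hardware signals.

```python
def arbitrate_farthest_first(requests):
    """Illustrative farthest-first priority arbitration (claim 7): when
    several caches contend for the migration interconnection unit, the
    request from the farthest cache wins. Each request is a hypothetical
    (distance, payload) pair."""
    if not requests:
        return None
    # max() returns the first request with the maximum distance, which
    # matches a fixed tie-break by input order.
    return max(requests, key=lambda r: r[0])
```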
8. A reconfigurable array processor, comprising:
the global controller is used for realizing the control and management of computing resources in the reconfigurable array processor;
a data input memory block DIM for storing the original input data;
a data output memory block DOM for storing output result data;
and a plurality of PEs, each comprising a local cache with a real-time dynamic migration mechanism, each PE accessing remote-region and local-region caches through a four-level fully interconnected access structure with priority regions;
wherein the local cache of each PE is configured to record frequency information of the data accessed by the PEs, and, if the frequency information exceeds a preset threshold, to migrate the data, according to the PE identifier recorded in the frequency information, to the local cache of the identified PE;
and wherein the local caches of the plurality of PEs form a distributed cache structure based on a real-time dynamic migration mechanism as claimed in any one of claims 1 to 7.
CN201910595908.4A 2019-07-03 2019-07-03 Distributed cache structure based on real-time dynamic migration mechanism Active CN110297783B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910595908.4A CN110297783B (en) 2019-07-03 2019-07-03 Distributed cache structure based on real-time dynamic migration mechanism


Publications (2)

Publication Number Publication Date
CN110297783A CN110297783A (en) 2019-10-01
CN110297783B true CN110297783B (en) 2021-01-15

Family

ID=68030067

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910595908.4A Active CN110297783B (en) 2019-07-03 2019-07-03 Distributed cache structure based on real-time dynamic migration mechanism

Country Status (1)

Country Link
CN (1) CN110297783B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930315A (en) * 2020-08-21 2020-11-13 北京天融信网络安全技术有限公司 Data access method, data access device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103139302A (en) * 2013-02-07 2013-06-05 浙江大学 Real-time copy scheduling method considering load balancing
CN104035823A (en) * 2014-06-17 2014-09-10 华为技术有限公司 Load balancing method and device
CN108139974A (en) * 2015-10-21 2018-06-08 华为技术有限公司 distributed caching dynamic migration

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102103544A (en) * 2009-12-16 2011-06-22 腾讯科技(深圳)有限公司 Method and device for realizing distributed cache
CN103078927B (en) * 2012-12-28 2015-07-22 合一网络技术(北京)有限公司 Key-value data distributed caching system and method thereof
US10073779B2 (en) * 2012-12-28 2018-09-11 Intel Corporation Processors having virtually clustered cores and cache slices
US9684596B2 (en) * 2015-02-25 2017-06-20 Microsoft Technology Licensing, Llc Application cache replication to secondary application(s)
US10031883B2 (en) * 2015-10-16 2018-07-24 International Business Machines Corporation Cache management in RDMA distributed key/value stores based on atomic operations
CN109284258A (en) * 2018-08-13 2019-01-29 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Distributed multi-level storage system and method based on HDFS




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant