CN110597656B - Single-error handling method for a second-level cache tag array - Google Patents
Single-error handling method for a second-level cache tag array
- Publication number
- CN110597656B CN110597656B CN201910859104.0A CN201910859104A CN110597656B CN 110597656 B CN110597656 B CN 110597656B CN 201910859104 A CN201910859104 A CN 201910859104A CN 110597656 B CN110597656 B CN 110597656B
- Authority
- CN
- China
- Prior art keywords
- request
- hit
- check
- error
- result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1008—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
- G06F11/1044—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/08—Error detection or correction by redundancy in data representation, e.g. by using checking codes
- G06F11/10—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
- G06F11/1008—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
- G06F11/1064—Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices in cache or content addressable memories
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0811—Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0877—Cache access modes
- G06F12/0884—Parallel mode, e.g. in parallel with main memory or CPU
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
The invention relates to a single-error handling method for the tag array of a second-level cache, comprising the following steps: checking the read data of the tag array and judging the hit in parallel; judging comprehensively, from the check result and the hit result, whether the request can execute normally; and, for a request that cannot execute normally, entering a retry flow to obtain correct data. The invention raises the design frequency achievable by the chip.
Description
Technical Field
The invention relates to the technical field of second-level cache access handling, and in particular to a single-error handling method for the tag array of a second-level cache.
Background
Current general-purpose processors employ a hierarchical cache (Cache) architecture to bridge the ever-widening gap between the processor's computing performance and the supply performance of main memory. The tag array of the second-level cache stores the physical address corresponding to each cache block, so that data in the second-level cache can be mapped to main memory in a set-associative manner. Because the manufacturing process is imperfect, sporadic errors may occur when the tag array is read or written, which affects the correctness of the processor. To guarantee the correctness of data read from and written to the second-level cache tag array (STAG for short), the mainstream approach is to generate an ECC check code with a Hamming-code algorithm and write it alongside the data, and to check the read data against the ECC code when the array is read. Every request must therefore be checked for an ECC single error or multiple error. When the check finds a multiple error, the hardware cannot correct it, and the error is reported so that software can handle it. When the check finds a single error, the hardware corrects it automatically, uses the corrected value, and records the error count for early warning.
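The Hamming-code ECC scheme described above can be sketched as follows. This is an illustrative model only, not the circuit of the patent: `hamming_encode` places check bits at power-of-two positions and appends one overall parity bit, giving standard SECDED behavior — single errors are corrected, double errors are detected and left to software.

```python
def hamming_encode(data_bits):
    """Encode data bits with Hamming check bits at power-of-two positions,
    plus one overall parity bit for double-error detection (SECDED)."""
    m = len(data_bits)
    r = 0
    while (1 << r) < m + r + 1:
        r += 1
    n = m + r
    code = [0] * (n + 1)                 # 1-indexed; code[0] is unused
    j = 0
    for i in range(1, n + 1):
        if i & (i - 1):                  # non-power-of-two slots hold data
            code[i] = data_bits[j]
            j += 1
    for p in range(r):                   # each check bit covers positions
        pos = 1 << p                     # whose index has bit p set
        code[pos] = sum(code[i] for i in range(1, n + 1) if i & pos) % 2
    overall = sum(code[1:]) % 2
    return code[1:] + [overall]

def hamming_decode(received):
    """Return (data, status): status is 'ok', 'corrected' (single error
    fixed, as the hardware would) or 'double_error' (uncorrectable)."""
    word = [0] + received[:-1]
    n = len(word) - 1
    syndrome = 0                         # XOR of positions of set bits;
    for i in range(1, n + 1):            # zero for a valid codeword
        if word[i]:
            syndrome ^= i
    parity_odd = sum(received) % 2 == 1
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:                     # exactly one bit flipped
        if 1 <= syndrome <= n:
            word[syndrome] ^= 1          # flip the erroneous bit back
        status = "corrected"             # syndrome 0: overall bit itself
    else:                                # nonzero syndrome, even parity
        return None, "double_error"      # uncorrectable; report to software
    data = [word[i] for i in range(1, n + 1) if i & (i - 1)]
    return data, status
```

The syndrome is simply the XOR of the indices of all set bit positions, which for a single flip equals the flipped position — this is what makes single-error correction a one-step operation in hardware.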
However, statistics from actual chips show that tag-array check errors are very rare, and most of them are ECC single errors. When a check error occurs in the tag array, the conventional handling is as follows. The request's path through the pipeline of the second-level cache control unit is shown in fig. 1, and the read/write interface of the STAG as the request traverses the pipeline is shown in fig. 2. The specific operations are:
1) At stage 0, the request arbitrates for the pipeline of the second-level cache control unit to obtain the port right to access the STAG;
2) At stage 1, the request reads the STAG and obtains the read data;
3) At stage 2, the read data is checked; three cases can occur:
a) If a multiple error occurs, the hardware cannot correct it, and the machine-check error is handled by the operating system;
b) If a single error occurs, the hardware corrects it automatically to obtain correct data;
c) If no check error occurs, the read data is already correct;
4) Also at stage 2, after the check completes, the corrected data is used to judge whether the STAG hits;
5) At stage 3, if the STAG hits, data is returned to the requester.
It follows that, in the conventional handling of a request accessing the STAG, the hit result can only be obtained after the check completes, whether or not a check error actually occurred. Both the check of the read data and the hit judgment have comparatively large delay in the physical implementation. If both are performed in the same pipeline stage, that stage tends to have the largest delay in the entire core, which directly determines the highest frequency at which the chip can operate correctly. Chips designed with the conventional STAG single-error handling therefore typically achieve a correct operating frequency of only about 1.0 GHz.
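The serial dependence in stage 2 can be made concrete with a small sketch (hypothetical helper names; `decode` stands for the ECC check of one way): the hit compare consumes corrected tags, so it cannot begin until every way has been checked — which is why check and hit judgment stack up in a single stage's delay.

```python
def stage2_serial(ways_read, decode, request_tag):
    """Conventional stage 2: ECC-decode every way first, then compare the
    corrected tags against the request tag.  The compare depends on the
    decode output, so the two delays add up on the critical path."""
    corrected = []
    for raw in ways_read:
        tag, status = decode(raw)
        if status == "double_error":
            return "machine_check", None   # uncorrectable: raised to the OS
        corrected.append(tag)              # single errors arrive corrected
    hit = next((i for i, t in enumerate(corrected) if t == request_tag), None)
    return "ok", hit
```

With `decode` as a pass-through on `(tag, status)` pairs, the function returns the hit way index, or `None` on a miss.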
Disclosure of Invention
The invention aims to provide a single-error handling method for the second-level cache tag array that raises the design frequency of the chip.
The technical scheme adopted to solve the technical problem is as follows: a single-error handling method for a second-level cache tag array, comprising the following steps:
(1) Check the read data of the tag array and judge the hit in parallel;
(2) Judge comprehensively, from the check result and the hit result, whether the request can execute normally;
(3) For a request that cannot execute normally, enter a retry flow to obtain correct data.
The judgment in step (2) is made as follows:
(a) If the check finds a multiple error, the hardware cannot correct it, and the machine-check error is handled by the operating system;
(b) If the check finds a single error, processing proceeds according to the hit result;
If the hit result in step (b) is a miss, or the single error is on the hit way, the hit result is untrusted and the request cannot execute normally; if the single error is not on the hit way, the hit result is trusted and the request can execute normally.
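The judgment rule above can be sketched as a small decision function (illustrative names, not the patent's implementation; `check` holds the per-way ECC result and `hit_way` is the way index produced by the parallel hit compare, or `None` on a miss):

```python
def judge(check, hit_way):
    """Combine the per-way ECC check result with the parallel hit result.
    check: list of 'ok' / 'single' / 'multi', one entry per way.
    Returns 'machine_check', 'retry' (hit result untrusted) or 'normal'."""
    if "multi" in check:
        return "machine_check"        # uncorrectable: the OS handles it
    if "single" not in check:
        return "normal"               # no error: hit result is trusted
    if hit_way is None:
        return "retry"                # the miss may hide a corrupted hit
    if check[hit_way] == "single":
        return "retry"                # the hit itself read a corrupted tag
    return "normal"                   # single error on a non-hit way only
```

The two `retry` branches are exactly the untrusted cases of step (b): a miss while some way has a single error, or a single error on the hit way itself.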
In step (3), a request retry buffer is provided to hold the erroneous request and allow it to re-arbitrate for the pipeline of the second-level cache control unit, so as to obtain correct tag-array data.
The request retry buffer is a single-entry buffer and arbitrates for the pipeline of the second-level cache control unit with the highest priority.
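A minimal model of such a single-entry retry buffer, under the assumption that arbitration is a simple pick from a queue of new requests (names are hypothetical):

```python
class SingleErrorRetryBuffer:
    """Single-entry buffer for a request whose hit result was untrusted.
    It arbitrates for the pipeline with the highest priority: while it is
    occupied, no new request is granted before the retry drains."""
    def __init__(self):
        self.entry = None

    def push(self, request):
        assert self.entry is None, "single-entry buffer already occupied"
        self.entry = request

    def arbitrate(self, new_requests):
        """Grant the pipeline slot: the retry entry always wins."""
        if self.entry is not None:
            req, self.entry = self.entry, None
            return req
        return new_requests.pop(0) if new_requests else None
```

Because the buffer holds only one entry and wins every arbitration, the retried request is guaranteed to re-read the STAG (now holding corrected data) before any newer request can disturb it.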
Advantageous effects
With the above technical scheme, compared with the prior art, the invention has the following advantages and positive effects. Using the idea of parallel execution, the hit judgment and the read-data check that were executed serially in the traditional design are executed in parallel, shortening the time needed to obtain the result. Because it is not yet known during the hit judgment whether the read data is correct, it cannot be determined on the spot whether the hit result is trustworthy; the true hit result must therefore be judged comprehensively by combining the check result with the hit result. Data with a single error is corrected automatically by the hardware at the appropriate pipeline stage, and for a request affected by a single error, the judgment rule determines whether it must enter a single-error retry to obtain correct data. Because the maximum delay of stage 2 is reduced, the design frequency achievable by the pipeline of the second-level cache control unit is increased.
Drawings
FIG. 1 is a schematic diagram of the pipeline stages of a request in the prior art;
FIG. 2 is a schematic diagram of the STAG interface as a request accesses it through the pipeline in the prior art;
FIG. 3 is a schematic diagram of the pipeline stages of the single-error retry flow of the present invention;
FIG. 4 is a schematic diagram of the STAG interface as a request accesses it through the pipeline in the present invention;
FIG. 5 is a flow chart of the method for judging whether a single-error retry is required in the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
The embodiment of the invention relates to a single-error handling method for the tag array of a second-level cache, comprising the following steps: check the read data of the tag array and judge the hit in parallel; judge comprehensively, from the check result and the hit result, whether the request can execute normally; and, for a request that cannot execute normally, enter a retry flow to obtain correct data.
This embodiment improves the existing STAG single-error handling method. The request's path through the pipeline of the second-level cache control unit is shown in fig. 3, and the read/write interface of the STAG as the request traverses the pipeline is shown in fig. 4. The specific operations are:
1) At stage 0, the request arbitrates for the pipeline of the second-level cache control unit to obtain the port right to access the STAG;
2) At stage 1, the request reads the STAG and obtains the read data;
3) At stage 2, the read data is checked and, if a single error exists, corrected data is generated; in parallel, the hit judgment of the second-level cache is performed;
4) At stage 3, the execution of the request is determined by combining the check result with the hit result, according to the flow shown in fig. 5:
a) If the check finds a multiple error, the hardware cannot correct it, and the machine-check error is handled by the operating system; the request is then considered ended;
b) If the check finds a single error, processing proceeds according to the hit result;
c) If the check finds a single error and there is no hit, the hit result is untrusted, and the single-error retry flow must be entered;
d) If the check finds a single error and the single error is on the hit way, the hit result is untrusted, and the single-error retry flow must be entered;
e) If the check finds a single error and the single error is not on the hit way, the hit result is trusted, and the request completes normal execution; the hardware writes the corrected data back to the STAG.
5) At stage 4, for the cache way with a single error, the hardware writes the corrected data back into the STAG;
6) If the request enters the single-error retry flow, it is written into the retry buffer at stage 3; the retry buffer re-applies for the pipeline according to the arbitration rule, and correct data is obtained by accessing the STAG again;
7) The single-error retry buffer holds a single entry and arbitrates for the pipeline with the highest priority; the second-level cache control unit accepts no new request until the single-error retry completes normally;
8) If the request ends normally and hits the second-level cache, data is returned to the requester at stage 3.
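The retry loop of steps 3) to 7) can be sketched end to end. The model below is a simplification with hypothetical names: a "single error" is any way whose raw read differs from its ECC-corrected value, and the write-back of step 5) is folded into the retry path.

```python
def run_request(array, corrected, request_tag):
    """One request against a simplified STAG.
    array:     tags as read (possibly corrupted by a single-bit upset)
    corrected: the ECC-corrected tags computed at stage 2
    The hit compare uses the raw read, in parallel with the check; if the
    combined judgment is 'retry', the corrected tags are written back and
    the request re-arbitrates.  Returns (hit_way, pipeline_passes)."""
    passes = 0
    while True:
        passes += 1
        check = ["single" if a != c else "ok" for a, c in zip(array, corrected)]
        hit = next((i for i, a in enumerate(array) if a == request_tag), None)
        untrusted = "single" in check and (hit is None or check[hit] == "single")
        if not untrusted:
            return hit, passes
        array = list(corrected)   # stage 4: hardware writes corrected data back
```

A single error on the would-be hit way forces a second pass through the pipeline (the extra beats discussed below), while an error on a non-hit way costs nothing extra.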
It can be seen that, if no STAG check error occurs, the flow proposed by the invention takes the same pipeline beats as the traditional method. If the request encounters a single error on a non-hit way, the access delay of the proposed flow is likewise the same as that of the traditional flow. If a single error occurs on the hit way, the proposed flow must traverse the pipeline and access the STAG twice, for a total access delay of 8 beats, 4 beats more than the traditional flow; but since the probability of a single error on the hit way is extremely low, the pipeline staging of requests is on the whole equivalent to the traditional method. Since the hardware likewise corrects single errors, the method matches the traditional one in operational stability and chip yield. However, by turning the serially executed hit judgment and read-data check of the traditional design into parallel execution, the time to obtain the result is shortened, the maximum delay of stage 2 is reduced, and the design frequency achievable by the pipeline of the second-level cache control unit is raised. According to the physical implementation results, the design frequency of the pipeline can be raised to about 1.6 GHz, an improvement of about 60% over the traditional method.
Claims (3)
1. A single-error handling method for a second-level cache tag array, comprising the following steps:
(1) Check the read data of the tag array and judge the hit in parallel;
(2) Judge comprehensively, from the check result and the hit result, whether the request can execute normally: if the check finds a single error, processing proceeds according to the hit result; if the hit result is a miss, or the single error is on the hit way, the hit result is untrusted and the request cannot execute normally; if the single error is not on the hit way, the hit result is trusted and the request can execute normally;
(3) For a request that cannot execute normally, enter a retry flow to obtain correct data, specifically: a request retry buffer holds the erroneous request and is allowed to re-arbitrate for the pipeline of the second-level cache control unit to obtain correct tag-array data.
2. The single-error handling method for a second-level cache tag array according to claim 1, wherein, if the check in step (2) finds a multiple error, the hardware cannot correct it, and the machine-check error is handled by the operating system.
3. The single-error handling method for a second-level cache tag array according to claim 1, wherein the request retry buffer is a single-entry buffer and arbitrates for the pipeline of the second-level cache control unit with the highest priority.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910859104.0A CN110597656B (en) | 2019-09-11 | 2019-09-11 | Check list error processing method of secondary cache tag array |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110597656A CN110597656A (en) | 2019-12-20 |
CN110597656B true CN110597656B (en) | 2023-08-08 |
Family
ID=68859039
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910859104.0A Active CN110597656B (en) | 2019-09-11 | 2019-09-11 | Check list error processing method of secondary cache tag array |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110597656B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105335247A (en) * | 2015-09-24 | 2016-02-17 | 中国航天科技集团公司第九研究院第七七一研究所 | Fault-tolerant structure and fault-tolerant method for Cache in high-reliability system chip |
CN109062613A (en) * | 2018-06-01 | 2018-12-21 | 杭州中天微系统有限公司 | Multicore interconnects L2 cache and accesses verification method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||