CN110597656B - Method for handling single-bit check errors in a second-level cache tag array - Google Patents

Method for handling single-bit check errors in a second-level cache tag array

Info

Publication number
CN110597656B
Authority
CN
China
Prior art keywords
request
hit
check
error
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910859104.0A
Other languages
Chinese (zh)
Other versions
CN110597656A (en)
Inventor
胡向东
尹飞
张晓东
路冬冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI HIGH-PERFORMANCE INTEGRATED CIRCUIT DESIGN CENTER
Original Assignee
SHANGHAI HIGH-PERFORMANCE INTEGRATED CIRCUIT DESIGN CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHANGHAI HIGH-PERFORMANCE INTEGRATED CIRCUIT DESIGN CENTER
Priority to CN201910859104.0A
Publication of CN110597656A
Application granted
Publication of CN110597656B
Legal status: Active

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 - Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1044 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices with specific ECC/EDC distribution
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/07 - Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/08 - Error detection or correction by redundancy in data representation, e.g. by using checking codes
    • G06F11/10 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F11/1008 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices
    • G06F11/1064 - Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's in individual solid state devices in cache or content addressable memories
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 - Addressing or allocation; Relocation
    • G06F12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 - Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0811 - Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 - Addressing or allocation; Relocation
    • G06F12/08 - Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0877 - Cache access modes
    • G06F12/0884 - Parallel mode, e.g. in parallel with main memory or CPU
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to a method for handling single-bit check errors in a second-level cache tag array, which comprises the following steps: checking the data read from the tag array and performing the hit judgment in parallel; comprehensively judging, from the check result and the hit result, whether the request can execute normally; and, for a request that cannot execute normally, entering a retry flow to obtain correct data. The invention can increase the design frequency of the chip.

Description

Method for handling single-bit check errors in a second-level cache tag array
Technical Field
The invention relates to the technical field of second-level cache access flows, and in particular to a method for handling single-bit check errors in a second-level cache tag array.
Background
Current general-purpose processors employ a hierarchical Cache architecture to bridge the ever-widening gap between the processor's computing performance and the rate at which main memory can supply data. The tag array of the second-level Cache stores the physical address corresponding to each Cache block, so that data in the second-level Cache can be mapped to data in main memory in a set-associative manner. Because the manufacturing process is not perfect, sporadic errors may occur when the tag array is read or written, which affects the correctness of the processor. To ensure the correctness of data read from and written to the second-level Cache tag array (STAG for short), the mainstream approach is to generate an ECC check code with a Hamming-code algorithm and write it together with the data, and to check the read data against the ECC check code whenever the array is read. When the check finds a multi-bit error, the hardware cannot correct it, and the error is reported so that software can handle it. When a check error occurs in the tag array and it is a single-bit error, the hardware corrects it automatically and uses the corrected value, and the number of errors is recorded for early warning; every request therefore needs to be checked to determine whether an ECC single-bit or multi-bit error has occurred.
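For illustration only, the following C sketch shows how a SEC-DED (single-error-correcting, double-error-detecting) Hamming check of the kind described above might classify the data read from a tag entry. The 26-bit tag width, the codeword layout, and the function names are assumptions made for this example, not details taken from the patent; __builtin_parity is a GCC/Clang builtin.

    #include <stdint.h>

    /* Assumed layout: bits 1..31 hold the Hamming codeword (power-of-two
     * positions are parity bits, the rest are the 26 tag bits); bit 0 holds
     * the overall parity bit used for double-error detection. */

    typedef enum { ECC_OK, ECC_SINGLE_CORRECTED, ECC_MULTI_ERROR } ecc_status;

    static uint32_t stag_encode(uint32_t tag26)
    {
        uint32_t cw = 0;
        int d = 0;
        for (int pos = 3; pos <= 31 && d < 26; pos++) {
            if ((pos & (pos - 1)) == 0)          /* power of two: parity slot */
                continue;
            if (tag26 & (1u << d))
                cw |= 1u << pos;
            d++;
        }
        uint32_t syn = 0;                        /* XOR of positions of set bits */
        for (int pos = 1; pos <= 31; pos++)
            if (cw & (1u << pos))
                syn ^= (uint32_t)pos;
        for (int b = 0; b < 5; b++)              /* choose parity bits so the syndrome is 0 */
            if (syn & (1u << b))
                cw |= 1u << (1u << b);
        if (__builtin_parity(cw))                /* overall parity bit at position 0 */
            cw |= 1u;
        return cw;
    }

    static ecc_status stag_check(uint32_t cw, uint32_t *corrected)
    {
        uint32_t syn = 0;
        for (int pos = 1; pos <= 31; pos++)
            if (cw & (1u << pos))
                syn ^= (uint32_t)pos;
        int parity_odd = __builtin_parity(cw);   /* parity over the whole codeword */

        if (syn == 0 && !parity_odd) {           /* no error */
            *corrected = cw;
            return ECC_OK;
        }
        if (parity_odd) {                        /* exactly one bit flipped: correctable */
            *corrected = cw ^ (syn ? (1u << syn) : 1u);
            return ECC_SINGLE_CORRECTED;
        }
        return ECC_MULTI_ERROR;                  /* double error: report, do not correct */
    }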
However, the probability of a tag-array check error measured on actual chips is very small, and most such errors are single-bit ECC errors. When a check error occurs in the tag array, the conventional handling is as follows: the request's progression through the pipeline of the second-level Cache control unit is shown in fig. 1, the request's path through the pipeline and the STAG read/write interface is shown in fig. 2, and the specific operations are:
1) At stage 0, the request arbitrates for the pipeline of the second-level Cache control unit to obtain the port right to access the STAG;
2) At stage 1, the request reads the STAG and obtains the read data;
3) At stage 2, the read data is checked, with the following possible cases:
a) If a multi-bit error is found, the hardware cannot correct it, and a machine-check error is reported for the operating system to handle;
b) If a single-bit error is found, the hardware corrects it automatically to obtain the correct data;
c) If no check error is found, the read data is already correct;
4) At stage 2, after the check completes, the correct data is used to judge whether the STAG hits;
5) At stage 3, if the STAG hits, data is returned to the requester.
It follows that, in the conventional handling of a request accessing the STAG, the hit result cannot be obtained until the check completes, whether or not a check error actually occurs. Both the check of the read data and the hit judgment have relatively large delays in physical implementation. If the two are implemented serially in the same stage, that stage tends to have the largest delay in the entire core, and this directly determines the highest frequency at which the chip can operate correctly. As a result, chips designed with the conventional STAG single-error handling method often run correctly at only about 1.0 GHz.
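To make the serial dependency concrete, the following short C sketch models stage 2 of the traditional flow, reusing the stag_encode and stag_check helpers from the previous sketch; the 8-way associativity and the function name are assumptions for the example only. The hit compare cannot begin until every way has been checked and corrected, which is why the two delays add up within one stage.

    #define STAG_WAYS 8                           /* assumed associativity, illustration only */

    /* Traditional stage 2: check (and correct) first, then compare, so the
     * ECC-check delay and the hit-compare delay stack on the critical path. */
    static int stage2_serial(const uint32_t raw_tag[STAG_WAYS],
                             uint32_t req_tag, int *hit_way)
    {
        uint32_t want = stag_encode(req_tag);     /* codeword the hit way must match */
        *hit_way = -1;
        for (int w = 0; w < STAG_WAYS; w++) {
            uint32_t good;
            if (stag_check(raw_tag[w], &good) == ECC_MULTI_ERROR)
                return -1;                        /* machine-check error, OS handles it */
            if (good == want)                     /* compare only after correction */
                *hit_way = w;
        }
        return *hit_way >= 0;                     /* 1 = hit, 0 = miss, -1 = fatal error */
    }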
Disclosure of Invention
The invention aims to provide a method for handling single-bit check errors in a second-level cache tag array that increases the design frequency of the chip.
The technical solution adopted by the invention to solve the above technical problem is as follows: a method for handling single-bit check errors in a second-level cache tag array, comprising the following steps:
(1) Checking the data read from the tag array and performing the hit judgment in parallel;
(2) Comprehensively judging, from the check result and the hit result, whether the request can execute normally;
(3) For a request that cannot execute normally, entering a retry flow to obtain correct data.
The judgment in step (2) is made as follows:
(a) If the check finds a multi-bit error, the hardware cannot correct it, and a machine-check error is reported for the operating system to handle;
(b) If the check finds a single-bit error, how the request completes is handled according to the hit result;
If the hit result in step (b) is a miss, or the single-bit error is on the hit way, the hit result is unreliable and the request cannot execute normally; if the single-bit error in step (b) is not on the hit way, the hit result is reliable and the request can execute normally.
In step (3), a request retry buffer is provided to hold the erroneous request, and it is allowed to re-arbitrate for the pipeline of the second-level cache control unit so as to obtain the correct tag array data.
The request retry buffer has a single entry and arbitrates for the pipeline of the second-level cache control unit with the highest priority.
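As a rough software model of the single-entry retry buffer and its highest-priority arbitration, the following C sketch is one possible illustration; the structure fields, the arbitrate function, and the request descriptor are assumptions for the example, not the patent's actual hardware.

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint64_t addr;          /* physical address of the access */
        int      requester;     /* id of the unit waiting for the data */
    } l2_request;

    typedef struct {
        bool       valid;       /* at most one outstanding single-error retry */
        l2_request req;
    } retry_buffer;

    /* Select the next request to enter the L2 pipeline: a pending retry, when
     * present, wins arbitration over every newly arriving request. */
    static const l2_request *arbitrate(retry_buffer *rb, const l2_request *new_req)
    {
        if (rb->valid)
            return &rb->req;    /* new requests are not accepted until the retry completes */
        return new_req;         /* may be NULL when nothing is waiting */
    }

    static void retry_complete(retry_buffer *rb) { rb->valid = false; }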
Advantageous effects
Owing to the adoption of the above technical solution, the invention has the following advantages and positive effects compared with the prior art. The invention applies the idea of parallel execution: the hit judgment and the check of the read data, which are executed serially in the traditional design, are instead executed in parallel, shortening the time needed to obtain the result. Because it is not yet known during the hit judgment whether the read data are correct, the trustworthiness of the hit result cannot be determined on its own; the real hit result must be judged comprehensively by combining the check result with the hit result. For data with a single-bit error, the hardware corrects the data automatically in a later pipeline stage, and for a request that saw a single-bit error, a judgment rule determines whether it must enter a single-error retry to obtain correct data. Because the maximum delay of stage 2 is reduced, the design frequency achievable by the pipeline of the second-level Cache control unit is increased.
Drawings
FIG. 1 is a schematic diagram of the pipeline stages of a request in the prior art;
FIG. 2 is a schematic diagram of the STAG interface accessed by a request through the pipeline in the prior art;
FIG. 3 is a schematic diagram of the pipeline stages of the single-error retry flow in the present invention;
FIG. 4 is a schematic diagram of the STAG interface accessed by a request through the pipeline in the present invention;
FIG. 5 is a flow chart for determining whether a single-error retry is required in the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
The embodiment of the invention relates to a method for handling single-bit check errors in a second-level cache tag array which, as shown in fig. 1, comprises the following steps: checking the data read from the tag array and performing the hit judgment in parallel; comprehensively judging, from the check result and the hit result, whether the request can execute normally; and, for a request that cannot execute normally, entering a retry flow to obtain correct data.
This embodiment improves the existing STAG single-error handling method. The request's progression through the pipeline of the second-level Cache control unit is shown in fig. 3, the request's path through the pipeline and the STAG read/write interface is shown in fig. 4, and the specific operations are as follows:
1) At stage 0, the request arbitrates for the pipeline of the second-level Cache control unit to obtain the port right to access the STAG;
2) At stage 1, the request reads the STAG and obtains the read data;
3) At stage 2, the read data is checked and, if a single-bit error is present, corrected data is generated; in parallel, the second-level Cache hit judgment is made;
4) At stage 3, how the request proceeds is determined by combining the check result and the hit result. The determination is made according to the flow shown in fig. 5 (see also the code sketch after this list):
a) If the check finds a multi-bit error, the hardware cannot correct it, and a machine-check error is reported for the operating system to handle. The request is then considered finished;
b) If the check finds a single-bit error, how the request completes is handled according to the hit result:
c) If the check finds a single-bit error and the request misses, the hit result is not trustworthy, and the single-error retry flow must be entered;
d) If the check finds a single-bit error and the error is on the hit way, the hit result is not trustworthy, and the single-error retry flow must be entered;
e) If the check finds a single-bit error and the error is not on the hit way, the hit result is reliable, and the request completes execution normally; the hardware writes the corrected data back to the STAG;
5) At stage 4, the hardware writes the corrected data of the Cache way that had the single-bit error back into the STAG;
6) If the request must enter the single-error retry flow, it is written into the retry buffer at stage 3; the retry buffer then re-applies for access to the pipeline according to the management rule, and correct data is obtained by reading the STAG again;
7) The single-error retry buffer holds a single entry and arbitrates for the pipeline with the highest priority; the second-level Cache control unit does not accept a new request until the single-error retry request completes normally;
8) If the request completes normally and hits the STAG, data is returned to the requester at stage 3.
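The decision rule of steps 4a) through 4e) can be summarized in the following C sketch, which combines the parallel check result and hit result at stage 3; the enum names and the per-way error mask are illustrative assumptions, not the patent's signal names.

    #include <stdint.h>

    typedef enum { REQ_DONE_OK, REQ_RETRY, REQ_MACHINE_CHECK } req_action;

    /* multi_error  : some way saw an uncorrectable multi-bit error
     * err_way_mask : one bit per way that saw a single-bit error
     * hit_way      : index of the hit way, or -1 on a miss           */
    static req_action decide(int multi_error, uint32_t err_way_mask, int hit_way)
    {
        if (multi_error)
            return REQ_MACHINE_CHECK;         /* a) uncorrectable, reported to the OS */
        if (err_way_mask == 0)
            return REQ_DONE_OK;               /* no check error: the hit result is valid as is */
        if (hit_way < 0)
            return REQ_RETRY;                 /* c) single error and a miss: hit result untrusted */
        if (err_way_mask & (1u << (unsigned)hit_way))
            return REQ_RETRY;                 /* d) single error on the hit way: untrusted */
        return REQ_DONE_OK;                   /* e) single error off the hit way: trust the hit */
    }

A request that receives REQ_RETRY is written into the single-entry retry buffer sketched earlier and re-arbitrates for the pipeline with the highest priority, matching steps 6) and 7) above.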
Therefore, if no STAG check error occurs for a request, the flow provided by the invention takes the same number of pipeline beats as the traditional method. If a request sees a single-bit error on a non-hit way, the access delay of the flow provided by the invention is the same as that of the traditional flow. If a single-bit error occurs on the hit way, the flow provided by the invention must go through the pipeline and access the STAG twice, for a total access delay of 8 beats, 4 beats more than the traditional flow; however, the probability of a single-bit error landing on the hit way is very low, so the pipeline occupancy of requests is effectively equivalent to the traditional method. Since hardware is still used to correct single-bit errors, the method is also equivalent to the traditional one in terms of operational stability and chip yield. However, by changing the hit judgment and the check of the read data from serial execution in the traditional design to parallel execution, the invention shortens the time needed to obtain the result, reduces the maximum delay of stage 2, and raises the design frequency achievable by the pipeline of the second-level Cache control unit. Physical implementation results show that the design frequency of the pipeline can be raised to about 1.6 GHz, an improvement of about 60 percent over the traditional method.

Claims (3)

1. A method for handling single-bit check errors in a second-level cache tag array, characterized by comprising the following steps:
(1) Checking the data read from the tag array and performing the hit judgment in parallel;
(2) Comprehensively judging, from the check result and the hit result, whether the request can execute normally; if the check finds a single-bit error, how the request completes is handled according to the hit result: if the hit result is a miss, or the single-bit error is on the hit way, the hit result is not trustworthy and the request cannot execute normally; if the single-bit error is not on the hit way, the hit result is reliable and the request can execute normally;
(3) For a request that cannot execute normally, entering a retry flow to obtain correct data, specifically: the erroneous request is held in a request retry buffer, which is allowed to re-arbitrate for the pipeline of the second-level cache control unit to obtain the correct tag array data.
2. The method for handling single-bit check errors in a second-level cache tag array according to claim 1, wherein, if a multi-bit error is found by the check in step (2), the hardware cannot correct it, and a machine-check error is reported for the operating system to handle.
3. The method of claim 1, wherein the request retry buffer has a single entry and arbitrates for the pipeline of the second-level cache control unit with the highest priority.
CN201910859104.0A 2019-09-11 2019-09-11 Method for handling single-bit check errors in a second-level cache tag array Active CN110597656B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910859104.0A CN110597656B (en) Method for handling single-bit check errors in a second-level cache tag array

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910859104.0A CN110597656B (en) Method for handling single-bit check errors in a second-level cache tag array

Publications (2)

Publication Number Publication Date
CN110597656A CN110597656A (en) 2019-12-20
CN110597656B true CN110597656B (en) 2023-08-08

Family

ID=68859039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910859104.0A Active CN110597656B (en) Method for handling single-bit check errors in a second-level cache tag array

Country Status (1)

Country Link
CN (1) CN110597656B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335247A (en) * 2015-09-24 2016-02-17 中国航天科技集团公司第九研究院第七七一研究所 Fault-tolerant structure and fault-tolerant method for Cache in high-reliability system chip
CN109062613A (en) * 2018-06-01 2018-12-21 杭州中天微系统有限公司 Multicore interconnects L2 cache and accesses verification method


Also Published As

Publication number Publication date
CN110597656A (en) 2019-12-20

Similar Documents

Publication Publication Date Title
US8051337B2 (en) System and method for fast cache-hit detection
US8589763B2 (en) Cache memory system
US9454422B2 (en) Error feedback and logging with memory on-chip error checking and correcting (ECC)
US20140068208A1 (en) Separately stored redundancy
US8566672B2 (en) Selective checkbit modification for error correction
US9727411B2 (en) Method and processor for writing and error tracking in a log subsystem of a file system
US9384091B2 (en) Error code management in systems permitting partial writes
CN103077095B (en) Error correction method and device for stored data and computer system
CN115509609A (en) Data processing apparatus and method
US8868517B2 (en) Scatter gather list for data integrity
US9323610B2 (en) Non-blocking commands
CN111625199A (en) Method and device for improving reliability of data path of solid state disk, computer equipment and storage medium
US9329926B1 (en) Overlapping data integrity for semiconductor devices
CN114741250A (en) System and method for validating a multi-level cache
US8533560B2 (en) Controller, data storage device and program product
CN110597656B (en) Method for handling single-bit check errors in a second-level cache tag array
CN110955916B (en) Data integrity protection method, system and related equipment
CN108021467B (en) Memory fault tolerance protection method, device, equipment and storage medium
US9288161B2 (en) Verifying the functionality of an integrated circuit
US6567952B1 (en) Method and apparatus for set associative cache tag error detection
US20120023388A1 (en) Parity Look-Ahead Scheme for Tag Cache Memory
CN112181703B (en) CAM supporting soft error retransmission mechanism between capacity processor and memory board and application method
US9348744B2 (en) Implementing enhanced reliability of systems utilizing dual port DRAM
US10579470B1 (en) Address failure detection for memory devices having inline storage configurations
US8533565B2 (en) Cache controller and cache controlling method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant