CN111833961A - On-chip memory online fault diagnosis system and method - Google Patents

Publication number
CN111833961A
Authority
CN
China
Prior art keywords
memory
chip
controller
chip memory
data backup
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010991704.5A
Other languages
Chinese (zh)
Other versions
CN111833961B (en)
Inventor
张力航
仇雨菁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Semidrive Technology Co Ltd
Original Assignee
Nanjing Semidrive Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Semidrive Technology Co Ltd filed Critical Nanjing Semidrive Technology Co Ltd
Priority to CN202010991704.5A priority Critical patent/CN111833961B/en
Publication of CN111833961A publication Critical patent/CN111833961A/en
Application granted granted Critical
Publication of CN111833961B publication Critical patent/CN111833961B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11CSTATIC STORES
    • G11C29/00Checking stores for correct operation ; Subsequent repair; Testing stores during standby or offline operation
    • G11C29/56External testing equipment for static stores, e.g. automatic test equipment [ATE]; Interfaces therefor

Landscapes

  • For Increasing The Reliability Of Semiconductor Memories (AREA)
  • Techniques For Improving Reliability Of Storages (AREA)

Abstract

An on-chip memory online fault diagnosis system comprises a memory access controller, a data backup controller, an address parser, an interleaver, a static memory tester and an on-chip memory. The memory access controller converts a received system request into the required address, write data, read/write control and read data signals; the data backup controller backs up the contents of a memory group that needs to be backed up to a redundant group; the address parser distributes the access requests of the memory access controller to logical memory groups; the interleaver maps access requests for logical memory groups to physical memory groups; the static memory tester periodically tests the on-chip memory; and the on-chip memory stores and reads data. The fault diagnosis coverage of the on-chip memory is improved without sacrificing the availability or performance of the system.

Description

On-chip memory online fault diagnosis system and method
Technical Field
The invention relates to the technical field of integrated circuits, in particular to an on-chip memory online fault diagnosis system and method.
Background
As automobiles become electrified, intelligent, connected and shared, automotive electronic and electrical systems are undergoing sweeping change. A completely new system architecture requires core controllers (such as SoCs and MCUs) with higher performance and a higher level of integration. As the integration level of vehicle-mounted SoCs and MCUs increases further, the capacity of the chip's internal memory grows significantly, and so does the probability of a functional failure of that memory. A novel on-chip memory diagnosis structure is therefore needed that achieves fault diagnosis with high coverage without the diagnosis process interfering with the performance and function of the running system.
Currently, ECC (Error Checking and Correcting) memory and memory built-in self-test technologies are widely applied in conventional vehicle electronic systems, especially in the core controller. As the integration of electronic systems continues to improve, the density of the internal memory increases rapidly, which requires a more advanced safety mechanism to ensure the functional safety of the memory. Meanwhile, the performance requirements on the core controller keep increasing, so a test method is needed that raises the fault diagnosis rate of the core controller's internal memory without sacrificing system performance or availability.
In the prior art, the fault diagnosis of the on-chip memory has the following modes:
1. The on-chip memory is tested as a whole by a Memory Built-In Self-Test (MBIST) circuit while the system is starting up or shutting down. This method tests permanent memory failures effectively and comprehensively, but it has an obvious drawback: because the test runs only at start-up or shutdown, permanent and random failures that arise while the system is running cannot be detected.
2. Extra storage space is added to hold check bits for the data, and permanent and random failures of the memory are detected and repaired using an Error Correction Code (ECC). This approach compensates for the shortcomings of scheme 1 to some extent, but since ECC can detect only a limited set of failure modes, its diagnostic coverage cannot meet the requirements of scenarios that demand a high functional safety level.
3. To make up for the deficiency of scheme 2, it may be combined with scheme 1 by running periodic MBIST tests on the on-chip memory during system operation to improve diagnostic coverage. However, due to the nature of MBIST, the original contents of the memory are lost once the test completes. The system must therefore back up the memory data before the test and restore it afterwards, and all users of the memory must suspend their work during the test. Although diagnostic coverage improves, the availability and performance of the system are greatly reduced. Fig. 1 is a schematic structural diagram of the on-chip memory online fault diagnosis system of a conventional core processor. As shown in Fig. 1, the memory access controller receives a system request and converts it into the address, write data, read/write control and read data signals it requires; these signals are connected directly to the memory to carry out write and read operations. When the memory is tested periodically, the memory access controller is suspended, the static memory tester tests the memory, and control of the memory is handed back to the memory access controller after the test is finished. In the process, the previous contents of the memory are all destroyed.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention aims to provide an on-chip memory online fault diagnosis system and method in which a redundant memory group is added and the contents of a group are backed up to the redundant group using a floating redundant-group sequence number, so that the fault diagnosis coverage of the on-chip memory is improved without sacrificing the availability or performance of the system.
In order to achieve the above object, the present invention provides an on-chip memory online failure diagnosis system, comprising a memory access controller, a data backup controller, an address resolver, an interleaver, a static memory tester, and an on-chip memory, wherein,
the memory access controller converts the received system request into a required address signal, a write data signal, a read-write control signal and a read data signal;
the data backup controller backs up the contents of the memory groups needing to be backed up to the redundancy groups;
the address parser distributes the access requests of the memory access controller to logical memory groups;
the interleaver maps access requests to logical groupings of memory to physical groupings of memory;
the static memory tester is used for periodically testing the on-chip memory;
the on-chip memory is used for storing and reading data.
The system further comprises an arbiter which arbitrates the access requests of the memory access controller and the data backup controller.
Further, the arbiter dynamically adjusts the priority of the memory access controller and the data backup controller according to the function and the fault tolerance time interval.
Further, the data backup controller backs up the contents of a physical memory group to a redundant group before that group is tested.
Further, the data backup controller monitors the write operations of the memory access controller during the data backup process.
Furthermore, the data backup controller backs up the group data using a floating redundant-group sequence number.
Further, the interleaver maps access requests for N logical memory groups to N+1 physical memory groups.
In order to achieve the above object, the present invention also provides an on-chip memory online fault diagnosis method, comprising the steps of,
grouping on-chip memories;
backing up the content of a memory packet to a redundant packet, and testing the memory packet;
and after the test is finished, testing the next memory group by taking the memory group as a new redundant group.
Further, the grouping of on-chip memories includes,
dividing the on-chip memory into a plurality of independent memory groups of equal size; according to the principle that the low order and the high order of the address are simultaneously interleaved, the access request is distributed to different groups with the highest probability;
when grouping is performed, one of the packets is selected as a redundant packet.
Furthermore, the content of the memory grouping is backed up to the redundancy grouping by adopting a mode of floating the sequence number of the redundancy grouping.
The on-chip memory online fault diagnosis system and method provided by the invention have the following technical effects:
1) online memory fault diagnosis is achieved with less additional memory overhead;
2) the diagnosis process does not damage the stored content of the memory;
3) the suspension time of normal functions caused by fault diagnosis is greatly reduced (even eliminated), and the system availability is improved;
4) reducing memory aging effects caused by increased memory access from online diagnostics;
5) providing an extremely high failure diagnosis rate and high coverage for the memory inside the core controller.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a schematic diagram of a conventional on-chip memory online fault diagnosis system of a core processor;
FIG. 2 is a schematic structural diagram of an on-chip memory online fault diagnosis system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a second embodiment of an on-chip memory online fault diagnosis system according to the present invention;
FIG. 4 is a flowchart of an on-chip memory online fault diagnosis method according to the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
The on-chip memory online fault diagnosis system and method of the invention are used in a core controller with high functional safety requirements. In the embodiments of the invention, the core controller includes, but is not limited to, microprocessor control chips such as an SoC (System on Chip) and an MCU (Microcontroller Unit).
Fig. 2 is a schematic structural diagram of an on-chip memory online fault diagnosis system according to an embodiment of the present invention, and as shown in fig. 2, the on-chip memory online fault diagnosis system of the present invention includes a memory access controller 101, a data backup controller 102, an address parser 103, an interleaver 104, an arbiter 105, a static memory tester 106, and an on-chip memory 107, wherein,
A memory access controller 101, which is connected to the data backup controller 102 and the address parser 103 and converts the received system request into the required address, write data, read/write control and read data signals.
A data backup controller 102 that backs up the contents of the memory group that needs to be backed up in the on-chip memory 107 to the redundant group and monitors the write operations of the memory access controller 101.
In the embodiment of the present invention, the data backup controller 102 needs to back up the contents of a physical memory group to the redundant group before that group is tested. The invention uses a floating sequence number for the redundant memory group. The first redundant group has sequence number N: group N is used to back up group 0, and group N replaces group 0 once the backup is finished. After the test of group 0 is finished, group 0 becomes the redundant group and is then used to back up group 1, and so on in a cycle. Because the redundant group floats, the accesses that data backup adds to each memory group are balanced, so that the memory groups age evenly.
In the embodiment of the present invention, the host may change the contents of the group being backed up while the data backup is in progress. The data backup controller 102 therefore needs to monitor the write operations of the memory access controller; if data is updated in an address segment that has already been backed up, the contents of the corresponding address in the redundant group must be updated at the same time.
An address parser 103 that receives access requests, such as the address, write data and read/write control signals of the memory access controller 101, and distributes them through the arbiter 105 to N logical memory groups.
An interleaver 104 that maps accesses to logical groupings of memory to physical groupings of memory.
In embodiments of the present invention, since each physical memory group may be assigned to a different logical group at different times, the interleaver maps accesses to N logical memory groups onto N+1 physical memory groups according to the current assignment.
A static memory tester 106 that periodically tests the on-chip memory 107.
And the on-chip memory 107 receives access requests of the memory access controller 101 and the data backup controller 102, and stores and reads data.
Fig. 3 is a schematic structural diagram of an on-chip memory online fault diagnosis system according to an embodiment of the present invention, and as shown in fig. 3, the on-chip memory online fault diagnosis system further includes an arbiter 105 that arbitrates access requests of the memory access controller 101 and the data backup controller 102, and dynamically adjusts priorities of the memory access controller 101 and the data backup controller 102.
In the embodiment of the present invention, when the storage access controller 101 and the data backup controller 102 need to access one memory group at the same time, the access of the storage access controller 101 and the data backup controller 102 is arbitrated; the priority of the storage access controller 101 and the data backup controller 102 may be dynamically adjusted by the storage access controller 101 according to the function and the Fault Tolerance Time Interval (FTTI).
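The arbitration described above can be illustrated with a minimal Python sketch. The description states only that priorities are adjusted according to function and FTTI; the concrete rule below (the host wins while there is slack, the backup wins once the remaining window is only just long enough to finish) and the names `arbitrate`, `cycles_left` and `backup_cycles_needed` are assumptions made for illustration, not the patented policy.

```python
def arbitrate(host_req: bool, backup_req: bool,
              cycles_left: int, backup_cycles_needed: int) -> str:
    """Return which requester is granted the memory group this cycle.

    cycles_left: cycles remaining before the FTTI-derived deadline (assumed).
    backup_cycles_needed: cycles the backup still needs to finish (assumed).
    """
    if host_req and backup_req:
        # Hypothetical rule: favour the host while there is slack, favour
        # the backup once the remaining window is just enough to finish.
        if cycles_left <= backup_cycles_needed:
            return "backup"
        return "host"
    if host_req:
        return "host"
    if backup_req:
        return "backup"
    return "idle"


print(arbitrate(True, True, cycles_left=100, backup_cycles_needed=10))
print(arbitrate(True, True, cycles_left=10, backup_cycles_needed=10))
```

With plenty of slack the first call grants the host; with no slack left the second call grants the backup controller.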
Fig. 4 is a flowchart of an on-chip memory online fault diagnosis method according to the present invention, and the on-chip memory online fault diagnosis method of the present invention will be described in detail with reference to fig. 4.
First, in step 401, the on-chip memory is divided into groups, and a redundant memory group is added to hold the original data of the memory during testing.
In the embodiment of the present invention, the on-chip memory 107 receives the instruction of the system, and groups the on-chip memory as follows:
The memory is divided into N independent groups of equal size, each of which can be given a fast, full-coverage test using a conventional MBIST algorithm. Accesses by the core controller are distributed to different groups with the highest probability by interleaving on the low-order and high-order address bits simultaneously. For example, if a 1 MB memory is divided into 16 groups of 64 KB each and the host accesses data 32 bits wide, then accesses whose address bits {Addr[19:18], Addr[3:2]} within Addr[19:0] equal 0, 1, 2, …, 15 are distributed to groups 0, 1, 2, …, 15 respectively. When grouping is performed, a group N is added to hold the original data of the memory during testing.
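The 1 MB / 16-group example can be written out as a short Python sketch. It computes the group number by concatenating Addr[19:18] and Addr[3:2], as in the example; the function name `group_index` is an assumption, and the mapping of the remaining address bits to an offset inside the group is not specified in the description and is left out.

```python
def group_index(addr: int) -> int:
    """Return the memory group (0..15) for a 20-bit byte address."""
    if not 0 <= addr < (1 << 20):
        raise ValueError("address outside the 1 MB range")
    high = (addr >> 18) & 0b11   # Addr[19:18]
    low = (addr >> 2) & 0b11     # Addr[3:2]
    return (high << 2) | low     # {Addr[19:18], Addr[3:2]}


# Four consecutive 32-bit words land in four different groups.
print([group_index(a) for a in (0x0, 0x4, 0x8, 0xC)])  # [0, 1, 2, 3]
```

Interleaving on the low-order bits spreads sequential word accesses across groups, while the high-order bits separate large address regions.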
In step 402, the contents of the memory packets are backed up to redundancy packets and the memory packets are tested.
In the embodiment of the present invention, the data backup controller 102 needs to back up the contents of a physical memory group to the redundant group before that group is tested. The invention uses a floating sequence number for the redundant memory group. The first redundant group has sequence number N: group N is used to back up group 0, and group N replaces group 0 once the backup is finished. After the test of group 0 is finished, group 0 becomes the redundant group and is then used to back up group 1, and so on in a cycle. Because the redundant group floats, the accesses that data backup adds to each memory group are balanced, so that the memory groups age evenly.
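The order in which groups are tested and which group serves as the spare at each step can be sketched in Python as follows. The generator name `floating_schedule` is hypothetical, and the wrap-around step (group N itself being tested with group N-1 as the spare) is an assumption drawn from the phrase "in a cycle".

```python
from itertools import islice


def floating_schedule(n_logical: int):
    """Yield (group_under_test, redundant_group) pairs indefinitely.

    Physical groups are numbered 0..n_logical; group n_logical is the
    initial redundant group, as in the description.
    """
    total = n_logical + 1
    redundant = n_logical
    target = 0
    while True:
        yield target, redundant
        redundant = target              # the tested group becomes the spare
        target = (target + 1) % total   # move on to the next physical group


print(list(islice(floating_schedule(3), 5)))
# [(0, 3), (1, 0), (2, 1), (3, 2), (0, 3)]
```

Every physical group serves as the spare exactly once per cycle, which is what balances the extra backup traffic across the groups.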
In the embodiment of the present invention, the data backup controller 102 needs to monitor the write operations of the memory access controller; if data is updated in an address segment that has already been backed up, the contents of the corresponding address in the redundant group must be updated at the same time.
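A minimal Python sketch of this write monitoring is given below. It assumes a word-by-word copy with a progress counter; the class `BackupController` and its methods are hypothetical names, and real hardware would perform the same check on bus signals rather than on lists.

```python
class BackupController:
    """Copies `source` into `spare` word by word while snooping host writes."""

    def __init__(self, source: list, spare: list):
        self.source = source
        self.spare = spare
        self.copied = 0  # words [0, copied) are already backed up

    def copy_step(self) -> bool:
        """Back up one more word; return True when the backup is complete."""
        if self.copied < len(self.source):
            self.spare[self.copied] = self.source[self.copied]
            self.copied += 1
        return self.copied == len(self.source)

    def host_write(self, offset: int, value: int) -> None:
        """Host write observed by the controller during the backup."""
        self.source[offset] = value
        if offset < self.copied:
            # Already backed up: mirror the update into the redundant group.
            self.spare[offset] = value


source = [10, 11, 12, 13]
spare = [0, 0, 0, 0]
ctrl = BackupController(source, spare)
ctrl.copy_step()
ctrl.copy_step()
ctrl.host_write(0, 99)   # already copied: mirrored into the spare
ctrl.host_write(3, 77)   # not yet copied: picked up by a later copy step
while not ctrl.copy_step():
    pass
print(spare == source)   # True
```

Writes to words not yet copied need no mirroring, because the copy will read the updated value when it reaches them.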
In embodiments of the present invention, since each physical memory group may be assigned to a different logical group at different times, the interleaver 104 maps accesses to N logical memory groups onto N+1 physical memory groups according to the current assignment.
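One way to picture this mapping is a small remap table, sketched here in Python. The table-based design and the names `Interleaver`, `physical` and `swap_in_redundant` are assumptions; the description does not say how the interleaver stores the current assignment.

```python
class Interleaver:
    """Maps N logical groups onto N + 1 physical groups."""

    def __init__(self, n_logical: int):
        # Initially logical group i lives in physical group i,
        # and physical group N is the redundant group.
        self.logical_to_physical = list(range(n_logical))
        self.redundant = n_logical

    def physical(self, logical: int) -> int:
        """Return the physical group currently backing a logical group."""
        return self.logical_to_physical[logical]

    def swap_in_redundant(self, logical: int) -> int:
        """Redirect `logical` to the redundant group; return the freed group."""
        freed = self.logical_to_physical[logical]
        self.logical_to_physical[logical] = self.redundant
        self.redundant = freed
        return freed


il = Interleaver(4)
freed = il.swap_in_redundant(0)
print(il.physical(0), freed, il.redundant)  # 4 0 0
```

After the swap, host accesses to logical group 0 reach physical group 4, and physical group 0 is free to be tested.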
In the embodiment of the present invention, the arbiter 105 needs to arbitrate the access requests of the memory access controller 101 and the data backup controller 102, and dynamically adjust the priorities of the memory access controller 101 and the data backup controller 102.
In the embodiment of the present invention, the static memory tester 106 periodically tests all the groups of the on-chip memory 107. For example, if the first redundant group has sequence number N, group N is used to back up the contents of group 0 and replaces group 0 once the backup is completed; after the test of group 0 is finished, group 0 becomes the redundant group and is then used to back up group 1, and so on in a cycle.
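The whole back-up, remap and test cycle can be simulated in a few lines of Python to check that the logical contents survive one full rotation. This is a sketch under stated assumptions: `destructive_test` is a stand-in for an MBIST run rather than a real march algorithm, `run_rotation` is a hypothetical name, and raising an exception on a detected fault is a placeholder, since the description does not say how a fault is reported.

```python
def destructive_test(group: list) -> bool:
    """Stand-in for an MBIST run: overwrites every word, then checks it."""
    for pattern in (0x55, 0xAA):
        for i in range(len(group)):
            group[i] = pattern
        if any(word != pattern for word in group):
            return False
    return True


def run_rotation(physical: list, mapping: list, redundant: int) -> int:
    """Test every physical group once; return the final redundant group."""
    for _ in range(len(physical)):
        target = (redundant + 1) % len(physical)
        logical = mapping.index(target)            # logical group stored there
        physical[redundant][:] = physical[target]  # back up to the spare
        mapping[logical] = redundant               # remap accesses to the spare
        if not destructive_test(physical[target]):
            raise RuntimeError(f"fault detected in physical group {target}")
        redundant = target                         # tested group is the new spare
    return redundant


n, words = 3, 4
physical = [[g * 10 + i for i in range(words)] for g in range(n)] + [[0] * words]
mapping = list(range(n))
before = [list(physical[p]) for p in mapping]
redundant = run_rotation(physical, mapping, redundant=n)
after = [physical[p] for p in mapping]
print(after == before, mapping, redundant)  # True [2, 0, 1] 3
```

The test overwrites each group it visits, yet the data seen through the logical mapping is unchanged, which is the property the floating redundant group is meant to provide.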
In the on-chip memory online fault diagnosis system and method of the invention, a memory group N is added to hold the original data of the memory during testing, and before a physical memory group is tested its contents are backed up to the redundant group using a floating redundant-group sequence number. Testing therefore neither interferes with the system's normal functional accesses to the memory nor destroys the memory's original contents.
Those of ordinary skill in the art will understand that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An on-chip memory online failure diagnosis system is characterized by comprising a memory access controller, a data backup controller, an address parser, an interleaver, a static memory tester and an on-chip memory, wherein,
the memory access controller converts the received system request into a required address signal, a write data signal, a read-write control signal and a read data signal;
the data backup controller backs up the contents of the memory groups needing to be backed up to the redundancy groups;
the address parser distributes the access requests of the memory access controller to logical memory groups;
the interleaver maps access requests to logical groupings of memory to physical groupings of memory;
the static memory tester is used for periodically testing the on-chip memory;
the on-chip memory is used for storing and reading data.
2. The on-chip memory online fault diagnosis system according to claim 1, further comprising an arbiter that arbitrates access requests of the memory access controller and the data backup controller.
3. The on-chip memory online failure diagnostic system of claim 2, wherein the arbiter dynamically adjusts the priority of the memory access controller and the data backup controller according to function and failure tolerance time interval.
4. The on-chip memory online failure diagnostic system of claim 1, wherein the data backup controller backs up contents of a physical grouping of memory to a redundancy grouping before the grouping is tested.
5. The on-chip memory online failure diagnosis system according to claim 4, wherein the data backup controller monitors the write operation of the storage access controller during data backup.
6. The system of claim 4, wherein the data backup controller performs backup of the packet data by using a redundant packet sequence number floating manner.
7. The on-chip memory online failure diagnostic system of claim 1, wherein the interleaver maps access requests to N logical groupings of memory to N +1 physical groupings of memory.
8. An on-chip memory online fault diagnosis method comprises the following steps,
grouping on-chip memories;
backing up the content of a memory packet to a redundant packet, and testing the memory packet;
and after the test is finished, testing the next memory group by taking the memory group as a new redundant group.
9. The on-chip memory online fault diagnosis method according to claim 8, wherein the grouping of the on-chip memories comprises,
dividing the on-chip memory into a plurality of independent memory groups of equal size; according to the principle that the low order and the high order of the address are simultaneously interleaved, the access request is distributed to different groups with the highest probability;
when grouping is performed, one of the packets is selected as a redundant packet.
10. The on-chip memory online fault diagnosis method according to claim 8, wherein the backing up the contents of the memory packets to the redundancy packets is performed by backing up the contents of the memory packets to the redundancy packets in a manner that the sequence numbers of the redundancy packets float.
CN202010991704.5A 2020-09-21 2020-09-21 On-chip memory online fault diagnosis system and method Active CN111833961B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010991704.5A CN111833961B (en) 2020-09-21 2020-09-21 On-chip memory online fault diagnosis system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010991704.5A CN111833961B (en) 2020-09-21 2020-09-21 On-chip memory online fault diagnosis system and method

Publications (2)

Publication Number Publication Date
CN111833961A true CN111833961A (en) 2020-10-27
CN111833961B CN111833961B (en) 2021-03-23

Family

ID=72918478

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010991704.5A Active CN111833961B (en) 2020-09-21 2020-09-21 On-chip memory online fault diagnosis system and method

Country Status (1)

Country Link
CN (1) CN111833961B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424658A (en) * 2022-11-01 2022-12-02 南京芯驰半导体科技有限公司 Storage unit testing method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102034555A (en) * 2011-01-19 2011-04-27 哈尔滨工业大学 On-line error correcting device for fault by parity check code and method thereof
CN104143359A (en) * 2013-05-10 2014-11-12 全视技术有限公司 On-Line Memory Testing System And Method
CN104412327A (en) * 2013-01-02 2015-03-11 默思股份有限公司 Built in self-testing and repair device and method

Also Published As

Publication number Publication date
CN111833961B (en) 2021-03-23

Similar Documents

Publication Publication Date Title
US7200770B2 (en) Restoring access to a failed data storage device in a redundant memory system
US6754117B2 (en) System and method for self-testing and repair of memory modules
US8745450B2 (en) Fully-buffered dual in-line memory module with fault correction
JP5327484B2 (en) Method and apparatus for repairing high capacity / high bandwidth memory devices
US7304875B1 (en) Content addressable memory (CAM) devices that support background BIST and BISR operations and methods of operating same
US5109360A (en) Row/column address interchange for a fault-tolerant memory system
US5745673A (en) Memory architecture for solid state discs
US20050210186A1 (en) Semiconductor device
US20080282037A1 (en) Method and apparatus for controlling cache
CA1059239A (en) Memory diagnostic arrangement
EP1416499A1 (en) Self-repairing built-in self test for linked list memories
US9262284B2 (en) Single channel memory mirror
KR920001104B1 (en) Address line error test method
CN112543909A (en) Enhanced codewords for media persistence and diagnostics
CN111833961B (en) On-chip memory online fault diagnosis system and method
US20080013389A1 (en) Random access memory including test circuit
US7418636B2 (en) Addressing error and address detection systems and methods
JP2013250690A (en) Data processor, microcontroller, and self-diagnosis method of data processor
JPH1011348A (en) Controller for dram, and the dram
US11726864B2 (en) Data processing device and data processing method
US7437627B2 (en) Method and test device for determining a repair solution for a memory module
CN115220960A (en) DDR dual inline memory module, repairable memory system and operation method thereof
CN116992814A (en) Chip and electronic equipment
KR20230147684A (en) Logical memory recovery using shared physical memory
WO2008018989A2 (en) Fully- buffered dual in-line memory module with fault correction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant