JP2021111325A

JP2021111325A - Memory system for selecting counter-error operation by analyzing errors previously occurred, and data processing system including memory system

Info

Publication number: JP2021111325A
Application number: JP2020130222A
Authority: JP
Inventors: 應輔沈; Eung Bo Shim; 南永安; Nam Young Ahn
Original assignee: SK Hynix Inc
Current assignee: SK Hynix Inc
Priority date: 2020-01-07
Filing date: 2020-07-31
Publication date: 2021-08-02
Also published as: CN113157484A; US11609813B2; KR20210088916A; DE102020208450A1; US20210208966A1

Abstract

To provide a data processing system that selects a counter-error operation by analyzing previously occurring errors.
SOLUTION: The data processing system includes: a memory system 110 including a plurality of memory devices 1501 to 1508, each of which includes a first error correction unit and a plurality of cell array regions (memory banks BK<1> to BK<4>) in which a plurality of memory cells are coupled in an array form; and a host 102 including a second error correction unit that corrects errors in data transferred from the memory system. The host generates error correction information on the error correction operation of the second error correction unit, sets an error strength for each of the plurality of memory devices using the error correction information and log information generated in each of the plurality of memory devices, and performs a counter-error operation on each of the plurality of memory devices according to the error strength. In each of the plurality of memory devices, the first error correction unit corrects errors in access data that occur during access operations on the plurality of cell array regions, and log information on that error correction operation is generated.
SELECTED DRAWING: Figure 1A

Description

The present invention relates to a data processing system and, more specifically, to a memory system capable of selecting a counter-error operation by analyzing previously occurring errors, and to a data processing system including the memory system.

Computing devices and wired or wireless electronic devices, for example computing devices such as servers, desktop computers, and laptop computers, or electronic devices such as mobile phones, game consoles, televisions, and projectors, can generate and process very large amounts of data during operation. To store the data generated and processed in this way, a memory system that uses memory devices, in other words a data storage device, is generally used. The data storage device can serve as a main storage device or an auxiliary storage device of a computing device or an electronic device.

Meanwhile, a memory system may include a plurality of memory devices, and in the course of writing data to or reading data from the plurality of memory devices, errors may occur in which some data is not written or read correctly. In general, most erroneous data can be handled as normal data by applying an error recovery algorithm. However, there may be serious errors that cannot be handled even with an error recovery algorithm, and when such a serious error occurs, the reliability of the entire memory system can be greatly degraded.

Accordingly, predicting in advance when and in which region a serious error will occur can be extremely important. Conventionally, however, the occurrence of errors has been predicted statistically by counting the number of times errors occur, and the accuracy of such prediction has been very low.

Embodiments of the present invention provide a data processing system including a device and a method that, in a memory system including a plurality of memory devices, can analyze previously occurring errors and perform a counter-error operation on each of the plurality of memory devices.

A data processing system according to an embodiment of the present invention may include: a memory system including a plurality of memory devices, each having a first error correction unit and a plurality of cell array regions in which a plurality of memory cells are connected in an array form to a plurality of word lines and a plurality of bit lines; and a host having a second error correction unit. The second error correction unit corrects errors in data transferred from the memory system, and the host generates error correction information on the error correction operation of the second error correction unit, sets an error strength for each of the plurality of memory devices using the error correction information and log information generated in each of the plurality of memory devices, and performs a counter-error operation on each of the plurality of memory devices according to the error strength. In each of the plurality of memory devices, the first error correction unit may correct errors in access data that occur during access operations on the plurality of cell array regions, and the memory device may generate the log information on the error correction operation of the first error correction unit.

In addition, when an error occurs in access data during an access operation, including a read or write operation, each of the plurality of memory devices may operate its internal first error correction unit to correct the error, generate the log information by accumulating and storing, in an internal information storage area, raw data on the data whose errors were corrected by the first error correction unit, and output the log information to the host through the memory system at the host's request.

In addition, the host may include: an error collection unit that collects the error correction information in real time or at a set time point and collects the log information from the memory system at the set time point; a first error analysis unit that analyzes the log information and the error correction information to identify the number and type of errors that occurred in each of the plurality of memory devices and determines an error grade for each of the plurality of memory devices according to the result; a second error analysis unit that, according to the error grades determined by the first error analysis unit, determines the error strength of some of the plurality of memory devices by identifying the form and number of errors through additional analysis of the log information and the error correction information, and determines the error strength of the remaining memory devices so as to correspond to the error grade determined by the first error analysis unit; and a counter-operation unit that performs the counter-error operation on each of the plurality of memory devices according to the error strength determined by the second error analysis unit.

In addition, the first error analysis unit may classify, among the plurality of memory devices, a memory device in which the number of internally occurring errors is equal to or greater than a first reference number as a first memory device; classify a first memory device as a second memory device having a first error grade when the type of error that occurred is a first error occurring on a number of word lines equal to or greater than a second reference number; and classify a first memory device as a third memory device having a second error grade when the type of error that occurred is a type other than the first error.
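The grade classification described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the threshold values, the `DeviceErrorSummary` type, and the reading of a "first error" as errors spread over at least the second reference number of distinct word lines are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ErrorGrade(Enum):
    FIRST = 1   # "first error" on many word lines: a "second memory device"
    SECOND = 2  # any other error type: a "third memory device"


# Hypothetical values; the text names these only as reference numbers.
FIRST_REFERENCE_COUNT = 100  # minimum error count for a "first memory device"
SECOND_REFERENCE_COUNT = 4   # minimum word-line count for a "first error"


@dataclass
class DeviceErrorSummary:
    """Assumed per-device summary derived from log and correction information."""
    error_count: int          # total errors that occurred in the device
    affected_word_lines: int  # distinct word lines on which errors occurred


def classify_grade(summary: DeviceErrorSummary) -> Optional[ErrorGrade]:
    """Return the error grade, or None if the device is below the first threshold."""
    if summary.error_count < FIRST_REFERENCE_COUNT:
        return None  # not a "first memory device"; no grade is assigned
    if summary.affected_word_lines >= SECOND_REFERENCE_COUNT:
        return ErrorGrade.FIRST
    return ErrorGrade.SECOND
```

Devices that receive no grade are simply left out of the later analysis in this sketch; the text does not say how they are treated.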

In addition, each of the first and second error correction units may perform the error correction operation on data input to and output from each of the plurality of memory devices in units of codewords that include an error correction code (ECC). The second error analysis unit may classify a second memory device as a fourth memory device having a first error strength when the errors that occurred span a number of codewords equal to or greater than a third reference number and the total number of errors they contain is equal to or greater than a fourth reference number; classify a second memory device as a fifth memory device having a second error strength when the errors span a number of codewords equal to or greater than the third reference number and the total number of errors is less than the fourth reference number, or when the errors span fewer codewords than the third reference number; and assign the second error strength to the third memory device, thereby classifying it as a fifth memory device.
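As a rough illustration of the strength decision for a device with the first error grade, the sketch below takes a list of per-codeword error counts and applies the two thresholds. The threshold values and function names are hypothetical; only the comparison logic follows the text.

```python
from enum import Enum
from typing import List


class ErrorStrength(Enum):
    FIRST = 1   # "fourth memory device": a counter-error operation is selected
    SECOND = 2  # "fifth memory device"


# Hypothetical values; the text names these only as reference numbers.
THIRD_REFERENCE_COUNT = 2   # minimum number of codewords the errors must span
FOURTH_REFERENCE_COUNT = 8  # minimum total number of errors in those codewords


def strength_for_first_grade(errors_per_codeword: List[int]) -> ErrorStrength:
    """Decide the strength of a first-grade device from per-codeword error counts."""
    affected = [n for n in errors_per_codeword if n > 0]
    spans_enough = len(affected) >= THIRD_REFERENCE_COUNT
    total_enough = sum(affected) >= FOURTH_REFERENCE_COUNT
    if spans_enough and total_enough:
        return ErrorStrength.FIRST
    # Too few codewords, or enough codewords but too few errors in total.
    return ErrorStrength.SECOND


def strength_for_second_grade() -> ErrorStrength:
    """Second-grade devices receive the second strength without further analysis."""
    return ErrorStrength.SECOND
```

For example, with the assumed thresholds, `strength_for_first_grade([3, 0, 5, 1])` spans three codewords with nine errors in total and therefore yields the first error strength.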

In addition, the counter-operation unit may select, according to the state of the fourth memory device, any one of the following operations and perform it as the counter-error operation: an operation of selecting the region of the fourth memory device in which an error occurred and blocking access to it; an operation of selecting that region and repairing it; and an operation of selecting that region and disabling it.

In addition, the host may select and perform at least one of the following operations: an operation of designating, as the set time point, each time point that recurs at a specific time interval from when power is supplied to the memory system; an operation of counting the number of errors that occur in access data during access operations on the memory system and, each time the count exceeds a fifth reference number, designating that time point as the set time point and then resetting the error count; and an operation of designating, as the set time point, a time point at which an error correction operation for correcting an error in access data during an access operation on the memory system takes a specific time or longer.
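The three ways of designating the set time point can be sketched as a small trigger object. The class name, method names, and parameters below are hypothetical and chosen only for illustration; each method returns True when its condition designates a collection time point.

```python
class CollectionTrigger:
    """Hypothetical helper that decides when log information should be collected."""

    def __init__(self, interval_s: float, error_limit: int,
                 ecc_time_limit_s: float, power_on_s: float = 0.0) -> None:
        # interval_s must be positive, otherwise on_tick() would never terminate.
        self.interval_s = interval_s
        self.error_limit = error_limit            # stands in for the fifth reference number
        self.ecc_time_limit_s = ecc_time_limit_s  # stands in for the "specific time"
        self.next_due_s = power_on_s + interval_s
        self.error_count = 0

    def on_tick(self, now_s: float) -> bool:
        """Periodic trigger: fires once per interval counted from power-on."""
        if now_s < self.next_due_s:
            return False
        while self.next_due_s <= now_s:
            self.next_due_s += self.interval_s
        return True

    def on_error(self, new_errors: int = 1) -> bool:
        """Count trigger: fires when the count exceeds the limit, then resets it."""
        self.error_count += new_errors
        if self.error_count > self.error_limit:
            self.error_count = 0
            return True
        return False

    def on_ecc_done(self, duration_s: float) -> bool:
        """Latency trigger: fires when a correction took the specific time or longer."""
        return duration_s >= self.ecc_time_limit_s
```

A host could call `on_tick` from a timer and the other two methods from its error correction path, collecting log information whenever any of them returns True.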

A memory system according to an embodiment of the present invention may include: a plurality of memory devices, each having a first error correction unit and a plurality of cell array regions in which a plurality of memory cells are connected in an array form to a plurality of word lines and a plurality of bit lines, wherein the first error correction unit corrects errors in access data that occur during access operations on the plurality of cell array regions and each memory device generates log information on the error correction operation of the first error correction unit; and a controller having a second error correction unit, wherein the second error correction unit corrects errors in data transferred from the plurality of memory devices, and the controller generates error correction information on the error correction operation of the second error correction unit, sets an error strength for each of the plurality of memory devices using the log information and the error correction information, and performs, on each of the plurality of memory devices, a counter-error operation corresponding to the error strength.

In addition, when an error occurs in access data during an access operation, including a read or write operation, each of the plurality of memory devices may operate its internal first error correction unit to correct the error, generate the log information by accumulating and storing, in an internal information storage area, raw data on the data whose errors were corrected by the first error correction unit, and output the log information to the controller at the controller's request.

In addition, the controller may include: an error collection unit that collects the error correction information in real time or at a set time point and collects the log information from each of the plurality of memory devices at the set time point; a first error analysis unit that analyzes the log information and the error correction information to identify the number and type of errors that occurred in each of the plurality of memory devices and determines an error grade for each of the plurality of memory devices according to the result; a second error analysis unit that, according to the error grades determined by the first error analysis unit, determines the error strength of some of the plurality of memory devices by identifying the form and number of errors through additional analysis of the log information and the error correction information, and determines the error strength of the remaining memory devices so as to correspond to the error grade determined by the first error analysis unit; and a counter-operation unit that performs the counter-error operation on each of the plurality of memory devices according to the error strength determined by the second error analysis unit.

In addition, the controller may select and perform at least one of the following operations: an operation of designating, as the set time point, each time point that recurs at a specific time interval from when power is supplied; an operation of counting the number of errors that occur in access data during access operations on the plurality of memory devices and, each time the count exceeds a fifth reference number, designating that time point as the set time point and then resetting the error count; and an operation of designating, as the set time point, a time point at which an error correction operation for correcting an error in access data during an access operation on the plurality of memory devices takes a specific time or longer.

In addition, a method of operating a memory system according to an embodiment of the present invention, the memory system including a plurality of memory devices each having an error correction unit and a plurality of cell array regions in which a plurality of memory cells are connected in an array form to a plurality of word lines and a plurality of bit lines, may include: a generation step in which the error correction unit corrects errors in access data that occur during access operations on each of the plurality of memory devices and log information on the error correction operation of the error correction unit is generated; an analysis step of setting an error grade for each of the plurality of memory devices using the log information; and a response step of performing a counter-error operation on each of the plurality of memory devices according to the error grade.

In addition, the generation step may include: an operation step of operating the error correction unit to correct an error when the error occurs in access data during an access operation on each of the plurality of memory devices; and a step of generating the log information by accumulating and storing, in an information storage area included in each of the plurality of memory devices, raw data on the data whose errors were corrected by the error correction unit in the operation step.

In addition, the analysis step may include: a collection step of collecting, at each set time point, the log information stored in the information storage area; and an error analysis step of analyzing the log information collected in the collection step to identify the number and type of errors that occurred in each of the plurality of memory devices and determining an error grade for each of the plurality of memory devices according to the result.

In addition, the error analysis step may include: a step of classifying, among the plurality of memory devices, a memory device in which the number of internally occurring errors is equal to or greater than a first reference number as a first memory device; a step of classifying a first memory device as a second memory device having a first error grade when the type of error that occurred is a first error occurring on a number of word lines equal to or greater than a second reference number; and a step of classifying a first memory device as a third memory device having a second error grade when the type of error that occurred is a type other than the first error.

In addition, the response step may select, according to the state of the second memory device, any one of the following operations and perform it as the counter-error operation: an operation of selecting the region of the second memory device in which an error occurred and blocking access to it; an operation of selecting that region and repairing it; and an operation of selecting that region and disabling it.

In addition, the method may further include at least one of the following steps: a step of designating, as the set time point, each time point that recurs at a specific time interval from when power is supplied; a step of counting the number of errors that occur in access data during access operations on the plurality of memory devices and, each time the count exceeds a fifth reference number, designating that time point as the set time point and then resetting the error count; and a step of designating, as the set time point, a time point at which an error correction operation for correcting an error in access data during an access operation on the plurality of memory devices takes a specific time or longer.

According to the present technology, in a memory system including a plurality of memory devices, log information is generated for errors that occur during access operations on each of the memory devices, and the log information is then analyzed on the basis of the number, type, and form of the errors to set a different error strength for each memory device, so that a counter-error operation can be performed on each of the plurality of memory devices.

This has the effect that a memory device, or a specific region of a memory device, in which a serious error is likely to occur can be predicted in advance among the plurality of memory devices, and a counter-error operation suited to it can be performed.

A diagram illustrating the configuration of the data processing system according to the first embodiment of the present invention.
A diagram illustrating the configuration of the memory system according to the second embodiment of the present invention.
A diagram illustrating the configuration of the memory system according to the third embodiment of the present invention.
Five diagrams, each illustrating the log information analysis operation according to an embodiment of the present invention.

Hereinafter, preferred embodiments of the present invention will be described with reference to the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms; the embodiments are provided so that the disclosure of the present invention is complete and so that those of ordinary skill in the art are fully informed of the scope of the invention.

FIG. 1A is a diagram illustrating the configuration of the data processing system according to the first embodiment of the present invention.

As shown in FIG. 1A, the data processing system according to the first embodiment of the present invention may include a host 102 and a memory system 110. The memory system 110 may include a plurality of memory devices 1501, 1502, 1503, 1504, 1505, 1506, 1507, and 1508. The host 102 may include an error collection unit 1021, a first error analysis unit 1023, a second error analysis unit 1024, a counter-operation unit 1025, and a host ECC 1026.

Each of the memory devices 1501 to 1508 may include a plurality of memory banks BK<1:4>, a memory ECC (ECC1 to ECC8, respectively), and an information storage area (PA1 to PA8, respectively).

For reference, FIG. 1A assumes that each of the memory devices 1501 to 1508 is a DRAM; if the memory devices are of another type, the detailed configuration may change. Specifically, each of the memory banks BK<1:4> may include a plurality of memory cells (CELL) connected in an array form to a plurality of word lines WL1, WL2, ..., WLX and a plurality of bit lines BL1, BL2, BL3, ..., BLY, and each memory cell can store at least one bit of data. That is, each of the memory banks BK<1:4> can be regarded as a "cell array region" in which a plurality of memory cells are provided in an array form. Accordingly, the expression "plurality of memory banks" assumes that the memory devices are DRAMs and may be replaced by "plurality of cell array regions" for other types of memory device. In short, the internal configuration of each of the memory devices 1501 to 1508 may be modified according to the characteristics of each memory device, the purpose for which the memory system 110 is used, or the specifications of the memory system 110 required by the host 102.

Each of the plurality of memory devices 1501 to 1508 provided in the memory system 110 can generate log information LOG_INFO when an error occurs during an access operation, for example a data read/write operation, and the error is recovered by the memory ECC (ECC1 to ECC8) provided inside that device. The log information LOG_INFO concerns the data whose error was recovered by the memory ECC. That is, during access operations, each of the memory devices 1501 to 1508 can generate the log information LOG_INFO by accumulating and storing, in its information storage area (PA1 to PA8), raw data associated with the errors of the access operations that the memory ECC recovered. Here, the raw data associated with an error and included in the log information LOG_INFO can mean any data that each of the memory devices 1501 to 1508 can generate in connection with the occurrence of the error. That is, the raw data can be data representing the time of occurrence, the location, the form, the type, and the number of errors. For example, it can be data representing the number of bits of the data in which the error occurred, the physical storage location of that data, the absolute time at which the error occurred, the range of the physical region in which the error occurred, and the type of the error. The information storage areas PA1 to PA8 can be storage spaces included in the form of registers in each of the memory devices 1501 to 1508. The information storage areas PA1 to PA8 can also be at least a partial space of at least one of the plurality of banks BK<1:4> provided in each of the memory devices 1501 to 1508.
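As an illustration only, the raw data listed above could be held in a record such as the following. This is a minimal sketch: the names `ErrorRecord` and `InfoStorageArea`, the field names, and the string used for the error type are assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class ErrorRecord:
    """One raw-data entry for an error recovered by a memory ECC (hypothetical layout)."""
    timestamp: float   # absolute time at which the error occurred
    bank: int          # bank BK<1:4> in which the error occurred
    row: int           # physical storage location: word line index
    column: int        # physical storage location: bit line index
    bit_count: int     # number of erroneous bits in the accessed data
    error_type: str    # hypothetical label for the type of error


@dataclass
class InfoStorageArea:
    """Hypothetical model of an information storage area such as PA1."""
    records: List[ErrorRecord] = field(default_factory=list)

    def accumulate(self, record: ErrorRecord) -> None:
        # Raw data is accumulated each time the memory ECC recovers an error.
        self.records.append(record)

    def log_info(self) -> Tuple[ErrorRecord, ...]:
        # The accumulated records together form the log information LOG_INFO.
        return tuple(self.records)
```

Whether such records live in registers or in part of a bank does not change this logical view.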

The host 102 can collect the raw data associated with the errors that occurred in each of the memory devices 1501 to 1508 provided in the memory system 110 through the following operations.

In the first operation, as described above, during access operations on each of the memory devices 1501 to 1508, the raw data associated with the errors of access operations recovered by the memory ECCs (ECC1 to ECC8) can be accumulated and stored as log information LOG_INFO in the information storage areas PA1 to PA8. The host 102 can therefore collect the log information LOG_INFO from the information storage area (PA1 to PA8) of each of the memory devices 1501 to 1508.

In the second operation, during access operations on each of the memory devices 1501 to 1508, for example data read operations, the host 102 can generate and collect, as error correction information ERR_CO_INFO, the raw data associated with the errors of access operations recovered by the host ECC 1026 provided inside the host 102. An access operation whose error was recovered by the host ECC 1026 can be assumed to be an access operation whose error could not be recovered by the memory ECC (ECC1 to ECC8) provided in the corresponding memory device. Here, the raw data associated with an error and included in the error correction information ERR_CO_INFO can mean any data that the host ECC 1026 can generate in connection with the occurrence of the error. That is, the raw data can be data representing the time of occurrence, the location, the form, the type, and the number of errors. For example, it can be data representing the number of bits of the data in which the error occurred, the physical storage location of that data, the absolute time at which the error occurred, the range of the physical region in which the error occurred, and the type of the error.
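The relationship between the two collection paths can be sketched as follows. This is a simplified model under stated assumptions: each ECC is reduced to a single hypothetical limit (the number of error bits at or above which it fails), and the names `Outcome` and `classify_access` and the default limits are invented for this example.

```python
from enum import Enum


class Outcome(Enum):
    CLEAN = "no error"
    LOG_INFO = "recovered by memory ECC, recorded in LOG_INFO"
    ERR_CO_INFO = "recovered by host ECC, recorded in ERR_CO_INFO"
    UNCORRECTABLE = "recovered by neither ECC"


def classify_access(error_bits: int, memory_limit: int = 2, host_limit: int = 3) -> Outcome:
    """Decide where the raw data for one access operation would be recorded.

    memory_limit and host_limit are hypothetical: an ECC is assumed to fail
    when the number of error bits reaches its limit.
    """
    if error_bits == 0:
        return Outcome.CLEAN
    if error_bits < memory_limit:
        return Outcome.LOG_INFO       # the memory ECC recovered the error
    if error_bits < host_limit:
        return Outcome.ERR_CO_INFO    # the memory ECC failed, the host ECC recovered
    return Outcome.UNCORRECTABLE
```

With the default limits, a one-bit error is recorded in LOG_INFO and a two-bit error in ERR_CO_INFO.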

By analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO, the host 102 can determine the number, type, and form of the errors that occurred in each of the memory devices 1501 to 1508. The host 102 can therefore analyze the log information LOG_INFO and the error correction information ERR_CO_INFO on the basis of the number, type, and form of errors, and set a different error intensity for each of the memory devices 1501 to 1508 provided in the memory system 110. The host 102 can also perform a different error handling operation on each of the memory devices 1501 to 1508 according to its error intensity.

More specifically, the error collection unit 1021 provided in the host 102 can transmit a command for information collection (not shown) to the memory system 110 at a set time point, and can then receive and collect the log information LOG_INFO that is output, in response to the command, from the information storage areas PA1 to PA8 of the memory devices 1501 to 1508 provided in the memory system 110. The error collection unit 1021 can also collect the error correction information ERR_CO_INFO generated by the host ECC 1026, either in real time or at each set time point.
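A minimal sketch of this collection step is given below. The classes `MemoryDevice` and `ErrorCollector`, their methods, and the use of plain dictionaries as records are assumptions for illustration; the specification does not define the command or the record format.

```python
from typing import Dict, Iterable, List


class MemoryDevice:
    """Hypothetical stand-in for one of the memory devices 1501 to 1508."""

    def __init__(self, device_id: int) -> None:
        self.device_id = device_id
        self.info_area: List[dict] = []  # models the information storage area

    def read_log_info(self) -> List[dict]:
        # Response to the information-collection command (format assumed).
        return list(self.info_area)


class ErrorCollector:
    """Sketch of an error collection unit."""

    def __init__(self) -> None:
        self.log_info: Dict[int, List[dict]] = {}
        self.err_co_info: List[dict] = []

    def collect_logs(self, devices: Iterable[MemoryDevice]) -> None:
        # Called at a set time point: gather LOG_INFO from every device.
        for dev in devices:
            self.log_info[dev.device_id] = dev.read_log_info()

    def on_host_correction(self, record: dict) -> None:
        # Called in real time whenever the host ECC recovers an error.
        self.err_co_info.append(record)
```

Here LOG_INFO is pulled per device on command, while ERR_CO_INFO is pushed as it is produced. Collecting ERR_CO_INFO only at set time points would be an equally valid reading.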

Here, the host 102 can designate the set time point by selecting at least one of the following time points.

First, time points that recur at a specific time interval, starting from when power is supplied to the memory system 110, can be designated as the set time point.

Second, the number of errors that occur during access operations on the memory system 110, that is, on the memory devices 1501 to 1508, can be counted, and each time the count exceeds a predetermined reference number, the time at which it is exceeded can be designated as the set time point. The count can be reset each time the set time point is designated. For example, the errors can be counted by the host ECC 1026.

Third, an error recovery operation can be performed to recover an error that occurred during an access operation on the memory system 110, that is, on the memory devices 1501 to 1508, and a time point at which the error recovery operation takes a specific time or longer can be designated as the set time point. Here, an error recovery operation taking the specific time or longer can mean that, in the course of recovering the error, a relatively simple error recovery operation using a Hamming code failed and a relatively complex error recovery operation using a Reed-Solomon code was used. For example, the error recovery operation can be performed by the host ECC 1026.
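The three kinds of set time point described above can be sketched together as a small trigger object. The class name `CollectionTrigger`, its parameters, and the choice to measure time in seconds from power-on are assumptions made for this example. The sketch resets the error count only when the count-based trigger fires, which is one possible reading of the text.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionTrigger:
    interval_s: float        # period of the power-on-relative trigger
    error_threshold: int     # predetermined reference number of errors
    slow_recovery_s: float   # recovery time regarded as "a specific time"
    error_count: int = 0
    next_due_s: float = field(init=False)

    def __post_init__(self) -> None:
        # Power-on is taken as time 0, so the first periodic point is one interval later.
        self.next_due_s = self.interval_s

    def on_tick(self, now_s: float) -> bool:
        """First kind: fires at each multiple of the interval since power-on."""
        if now_s < self.next_due_s:
            return False
        while self.next_due_s <= now_s:
            self.next_due_s += self.interval_s
        return True

    def on_error(self) -> bool:
        """Second kind: fires each time the error count exceeds the reference number."""
        self.error_count += 1
        if self.error_count > self.error_threshold:
            self.error_count = 0  # the count is reset when the time point is designated
            return True
        return False

    def on_recovery(self, duration_s: float) -> bool:
        """Third kind: fires when a recovery operation took the specific time or longer."""
        return duration_s >= self.slow_recovery_s
```

A caller would issue the information-collection command whenever any of the three methods returns `True`.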

The error collection unit 1021 can store the log information LOG_INFO and the error correction information ERR_CO_INFO in a set space inside the host 102. As shown in the drawing, the set space can be a separate storage area inside the error collection unit 1021. Although not directly shown in the drawing, the set space inside the host 102 can also be a specific storage space of a host memory that is included in the host 102 and used as the working memory of the host 102. The first error analysis unit 1023 can analyze the log information LOG_INFO and the error correction information ERR_CO_INFO collected by the error collection unit 1021 to confirm the number and type of errors that occurred in each of the memory devices 1501 to 1508, and can determine an error grade for each of the memory devices 1501 to 1508 according to the confirmed number and type of errors. The information on the error grade determined by the first error analysis unit 1023 for each of the memory devices 1501 to 1508 can be stored in the set space inside the host 102.
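For illustration, a first-stage analysis that grades each device by the number and type of its errors might look like the sketch below. The grade labels, the threshold `count_limit`, and the error-type strings are hypothetical; the passage above only says that the grade depends on the number and type of errors.

```python
from collections import Counter
from typing import Dict, FrozenSet, Iterable, Tuple


def grade_devices(
    events: Iterable[Tuple[int, str]],
    device_ids: Iterable[int],
    count_limit: int = 10,
    severe_types: FrozenSet[str] = frozenset({"multi_bit"}),
) -> Dict[int, str]:
    """Assign a hypothetical error grade ("low", "medium", "high") to each device.

    events holds (device_id, error_type) pairs taken from LOG_INFO and ERR_CO_INFO.
    """
    events = list(events)  # the pairs are scanned twice below
    counts = Counter(dev for dev, _ in events)
    severe = {dev for dev, kind in events if kind in severe_types}

    grades: Dict[int, str] = {}
    for dev in device_ids:
        if dev in severe:
            grades[dev] = "high"      # graded by the type of error
        elif counts[dev] > count_limit:
            grades[dev] = "medium"    # graded by the number of errors
        else:
            grades[dev] = "low"
    return grades
```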

The second error analysis unit 1024 can select some of the memory devices 1501 to 1508 according to the error grade determined for each of them by the first error analysis unit 1023. For the selected memory devices, the second error analysis unit 1024 can determine the error intensity by confirming the form and number of errors through additional analysis of the log information LOG_INFO and the error correction information ERR_CO_INFO. For the remaining memory devices, it can determine the error intensity so that it corresponds to the error grade determined by the first error analysis unit 1023. To do so, the second error analysis unit 1024 can read the error grade determined by the first error analysis unit 1023 for each of the memory devices 1501 to 1508 from the set space inside the host 102. The information on the error intensity determined by the second error analysis unit 1024 for each of the memory devices 1501 to 1508 can likewise be stored in the set space inside the host 102.
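The two-stage idea can be sketched as follows: only devices whose grade is in a selected set receive the additional analysis, and every other device gets an intensity mapped directly from its grade. The numeric intensity scale, the grade labels, and the rule "errors spread over more than one row are more severe" are assumptions chosen purely for this example.

```python
from typing import Dict, List, Sequence, Tuple


def assign_intensity(
    grades: Dict[int, str],
    locations: Dict[int, List[Tuple[int, int]]],
    selected_grades: Sequence[str] = ("high",),
) -> Dict[int, int]:
    """Map each device to a hypothetical error intensity (larger means more severe).

    locations holds the (row, column) of each logged error per device.
    """
    base = {"low": 1, "medium": 2, "high": 3}
    intensity: Dict[int, int] = {}
    for dev, grade in grades.items():
        if grade not in selected_grades:
            # Remaining devices: intensity follows the first-stage grade.
            intensity[dev] = base[grade]
            continue
        # Selected devices: additional analysis of the form of the errors.
        rows = {row for row, _ in locations.get(dev, [])}
        localized = len(rows) <= 1
        intensity[dev] = 3 if localized else 4
    return intensity
```

Restricting the detailed analysis to the selected devices keeps the cost of the second stage proportional to the number of suspect devices rather than to all eight.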

The handling operation unit 1025 can perform a different error handling operation on each of the memory devices 1501 to 1508 according to the error intensity determined for each of them by the second error analysis unit 1024. To do so, the handling operation unit 1025 can read the error intensity determined by the second error analysis unit 1024 for each of the memory devices 1501 to 1508 from the set space inside the host 102.

The host ECC 1026 can generate an error correction code (ECC) by performing an error correction encoding operation on data generated for storage in the memory system 110. The host 102 can transmit to the memory system 110 data in codeword units, in which the error correction code is included with the data to be stored. The memory system 110 can store the codeword-unit data input from the host 102 in the memory devices 1501 to 1508. The host ECC 1026 can also check whether an error has occurred in data input from the memory system 110 and, if an error has occurred, perform error correction decoding, that is, an error recovery operation, to restore the normal data as it was before the error occurred. Because the data transmitted from the host 102 to the memory system 110 is in codeword units, the data input from the memory system 110 to the host 102 can also be in codeword units. The host ECC 1026 can therefore perform the error recovery operation using the error correction code included in the codeword-unit data input to the host 102. If the number of error bits is equal to or greater than the correctable error-bit limit, the error recovery operation of the host ECC 1026 can fail, and the erroneous bits cannot be corrected. The host ECC 1026 can perform error correction using a Hamming code, a low-density parity-check (LDPC) code, a Bose-Chaudhuri-Hocquenghem (BCH) code, a turbo code, a Reed-Solomon code, a convolutional code, a recursive systematic code (RSC), or coded modulation such as trellis-coded modulation (TCM) or block coded modulation (BCM), but is not limited to these. The host ECC 1026 can include any code, circuit, module, system, or device for error correction.
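As a concrete instance of codeword-unit encoding and decoding, the following sketch implements the textbook Hamming(7,4) code, which corrects any single-bit error in a 7-bit codeword. It is offered only as an example of one of the codes named above; the specification does not state which code, codeword size, or bit layout the host ECC 1026 uses, and the function names are invented here.

```python
from typing import List, Tuple


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit codeword (parity bits at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers codeword positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword: List[int]) -> Tuple[List[int], int]:
    """Return (data bits, error position); position 0 means no error was detected."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # syndrome gives the 1-based error position
    if position:
        c[position - 1] ^= 1          # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], position
```

If two or more bits of the codeword are flipped, this decoder returns wrong data without noticing. That is the situation described above in which the number of error bits reaches the correctable limit.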

For reference, the host ECC 1026 provided in the host 102 and the memory ECCs (ECC1 to ECC8) provided in the memory devices 1501 to 1508 can differ in the size of the data for which they can correct errors. For example, the size of the data correctable by the host ECC 1026 can be larger than the size of the data correctable by the memory ECCs (ECC1 to ECC8). In addition, although the description above states that both the host ECC 1026 and the memory ECCs (ECC1 to ECC8) can perform error correction operations, this is only one embodiment and is not limiting. For example, an embodiment is equally possible in which the memory ECCs (ECC1 to ECC8) only check whether an error has occurred, while the host ECC 1026 performs both the error check and the error correction operation.

FIG. 1B is a diagram for describing the configuration of a data processing system according to a second embodiment of the present invention.

As shown in FIG. 1B, the data processing system according to the second embodiment of the present invention can include a host 102 and a memory system 110. The memory system 110 can include a controller 130 and a plurality of memory devices 1501 to 1508. The controller 130 can include an error collection unit 1301, a first error analysis unit 1303, a second error analysis unit 1304, a handling operation unit 1305, and a system ECC 1306.

For reference, FIG. 1B assumes that each of the memory devices 1501 to 1508 is a DRAM; if the memory devices are of another type, the detailed configuration can change. Specifically, each of the plurality of memory banks BK<1:4> can include a plurality of memory cells (CELL) connected in an array to a plurality of word lines WL1, WL2, ..., WLX and a plurality of bit lines BL1, BL2, BL3, ..., BLY, and each of the memory cells can store at least one bit of data. That is, each of the memory banks BK<1:4> can be regarded as a "cell array region" in which a plurality of memory cells are arranged in an array. The expression "plurality of memory banks" therefore assumes that the memory devices are DRAMs; for other types of memory device, it can be replaced by the expression "plurality of cell array regions". In summary, the internal configuration of each of the memory devices 1501 to 1508 can be redesigned according to the characteristics of each of the memory devices 1501 to 1508, the purpose for which the memory system 110 is used, or the specifications of the memory system 110 required by the host 102.

Each of the plurality of memory devices 1501 to 1508 provided in the memory system 110 can generate log information LOG_INFO when an error occurs during an access operation, for example a data read/write operation, and the error is recovered by the memory ECC (ECC1 to ECC8) provided inside that device. The log information LOG_INFO concerns the data whose error was recovered by the memory ECC. That is, during access operations, each of the memory devices 1501 to 1508 can generate the log information LOG_INFO by accumulating and storing, in its information storage area (PA1 to PA8), raw data associated with the errors of the access operations that the memory ECC recovered. Here, the raw data associated with an error and included in the log information LOG_INFO can mean any data that each of the memory devices 1501 to 1508 can generate in connection with the occurrence of the error. That is, the raw data can be data representing the time of occurrence, the location, the form, the type, and the number of errors. For example, it can be data representing the number of bits of the data in which the error occurred, the physical storage location of that data, the absolute time at which the error occurred, the range of the physical region in which the error occurred, and the type of the error. The information storage areas PA1 to PA8 can be storage spaces included in the form of registers in each of the memory devices 1501 to 1508. The information storage areas PA1 to PA8 can also be at least a partial space of at least one of the plurality of banks BK<1:4> provided in each of the memory devices 1501 to 1508.

The controller 130 can collect the raw data associated with the errors that occurred in each of the memory devices 1501 to 1508 provided in the memory system 110 through the following operations.

In the first operation, as described above, during access operations on each of the memory devices 1501 to 1508, the raw data associated with the errors of access operations recovered by the memory ECCs (ECC1 to ECC8) can be accumulated and stored as log information LOG_INFO in the information storage areas PA1 to PA8. The controller 130 can therefore collect the log information LOG_INFO from the information storage area (PA1 to PA8) of each of the memory devices 1501 to 1508.

In the second operation, during access operations on each of the memory devices 1501 to 1508, for example data read operations, the controller 130 can generate and collect, as error correction information ERR_CO_INFO, the raw data associated with the errors of access operations recovered by the system ECC 1306 provided inside the controller 130. An access operation whose error was recovered by the system ECC 1306 can be assumed to be an access operation whose error could not be recovered by the memory ECC (ECC1 to ECC8) provided in the corresponding memory device. Here, the raw data associated with an error and included in the error correction information ERR_CO_INFO can mean any data that the system ECC 1306 can generate in connection with the occurrence of the error. That is, the raw data can be data representing the time of occurrence, the location, the form, the type, and the number of errors. For example, it can be data representing the number of bits of the data in which the error occurred, the physical storage location of that data, the absolute time at which the error occurred, the range of the physical region in which the error occurred, and the type of the error.

By analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO, the controller 130 can determine the number, type, and form of the errors that occurred in each of the memory devices 1501 to 1508. The controller 130 can therefore analyze the log information LOG_INFO and the error correction information ERR_CO_INFO on the basis of the number, type, and form of errors, and set a different error intensity for each of the memory devices 1501 to 1508 provided in the memory system 110. The controller 130 can also perform a different error handling operation on each of the memory devices 1501 to 1508 according to its error intensity.

More specifically, the error collection unit 1301 provided in the controller 130 can transmit a command for information collection (not shown) to the memory system 110 at a set time point, and can then receive and collect the log information LOG_INFO that is output, in response to the command, from the information storage areas PA1 to PA8 of the memory devices 1501 to 1508 provided in the memory system 110. The error collection unit 1301 can also collect the error correction information ERR_CO_INFO generated by the system ECC 1306, either in real time or at each set time point.

Here, the controller 130 can designate the set time point by selecting at least one of the following time points.

First, time points that recur at a specific time interval, starting from when power is supplied to the memory system 110, can be designated as the set time point.

Second, the number of errors that occur during access operations on the memory system 110, that is, on the memory devices 1501 to 1508, is counted, and each time the count exceeds a predetermined reference number, the moment at which it is exceeded can be designated as the set time point. The count can be reset each time a set time point is designated. For example, the system ECC 1306 can count the errors.

Third, an error recovery operation is performed to recover from an error that occurred during an access operation on the memory system 110, that is, on the memory devices 1501 to 1508, and the moment at which the error recovery operation takes a specific time or longer can be designated as the set time point. An error recovery operation taking the specific time or longer can mean that, while recovering from the error, a relatively simple recovery operation using a Hamming code failed and a relatively complex recovery operation using a Reed-Solomon code was used instead. For example, the system ECC 1306 can perform the error recovery operation.
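The count-based and recovery-time-based time points described above can be sketched as follows. This is only an illustration under stated assumptions: the class name `CollectionTrigger`, its fields, and the choice to reset the count on either trigger are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass
class CollectionTrigger:
    """Decides when a 'set time point' for collecting information is reached."""
    reference_count: int        # hypothetical: predetermined reference number
    recovery_time_limit: float  # hypothetical: the 'specific time', in seconds
    error_count: int = 0

    def on_error(self, recovery_time: float) -> bool:
        """Record one recovered error; return True if a set time point is reached."""
        self.error_count += 1
        # Second time point: the error count exceeds the reference number.
        count_exceeded = self.error_count > self.reference_count
        # Third time point: recovery took the specific time or longer.
        slow_recovery = recovery_time >= self.recovery_time_limit
        triggered = count_exceeded or slow_recovery
        if triggered:
            # Assumption: the count is reset whenever a set time point is designated.
            self.error_count = 0
        return triggered
```

In this sketch the caller would invoke `on_error` once per recovered error and start collecting LOG_INFO whenever it returns `True`.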

The error collecting unit 1301 can then store the log information LOG_INFO and the error correction information ERR_CO_INFO in a set space inside the controller 130. As shown in the drawing, this set space can be a separate storage area inside the error collecting unit 1301. Alternatively, although not directly shown in the drawing, the set space can be a specific storage space of a system memory that is included in the controller 130 and used as the working memory of the memory system 110.

The first error analysis unit 1303 then analyzes the log information LOG_INFO and the error correction information ERR_CO_INFO collected by the error collecting unit 1301, identifies the number and type of errors that have occurred in each of the memory devices 1501 to 1508, and determines an error grade for each memory device according to the identified number and type. Information on the error grades determined by the first error analysis unit 1303 can be stored in the set space inside the controller 130.

The second error analysis unit 1304 can then select some of the memory devices 1501 to 1508 according to the error grades determined by the first error analysis unit 1303. For the selected memory devices, the second error analysis unit 1304 performs an additional analysis of the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the form and number of errors, and determines the error strength from that analysis. For the remaining memory devices, it determines the error strength so that it corresponds to the error grade determined by the first error analysis unit 1303. To do so, the second error analysis unit 1304 can read the error grades determined by the first error analysis unit 1303 from the set space inside the controller 130, and information on the error strengths determined by the second error analysis unit 1304 can in turn be stored in that set space.
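The two-stage flow, in which only some devices receive the additional analysis, might be sketched as below. The selection rule (grade at or above a threshold), the integer encoding of grades and strengths, and the names `assign_error_strength`, `detailed_analysis`, and `grade_threshold` are all assumptions made for illustration; the specification only states that devices are selected according to their error grade.

```python
from typing import Callable, Dict


def assign_error_strength(
    grades: Dict[int, int],
    detailed_analysis: Callable[[int], int],
    grade_threshold: int,
) -> Dict[int, int]:
    """Map each device's error grade to an error strength.

    Devices whose grade reaches the (hypothetical) threshold are selected
    for additional analysis; the rest take a strength that simply
    corresponds to their grade.
    """
    strengths: Dict[int, int] = {}
    for device, grade in grades.items():
        if grade >= grade_threshold:
            # Selected device: strength comes from the additional analysis.
            strengths[device] = detailed_analysis(device)
        else:
            # Remaining device: strength corresponds directly to the grade.
            strengths[device] = grade
    return strengths
```

The point of the split is cost: the more expensive analysis of error form runs only on the devices whose grade suggests it is worthwhile.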

The handling operation unit 1305 can then perform a different error handling operation on each of the memory devices 1501 to 1508 according to the error strength determined for that device by the second error analysis unit 1304. To do so, the handling operation unit 1305 can read the error strengths determined by the second error analysis unit 1304 from the set space inside the controller 130.

The system ECC 1306 can perform an error correction encoding operation on data to be stored in each of the memory devices 1501 to 1508 to generate an error correction code (ECC). The controller 130 can transmit to each of the memory devices 1501 to 1508 data in codeword units, in which the error correction code is included with the data to be stored, and each of the memory devices 1501 to 1508 can store the codeword-unit data received from the controller 130. The system ECC 1306 can also check whether an error has occurred in data read from each of the memory devices 1501 to 1508 and, if so, perform error correction decoding, that is, an error recovery operation, to restore the data to its state before the error occurred. Because the data stored in each of the memory devices 1501 to 1508 is in codeword units, the data read from each of them can also be in codeword units. The system ECC 1306 can therefore perform the error recovery operation using the error correction code included in the codeword-unit data read from each memory device. If the number of error bits reaches or exceeds the limit of correctable error bits, however, the error recovery operation can fail and the system ECC 1306 cannot correct the erroneous bits. The system ECC 1306 can perform error correction using, for example, a Hamming code, an LDPC (low density parity check) code, a BCH (Bose, Chaudhuri, Hocquenghem) code, a turbo code, a Reed-Solomon code, a convolutional code, an RSC (recursive systematic code), or coded modulation such as TCM (trellis-coded modulation) or BCM (block coded modulation), but is not limited to these. The system ECC 1306 can also include any code, circuit, module, system, or device for error correction.
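To make the codeword and correction-limit ideas concrete, the following sketch uses a textbook Hamming(7,4) code, one of the code families named above. It is not the encoder of the system ECC 1306; the function names are hypothetical, and a real memory ECC would operate on much wider codewords.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (parity at positions 1, 2, 4)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers codeword positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data bits, syndrome) after correcting at most one flipped bit."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # parity check over positions 1, 3, 5, 7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # parity check over positions 2, 3, 6, 7
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]  # parity check over positions 4, 5, 6, 7
    # A nonzero syndrome is the 1-based position of a single flipped bit.
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]], syndrome
```

A single flipped bit is located by the syndrome and corrected. With two flipped bits the syndrome points at the wrong position and the decoder returns wrong data, which illustrates why recovery fails once the number of error bits exceeds what the code can correct.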

For reference, the system ECC 1306 in the controller 130 and the memory ECCs (ECC1 to ECC8) in the respective memory devices 1501 to 1508 can differ in the size of data they are able to correct. For example, the size of data correctable by the system ECC 1306 can be larger than the size of data correctable by the memory ECCs (ECC1 to ECC8). In addition, although the description above states that both the system ECC 1306 and the memory ECCs (ECC1 to ECC8) can perform error correction, this is only one embodiment and is not limiting. For example, an embodiment is equally possible in which the memory ECCs (ECC1 to ECC8) only check whether an error has occurred, while the system ECC 1306 performs both the error check and the error correction.

FIG. 1C is a diagram illustrating the configuration of a memory system according to a third embodiment of the present invention.

As shown in FIG. 1C, the memory system 110 according to the third embodiment of the present invention can include a plurality of memory devices 1501 to 1508. Each of the memory devices 1501 to 1508 can include a plurality of memory banks BK<1:4>, an error collecting unit 1511, an error analysis unit 1513, a handling operation unit 1515, a memory ECC 1516, and one of the information storage areas PA1 to PA8.

For reference, FIG. 1C assumes that each of the memory devices 1501 to 1508 is a DRAM; if the memory devices are of another type, the detailed configuration can change. Specifically, each of the memory banks BK<1:4> can include a plurality of memory cells (CELL) connected in an array to a plurality of word lines WL1, WL2, ..., WLX and a plurality of bit lines BL1, BL2, BL3, ..., BLY, and each memory cell can store at least one bit of data. In other words, each of the memory banks BK<1:4> can be regarded as a "cell array region" in which a plurality of memory cells are arranged in an array. The expression "plurality of memory banks" therefore assumes that the memory devices are DRAMs, and for other types of memory devices it can be replaced with the expression "plurality of cell array regions". In short, the internal configuration of each of the memory devices 1501 to 1508 can be changed by design according to the characteristics of the memory devices, the purpose for which the memory system 110 is used, the specifications of the memory system 110, and the like.

When an error occurs while one of the memory devices 1501 to 1508 in the memory system 110 performs an access operation, for example a data read or write operation on one of the memory banks BK<1:4>, and the error is recovered by the memory ECC (ECC1 to ECC8) inside that device, the device can generate log information LOG_INFO for the data whose error was recovered. That is, each of the memory devices 1501 to 1508 can generate the log information LOG_INFO by accumulating, in its information storage area PA1 to PA8, the raw data associated with errors that the memory ECC (ECC1 to ECC8) recovered during access operations. The error-related raw data included in the log information LOG_INFO can mean any data that a memory device can generate in connection with the occurrence of an error. In other words, it can be data indicating when the error occurred, where it occurred, its form, its type, and the number of occurrences. Examples include the number of bits of the data in which the error occurred, the physical storage location of that data, the absolute time at which the error occurred, the extent of the physical region in which the error occurred, and the type of the error. The information storage areas PA1 to PA8 can be storage spaces included in the respective memory devices 1501 to 1508 in the form of registers. Alternatively, each of the information storage areas PA1 to PA8 can be at least part of the space of at least one of the banks BK<1:4> in the corresponding memory device.
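One possible shape for the accumulated raw data is sketched below. The record layout, the field names, and the `InfoStorageArea` class are hypothetical; the specification lists the kinds of data that can be logged but does not define a format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ErrorRecord:
    """One raw-data entry for an error recovered by the memory ECC."""
    timestamp: float  # absolute time at which the error occurred
    bank: int         # physical location: memory bank index
    word_line: int    # physical location: word line index
    bit_line: int     # physical location: bit line index
    error_bits: int   # number of bits in error
    error_type: str   # type of the error


@dataclass
class InfoStorageArea:
    """Stands in for one of PA1 to PA8; accumulates records into LOG_INFO."""
    records: List[ErrorRecord] = field(default_factory=list)

    def log(self, record: ErrorRecord) -> None:
        """Accumulate the raw data of one recovered error."""
        self.records.append(record)

    def read_log_info(self) -> List[ErrorRecord]:
        """Return a copy of the accumulated log information."""
        return list(self.records)
```

Returning a copy from `read_log_info` keeps a collector from modifying the stored log by accident.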

Specifically, the error collecting unit 1511 can collect, in real time or at each set time point, the log information LOG_INFO generated by the memory ECC 1516 and stored in the information storage areas PA1 to PA8.

Here, the error collecting unit 1511 can designate the set time point by selecting at least one of the following time points.

Second, the number of errors that occur during access operations on the memory devices 1501 to 1508 is counted, and each time the count exceeds a predetermined reference number, the moment at which it is exceeded can be designated as the set time point. The count can be reset each time a set time point is designated. For example, the memory ECC 1516 can count the errors.

Third, an error recovery operation is performed to recover from an error that occurred during an access operation on the memory devices 1501 to 1508, and the moment at which the error recovery operation takes a specific time or longer can be designated as the set time point. An error recovery operation taking the specific time or longer can mean that, while recovering from the error, a relatively simple recovery operation using a Hamming code failed and a relatively complex recovery operation using a Reed-Solomon code was used instead. For example, the memory ECC 1516 can perform the error recovery operation.

The error analysis unit 1513 can then analyze the log information LOG_INFO collected by the error collecting unit 1511 to determine the number, type, and form of the errors that have occurred in each of the memory devices 1501 to 1508. Specifically, the error analysis unit 1513 identifies the number and type of errors that have occurred in each memory device and determines an error grade for each of the memory devices 1501 to 1508 according to the identified number and type. Information on the error grades determined by the error analysis unit 1513 can be stored in the information storage areas PA1 to PA8.

The handling operation unit 1515 can then perform a different error handling operation on each of the memory devices 1501 to 1508 according to the error grade determined for that device by the error analysis unit 1513. To do so, the handling operation unit 1515 can read the error grades determined by the error analysis unit 1513 from the information storage areas PA1 to PA8.

The memory ECC 1516 can perform an error correction encoding operation on data to be stored in each of the memory devices 1501 to 1508 to generate an error correction code (ECC). The memory ECC 1516 can store in the memory devices 1501 to 1508 data in codeword units, in which the error correction code is included with the data to be stored. The memory ECC 1516 can also check whether an error has occurred in data read from each of the memory devices 1501 to 1508 and, if so, perform error correction decoding, that is, an error recovery operation, to restore the data to its state before the error occurred. Because the data stored in each of the memory devices 1501 to 1508 is in codeword units, the data read from each of them can also be in codeword units. The memory ECC 1516 can therefore perform the error recovery operation using the error correction code included in the codeword-unit data read from each memory device. If the number of error bits reaches or exceeds the limit of correctable error bits, however, the error recovery operation can fail and the memory ECC 1516 cannot correct the erroneous bits. The memory ECC 1516 can perform error correction using, for example, a parity code, a Hamming code, an LDPC (low density parity check) code, a BCH (Bose, Chaudhuri, Hocquenghem) code, a turbo code, a Reed-Solomon code, a convolutional code, an RSC (recursive systematic code), or coded modulation such as TCM (trellis-coded modulation) or BCM (block coded modulation), but is not limited to these. The memory ECC 1516 can also include any code, circuit, module, system, or device for error correction.

FIGS. 2 to 5B are diagrams illustrating the log information analysis operation of a data processing system according to an embodiment of the present invention.

First, FIGS. 1A and 2 show how the error grade for each of the memory devices 1501 to 1508 is determined by analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO.

Specifically, when an error occurs while one of the memory devices 1501 to 1508 performs an access operation, for example a data read or write operation, and the error is recovered by the memory ECC (ECC1 to ECC8) inside that device, the device can generate log information LOG_INFO for the data whose error was recovered.

Likewise, when an error is recovered by the host ECC 1026 inside the host 102 during an access operation on one of the memory devices 1501 to 1508, for example a data read operation, the host 102 can generate error correction information ERR_CO_INFO for the data whose error was recovered by the host ECC 1026.

The host 102 can then collect and analyze the log information LOG_INFO and the error correction information ERR_CO_INFO. That is, the host 102 can determine the error grade and the error strength for each of the memory devices 1501 to 1508.

For reference, although not specifically shown in the drawings, the memory system 110 can further include a host interface (not shown) for transferring signals between the memory devices 1501 to 1508 and the host 102. That is, each of the memory devices 1501 to 1508 can output its internally generated log information LOG_INFO to the host 102 through the host interface.

Similarly, although not specifically shown in the drawings, the host 102 can further include a memory interface (not shown) for transferring signals between the memory system 110 and the other components 1021, 1023, 1024, 1025, and 1026 inside the host 102. That is, the host 102 can receive, through the memory interface, the log information LOG_INFO that the memory devices 1501 to 1508 output via the memory system 110.

Meanwhile, the first error analysis unit 1023 in the host 102 can analyze the log information LOG_INFO and the error correction information ERR_CO_INFO collected by the error collecting unit 1021, identify any of the memory devices 1501 to 1508 in which the number of errors is equal to or greater than a first reference number, and classify each such memory device as a "first memory device" (S10).

For example, assume that, among the plurality of memory devices 1501 to 1508, the number of errors recovered by the memory ECC (ECC1 to ECC8) or the host ECC 1026 while accessing the first memory device 1501 is 12, and that the number of errors recovered by the memory ECC (ECC1 to ECC8) or the host ECC 1026 while accessing each of the remaining memory devices 1502 to 1508 is less than 10. Also assume that the first reference number is 10. In this case, the first error analysis unit 1023 may classify the first memory device 1501 as a "first memory device" and may leave the error grade undetermined for each of the remaining memory devices 1502 to 1508.
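The threshold check of step S10 can be sketched as follows. This is only an illustrative sketch: the function name, the event tuple format, and the per-device error counts other than the 12 errors on device 1501 are assumptions and do not appear in the specification.

```python
from collections import Counter

FIRST_REFERENCE_COUNT = 10  # value assumed in the example above


def select_first_memory_devices(corrected_errors, threshold=FIRST_REFERENCE_COUNT):
    """Return IDs of devices whose recovered-error count is >= threshold (S10).

    corrected_errors: iterable of (device_id, source) tuples, where source is a
    hypothetical tag such as "memory_ecc" or "host_ecc".
    """
    counts = Counter(device_id for device_id, _source in corrected_errors)
    return sorted(dev for dev, n in counts.items() if n >= threshold)


# 12 recovered errors on device 1501, fewer than 10 on another device
events = [(1501, "memory_ecc")] * 7 + [(1501, "host_ecc")] * 5 + [(1502, "memory_ecc")] * 3
first_memory_devices = select_first_memory_devices(events)
print(first_memory_devices)
```

Devices that are not returned keep an undetermined error grade, as in the example above.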

Specifically, the first error analysis unit 1023 may analyze the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the types of errors that occurred in a memory device classified as a "first memory device" (YES in S10). At this time, the first error analysis unit 1023 may divide the types of errors that occurred in the memory device classified as a "first memory device" into errors occurring in word-line units (S20), errors occurring in single-bit units (S30), errors occurring in bit-line units (S40), and other errors (S50).

Here, an error occurring in word-line units (S20) may mean that at least two of the errors that occurred in the memory device classified as a "first memory device" occurred on the same word line in the same bank. An error occurring in single-bit units (S30) may mean that no more than one error occurred in the same word and on the same bit line. An error occurring in bit-line units (S40) may mean that at least two of the errors that occurred in the memory device classified as a "first memory device" occurred on the same bit line. Other errors (S50) may mean that at least two of the errors that occurred in the memory device classified as a "first memory device" have no particular distribution, for example, that they cannot be judged to have occurred in word-line units, single-bit units, or bit-line units.

When the first error analysis unit 1023 finds that the errors that occurred in the memory device classified as a "first memory device" are errors occurring in word-line units (S20), it may count the number of errors that occurred on the same word line in that memory device (S60). If the counted number of errors is equal to or greater than a second reference number (YES in S70), the memory device may be assigned a first error grade and classified as a "second memory device" (S90). If the counted number of errors is less than the second reference number (NO in S70), the memory device may be assigned a second error grade and classified as a "third memory device" (S80).

When the first error analysis unit 1023 finds that the memory device 1501 classified as a "first memory device" has only errors occurring in single-bit units (YES in S30), errors occurring in bit-line units (YES in S40), and other errors (YES in S50), it may assign the second error grade to that memory device and classify it as a "third memory device" (S80).

To summarize with an example, assume that the errors that occurred in the first memory device 1501 classified as a "first memory device" are errors occurring in word-line units, and that the number of errors that occurred on the same word line is equal to or greater than the second reference number. In that case, the first error analysis unit 1023 may assign the first error grade to the first memory device 1501 classified as a "first memory device" and classify it as a "second memory device".
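The grading flow of steps S20 to S90 can be sketched as below. This is a minimal sketch under stated assumptions: the record layout, the function name, the choice of the most-affected word line as the count taken in S60, and the second reference number of 4 used in the examples are all hypothetical and are not taken from the specification.

```python
from collections import Counter
from typing import NamedTuple


class ErrorRecord(NamedTuple):
    """Hypothetical log entry: where one corrected error was located."""
    bank: int
    word_line: int
    bit_line: int


def grade_first_memory_device(errors, second_reference_count):
    """Return 1 (first error grade) or 2 (second error grade).

    Assumption: the count taken in S60 is the number of errors on the
    most-affected word line of a bank.
    """
    per_word_line = Counter((e.bank, e.word_line) for e in errors)
    worst = max(per_word_line.values(), default=0)
    is_word_line_error = worst >= 2  # S20: two or more errors on one word line
    if is_word_line_error and worst >= second_reference_count:  # S70 YES
        return 1  # S90: "second memory device"
    return 2  # S80: "third memory device"


# Six errors on word line 5 of bank 0 (different bit lines)
word_line_burst = [ErrorRecord(0, 5, col) for col in range(6)]
# Six errors on bit line 7, each on a different word line
bit_line_burst = [ErrorRecord(0, row, 7) for row in range(6)]

print(grade_first_memory_device(word_line_burst, second_reference_count=4))
print(grade_first_memory_device(bit_line_burst, second_reference_count=4))
```

The function is meant to be called only for devices already classified as a "first memory device"; other devices keep an undetermined grade.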

As shown in FIGS. 1A, 2, and 3, the second error analysis unit 1024 included in the host 102 may select some of the plurality of memory devices 1501 to 1508 according to the error grade that the first error analysis unit 1023 determined for each of the plurality of memory devices 1501 to 1508. For the selected memory devices, the second error analysis unit 1024 may determine the error strength by identifying the form and number of errors through additional analysis of the log information LOG_INFO and the error correction information ERR_CO_INFO. For the remaining memory devices, it may determine the error strength so as to correspond to the error grade determined by the first error analysis unit 1023.

Specifically, the first error analysis unit 1023 has divided the plurality of memory devices 1501 to 1508 into memory devices whose error grade was not determined, "second memory devices" assigned the first error grade, and "third memory devices" assigned the second error grade.

The second error analysis unit 1024 may then check whether the error grade determined by the first error analysis unit 1023 is the first error grade (K10).

As a result of the check in operation K10, for a memory device that the first error analysis unit 1023 did not assign the first error grade (NO in K10), that is, a memory device whose error grade was not determined or a "third memory device" assigned the second error grade, the second error analysis unit 1024 may assign a second error strength and classify the device as a "fifth memory device" (K70). The counter-operation unit 1025 may then perform a second counter-error operation on the memory device that was assigned the second error strength and classified as a "fifth memory device" (K80).

As a result of the check in operation K10, for a memory device that the first error analysis unit 1023 assigned the first error grade (YES in K10), that is, a "second memory device", the second error analysis unit 1024 may determine the error strength after additionally analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the form and number of errors. Specifically, the second error analysis unit 1024 may additionally analyze the log information LOG_INFO and the error correction information ERR_CO_INFO for the "second memory device" assigned the first error grade and check whether the errors that occurred in the "second memory device" span (across) a third reference number or more of codeword units (K30).

The meaning of checking for errors that span ("across") codeword units is explained below with reference to FIGS. 4 to 5B.

First, as shown in FIG. 4, a codeword unit basically means the unit of data amount on which the host ECC 1026 provided in the host 102 bases its error correction operation. For example, when the host 102 generates 512 bits of data to be stored in the memory system 110 (401), the host ECC 1026 may perform an error correction encoding operation (402) on the 512 bits of data to generate a 64-bit error correction code (403). The host 102 may then divide the resulting total of 576 bits, that is, the 512 bits of internally generated data plus the 64-bit error correction code, into two codeword units for management (404). In other words, one codeword unit may contain 288 bits: 256 bits of data generated inside the host 102 and a 32-bit error correction code. The host 102 may output the 576 bits of data to the memory system 110 (405). For reference, although the drawing illustrates managing the 576 bits of data in two codeword units, the data may in practice be managed in fewer or more codeword units.

In FIG. 4, unlike the embodiment of FIG. 1, the memory system 110 is assumed to include a total of 18 memory devices (18 devices). The memory system 110 may distribute the 576 bits of data (405) input from the host 102 across the 18 memory devices, so that 32 bits of data may be stored in each of the 18 memory devices. In addition, because the host 102 manages the 576 bits as two codeword units, the memory system 110 may store data corresponding to both codeword units in each of the 18 memory devices. Accordingly, each of the 18 memory devices may store 16 bits of data corresponding to the first codeword unit (Codeword0) and 16 bits of data corresponding to the second codeword unit (Codeword1). That is, the memory system 110 may distribute the 288 bits of data corresponding to the first codeword unit (Codeword0) and the 288 bits of data corresponding to the second codeword unit (Codeword1) across the 18 memory devices.

When storing the 576 bits of data input from the host 102 in the 18 memory devices (18 devices), the memory system 110 may recognize that the 576 bits are consecutive data and may set a burst length (BurstLength, BL) for the store. Here, each of the 18 memory devices may be assumed to have four data input/output terminals (x4). Accordingly, the memory system 110 may store the 288 bits of data corresponding to the first codeword unit (Codeword0) in the 18 memory devices, 16 bits per device, designated as the upper burst length 4 (BL4), and may store the 288 bits of data corresponding to the second codeword unit (Codeword1) in the 18 memory devices, 16 bits per device, designated as the lower burst length 4 (BL4).

As described with reference to FIG. 4, the host 102 may output data managed in one or more codeword units to the memory system 110. The memory system 110 may then distribute the data input from the host 102 across the plurality of memory devices in a form that corresponds to the codeword units.
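The bit counts in the FIG. 4 example can be checked with a short calculation: 512 data bits plus a 64-bit error correction code give 576 bits, which is also 18 devices times 32 bits and 2 codeword units times 288 bits. The sketch below is illustrative only. The constant and function names are hypothetical, and the mapping of burst beats 0 to 3 to Codeword0 is an assumption, since the text only distinguishes an "upper" and a "lower" BL4.

```python
DATA_BITS = 512          # data generated by the host (401)
ECC_BITS = 64            # error correction code from the host ECC (403)
CODEWORDS = 2            # codeword units managed by the host (404)
DEVICES = 18             # memory devices assumed in FIG. 4
DQ_PER_DEVICE = 4        # x4 devices
BEATS_PER_CODEWORD = 4   # BL4 per codeword unit

total_bits = DATA_BITS + ECC_BITS
bits_per_codeword = total_bits // CODEWORDS
bits_per_device = total_bits // DEVICES
bits_per_device_per_codeword = DQ_PER_DEVICE * BEATS_PER_CODEWORD


def codeword_of(beat):
    """Map a burst beat index (0..7) to a codeword index.

    Assumption: beats 0..3 carry Codeword0 and beats 4..7 carry Codeword1.
    """
    return beat // BEATS_PER_CODEWORD


print(total_bits, bits_per_codeword, bits_per_device, bits_per_device_per_codeword)
```

Each device therefore contributes 16 bits to each codeword unit, and 18 such contributions make up one 288-bit codeword unit.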

As shown in FIG. 5A, and as described with reference to FIG. 4, the 32 bits of data in one memory device (Device) are divided into 16 bits for each of the two codeword units (Codeword0, Codeword1) and then read out through the four data input/output terminals (x4, DQ<0:3>). In the drawing, error bits (ERROR BIT) occur in the data read through specific data input/output terminals, for example, terminals DQ<1, 3>. That is, although the drawing does not show the cause of the errors, the errors in the data read through those specific terminals leave the error bits (ERROR BIT) spanning (across) the two codeword units (Codeword0, Codeword1).

As shown in FIG. 5B, as in FIG. 5A, the 32 bits of data in one memory device (Device) are divided into 16 bits for each of the two codeword units (Codeword0, Codeword1) and then read out through the four data input/output terminals (x4, DQ<0:3>). In the drawing, error bits (ERROR BIT) occur in the read data belonging to the first codeword unit (Codeword0), but no error occurs in the read data belonging to the second codeword unit (Codeword1). That is, although the drawing does not show the cause of the errors, the error bits (ERROR BIT) are contained in only one codeword unit (Codeword0) and do not span (across) the two codeword units (Codeword0, Codeword1).

Further, as shown in FIGS. 1A, 2, and 3, the operation (K30) in which the second error analysis unit 1024 additionally analyzes the log information LOG_INFO for the "second memory device" assigned the first error grade and checks whether the errors that occurred in the "second memory device" span (across) a third reference number or more of codeword units may, assuming the third reference number is 2, be an operation of checking whether the errors in the "second memory device" span two codeword units, as illustrated in FIG. 5A, or are contained in only one codeword unit, as illustrated in FIG. 5B.
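The check of operation K30 can be sketched as follows. This is an illustrative sketch only: the representation of an error bit as a (DQ, beat) pair, the function names, the beat-to-codeword mapping (beats 0 to 3 in Codeword0), and the exact error pattern used for the FIG. 5B case are assumptions.

```python
BEATS_PER_CODEWORD = 4  # BL4 per codeword unit, as in FIG. 4


def codewords_hit(error_bits):
    """Return the set of codeword indices touched by the given error bits.

    error_bits: iterable of (dq, beat) positions within one x4 device.
    Assumption: beats 0..3 belong to Codeword0, beats 4..7 to Codeword1.
    """
    return {beat // BEATS_PER_CODEWORD for _dq, beat in error_bits}


def spans_codewords(error_bits, third_reference_count=2):
    """K30: do the errors span at least third_reference_count codeword units?"""
    return len(codewords_hit(error_bits)) >= third_reference_count


# FIG. 5A style: DQ1 and DQ3 fail on every beat, so both codewords are hit
fig_5a = [(dq, beat) for dq in (1, 3) for beat in range(8)]
# FIG. 5B style (hypothetical pattern): errors only within the first four beats
fig_5b = [(dq, beat) for dq in range(4) for beat in range(4)]

print(spans_codewords(fig_5a), len(fig_5a))
print(spans_codewords(fig_5b))
```

With this representation, the FIG. 5A pattern contains 2 terminals times 8 beats, that is, 16 error bits spread over both codeword units.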

As a result of the check in operation K30, when the errors that occurred in the "second memory device" span the third reference number or more of codeword units (YES in K30), the second error analysis unit 1024 may check whether the total number of errors spanning the third reference number or more of codeword units is equal to or greater than a fourth reference number (K40). For example, assuming that the fourth reference number is 8, the total number of error bits spanning the two codeword units (Codeword0, Codeword1) in the example of FIG. 5A is 16, which is equal to or greater than the fourth reference number of 8.

As a result of the check in operation K40, when the errors that occurred in the "second memory device" span the third reference number or more of codeword units (YES in K30) and the total number of errors spanning those codeword units is equal to or greater than the fourth reference number (YES in K40), the second error analysis unit 1024 may assign a first error strength to that "second memory device" and classify it as a "fourth memory device" (K50). The counter-operation unit 1025 may then perform a first counter-error operation on the memory device that was assigned the first error strength and classified as a "fourth memory device" (K60).

As a result of the check in operation K30, when the errors that occurred in the "second memory device" are contained in fewer than the third reference number of codeword units (NO in K30), the second error analysis unit 1024 may assign the second error strength to that "second memory device" and classify it as a "fifth memory device" (K70). The counter-operation unit 1025 may then perform the second counter-error operation on the memory device that was assigned the second error strength and classified as a "fifth memory device" (K80).
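The decision flow of operations K10 to K70 can be summarized in a short sketch. The function name and argument names are hypothetical, and the default reference numbers (2 and 8) are the example values used in the text. The text above does not state the outcome for NO in K40; the sketch assumes that case also receives the second error strength.

```python
def error_strength(error_grade, codewords_spanned, spanning_error_bits,
                   third_reference_count=2, fourth_reference_count=8):
    """Return 1 (first error strength) or 2 (second error strength).

    error_grade: 1, 2, or None when the first analysis left it undetermined.
    """
    if error_grade != 1:                              # K10 NO
        return 2                                      # K70: "fifth memory device"
    if codewords_spanned < third_reference_count:     # K30 NO
        return 2                                      # K70
    if spanning_error_bits < fourth_reference_count:  # K40 NO
        return 2                                      # assumption: treated like K70
    return 1                                          # K50: "fourth memory device"


# Device with the first error grade and a FIG. 5A style pattern (16 bits over 2 codewords)
print(error_strength(1, codewords_spanned=2, spanning_error_bits=16))
# Device whose error grade was never determined
print(error_strength(None, codewords_spanned=0, spanning_error_bits=0))
```

A result of 1 would lead to the first counter-error operation (K60) and a result of 2 to the second counter-error operation (K80).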

The operation of the second error analysis unit 1024 is described below with an example.

First, as in the example described with reference to FIG. 2, it was assumed that the first error analysis unit 1023 assigned the first error grade to the first memory device 1501 among the plurality of memory devices 1501 to 1508 and classified it as a "second memory device", that it did not determine an error grade for the remaining memory devices 1502 to 1508, and that no memory device was assigned the second error grade and classified as a "third memory device".

In this case, for a memory device that the first error analysis unit 1023 did not assign the first error grade (NO in K10), that is, a memory device whose error grade was not determined or a "third memory device" assigned the second error grade, the second error analysis unit 1024 may assign the second error strength and classify the device as a "fifth memory device" (K70). Accordingly, the second error analysis unit 1024 may assign the second error strength to the remaining memory devices 1502 to 1508, whose error grade was not determined, and classify them as "fifth memory devices" (K70).

For a memory device that the first error analysis unit 1023 assigned the first error grade (YES in K10), that is, a "second memory device", the second error analysis unit 1024 may determine the error strength after additionally analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the form and number of errors. Accordingly, for the first memory device 1501, which was assigned the first error grade and classified as a "second memory device", the second error analysis unit 1024 may additionally analyze the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the form and number of errors and then determine the error strength.

Specifically, the second error analysis unit 1024 may additionally analyze the log information LOG_INFO and the error correction information ERR_CO_INFO for the first memory device 1501 and check whether the errors span the third reference number or more of codeword units (K30). Assume that, as a result, the errors that occurred in the first memory device 1501 span the third reference number or more of codeword units (YES in K30). The second error analysis unit 1024 may then check whether the total number of error bits contained in the errors spanning the third reference number or more of codeword units in the first memory device 1501 is equal to or greater than the fourth reference number (K40). Assume that, as a result, the number of errors spanning the third reference number or more of codeword units in the first memory device 1501 is equal to or greater than the fourth reference number (YES in K40). The second error analysis unit 1024 may therefore assign the first error strength to the first memory device 1501 and classify it as a "fourth memory device" (K50).

Meanwhile, the counter-operation unit 1025 included in the host 102 may perform a different counter-error operation on each of the plurality of memory devices 1501 to 1508 according to the error strength that the second error analysis unit 1024 determined for each of them.

Specifically, the counter-operation unit 1025 may perform the first counter-error operation on any of the plurality of memory devices 1501 to 1508 that the second error analysis unit 1024 assigned the first error strength and classified as a "fourth memory device". The counter-operation unit 1025 may also perform the second counter-error operation on any of the plurality of memory devices 1501 to 1508 that the second error analysis unit 1024 assigned the second error strength and classified as a "fifth memory device".

Here, the first counter-error operation may include at least one of the following operations.

The first operation selects the area in which errors occurred in the memory device classified as a "fourth memory device" and blocks access to it. For example, the counter-operation unit 1025 may select a specific block, a specific word line, or a specific bit line in the first memory device 1501 classified as a "fourth memory device" and block access to it. At this time, the counter-operation unit 1025 may copy the data stored in the specific block, word line, or bit line to be blocked into an "other area" and then perform the access blocking operation. The "other area" may be another normal block, word line, or bit line of the first memory device 1501. Alternatively, the "other area" may be another normal block, word line, or bit line included in one of the other memory devices 1502 to 1508 rather than in the first memory device 1501. For reference, the data stored in the specific block, word line, or bit line to be blocked can be stored in the "other area" normally because the specific block, word line, or bit line was selected for access blocking only in anticipation that an unrecoverable error is likely to occur in the near future; at the present time it still operates normally or produces only recoverable errors.
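The copy-then-block sequence of the first operation can be illustrated with a toy model. This is a sketch only: the `MemoryDevice` class, the region names, and the `copy_then_block` function are hypothetical and do not represent an actual device interface.

```python
class MemoryDevice:
    """Toy model: named regions (a block, word line, or bit line) holding data."""

    def __init__(self, regions):
        self.regions = dict(regions)
        self.blocked = set()

    def read(self, region):
        if region in self.blocked:
            raise PermissionError(f"access to {region} is blocked")
        return self.regions[region]

    def write(self, region, data):
        if region in self.blocked:
            raise PermissionError(f"access to {region} is blocked")
        self.regions[region] = data


def copy_then_block(source, region, target, spare_region):
    """Copy the data of a suspect region to another area, then block the region.

    The copy comes first because the region is still readable (or its errors
    are still correctable) at the time it is selected.
    """
    target.write(spare_region, source.read(region))
    source.blocked.add(region)


dev_1501 = MemoryDevice({"WL7": b"payload", "WL8": b""})
dev_1502 = MemoryDevice({"WL0": b""})
copy_then_block(dev_1501, "WL7", dev_1502, "WL0")
print(dev_1502.read("WL0"))
```

Passing the same device as both `source` and `target` models the case in which the "other area" lies within the same memory device.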

The second operation selects the area in which errors occurred in the memory device classified as a "fourth memory device" and repairs it. For example, the counter-operation unit 1025 may repair a specific block, a specific word line, or a specific bit line in the first memory device 1501 classified as a "fourth memory device" with a normal redundancy block, redundancy word line, or redundancy bit line. At this time, the host 102 may suspend access to the first memory device 1501 to be repaired until the repair operation in that device is complete. The first memory device 1501 to be repaired may copy the data stored in the specific block, word line, or bit line to be repaired into its internal information storage area PA1 and then perform the repair operation. After the repair operation is complete, the first memory device 1501 may restore the data copied to the information storage area PA1 into the repaired redundancy block, redundancy word line, or redundancy bit line. For reference, the data stored in the specific block, word line, or bit line to be repaired can be copied to the internal information storage area PA1 normally because the specific block, word line, or bit line was selected for repair only in anticipation that an unrecoverable error is likely to occur in the near future; at the present time it still operates normally or produces only recoverable errors.

The third operation selects and disables the area in which an error has occurred in a memory device classified as a "fourth memory device". For example, the corresponding operation unit 1025 can disable a specific block, word line, or bit line in the first memory device 1501, classified as a "fourth memory device". In this case, the first memory device 1501 can copy the data stored in the specific block, word line, or bit line to be disabled, store it in an "other area", and notify the host 102 that the data has moved to the "other area". The "other area" can be another normal block, word line, or bit line of the first memory device 1501. It can also be a normal block, word line, or bit line included in one of the other memory devices 1502 to 1508. For reference, the data can be stored in the "other area" normally because the specific block, word line, or bit line was selected for disabling only in anticipation that an unrecoverable error is likely to occur in the near future; at present it either operates normally or produces only recoverable errors.

The second error handling operation can include an error recovery operation: when an error occurs during an access operation on a memory device classified as a "fifth memory device", the host ECC 1026 applies an error correction code to the codeword-unit data in which the error occurred.

FIGS. 1B and 2 show how the error grade for each of the memory devices 1501 to 1508 is determined by analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO.

Specifically, when an error occurs in one of the memory devices 1501 to 1508 during an access operation, for example a data read/write operation, and the error is recovered by the memory ECC (ECC1 to ECC8) provided inside the device, the memory ECC can generate log information LOG_INFO for the data whose error was recovered.

Likewise, when an error is recovered by the internal system ECC 1306 during an access operation on any of the memory devices 1501 to 1508, for example a data read operation, the controller 130 can generate error correction information ERR_CO_INFO for the data whose error the system ECC 1306 recovered.

The controller 130 can then collect and analyze the log information LOG_INFO and the error correction information ERR_CO_INFO. That is, the controller 130 can determine the error grade and the error strength for each of the memory devices 1501 to 1508.
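As a rough illustration of the collection step, the sketch below merges the two information streams into a per-device count of recovered errors. The record layout (`ErrorRecord`, `device_id`, `source`) and the function name are assumptions made for this example; the text does not specify how LOG_INFO or ERR_CO_INFO is encoded, and the sample counts are invented.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class ErrorRecord(NamedTuple):
    """Hypothetical record layout; the text does not define one."""
    device_id: int  # e.g. 1501 .. 1508
    source: str     # "LOG_INFO" (memory ECC) or "ERR_CO_INFO" (system ECC)


def count_recovered_errors(log_info: Iterable[ErrorRecord],
                           err_co_info: Iterable[ErrorRecord]) -> Counter:
    """Merge both record streams into a per-device count of recovered errors."""
    counts: Counter = Counter()
    for record in list(log_info) + list(err_co_info):
        counts[record.device_id] += 1
    return counts


# Invented sample data: 7 errors recovered inside device 1501, 5 by the system ECC.
log_info = [ErrorRecord(1501, "LOG_INFO")] * 7 + [ErrorRecord(1502, "LOG_INFO")] * 2
err_co_info = [ErrorRecord(1501, "ERR_CO_INFO")] * 5
per_device = count_recovered_errors(log_info, err_co_info)
print(per_device[1501], per_device[1502])  # 12 2
```

A `Counter` returns 0 for devices with no records, so devices that never logged an error need no special handling.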

Meanwhile, the first error analysis unit 1303 provided in the controller 130 analyzes the log information LOG_INFO and the error correction information ERR_CO_INFO collected by the error collecting unit 1301, identifies those of the memory devices 1501 to 1508 in which the number of errors is equal to or greater than a first reference number, and classifies each such device as a "first memory device" (S10).

For example, assume that the number of errors recovered by the memory ECC (ECC1 to ECC8) or the system ECC 1306 while accessing the first memory device 1501 is 12, that the number of errors so recovered while accessing each of the remaining memory devices 1502 to 1508 is less than 10, and that the first reference number is 10. In this case, the first error analysis unit 1303 classifies the first memory device 1501 as a "first memory device" and need not determine an error grade for any of the remaining memory devices 1502 to 1508.
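The screening in S10 amounts to a threshold comparison per device. The following minimal sketch reproduces the example above; the function name is hypothetical, and the per-device counts for devices 1502 to 1508 are invented (the text only says they are below 10).

```python
from typing import Dict, List

FIRST_REFERENCE_COUNT = 10  # value assumed in the example above


def select_first_memory_devices(recovered_errors: Dict[int, int],
                                threshold: int = FIRST_REFERENCE_COUNT) -> List[int]:
    """Return the devices whose recovered-error count meets the threshold (S10)."""
    return sorted(dev for dev, count in recovered_errors.items() if count >= threshold)


# Device 1501 has 12 recovered errors; the other counts are made up, all below 10.
counts = {1501: 12, 1502: 3, 1503: 0, 1504: 7, 1505: 9, 1506: 1, 1507: 4, 1508: 2}
first_memory_devices = select_first_memory_devices(counts)
print(first_memory_devices)  # [1501]
```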

Specifically, the first error analysis unit 1303 can analyze the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the types of errors that occurred in a memory device classified as a "first memory device" (YES in S10). The first error analysis unit 1303 can divide these errors into word-line-unit errors (S20), single-bit-unit errors (S30), bit-line-unit errors (S40), and other errors (S50).

If the first error analysis unit 1303 finds that the error in a memory device classified as a "first memory device" is a word-line-unit error (S20), it can count the number of errors that occurred on the same word line in that device (S60). If the count is equal to or greater than a second reference number (YES in S70), the memory device is assigned the first error grade and classified as a "second memory device" (S90). If the count is less than the second reference number (NO in S70), the memory device is assigned the second error grade and classified as a "third memory device" (S80).

If the first error analysis unit 1303 finds that a memory device classified as a "first memory device" contains only single-bit-unit errors (YES in S30), bit-line-unit errors (YES in S40), and other errors (YES in S50), the memory device is assigned the second error grade and classified as a "third memory device" (S80).
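The grading flow S20 to S90 can be sketched as below. This is only an illustration under stated assumptions: the record format (an error type paired with a word line index), the names `ErrorType` and `grade_device`, and the second reference number of 4 are all hypothetical, and the sketch takes the busiest word line as the count compared in S70.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, Optional, Tuple


class ErrorType(Enum):
    WORD_LINE = "word_line"    # S20
    SINGLE_BIT = "single_bit"  # S30
    BIT_LINE = "bit_line"      # S40
    OTHER = "other"            # S50


FIRST_GRADE = 1   # device becomes a "second memory device" (S90)
SECOND_GRADE = 2  # device becomes a "third memory device" (S80)


def grade_device(errors: Iterable[Tuple[ErrorType, Optional[int]]],
                 second_reference_count: int) -> int:
    """Grade a "first memory device" from (error type, word line index) records."""
    # S60: count errors per word line, looking at word-line-unit errors only.
    per_word_line = Counter(wl for kind, wl in errors if kind is ErrorType.WORD_LINE)
    worst = max(per_word_line.values(), default=0)
    # S70: compare the busiest word line against the second reference number.
    return FIRST_GRADE if worst >= second_reference_count else SECOND_GRADE


word_line_heavy = [(ErrorType.WORD_LINE, 17)] * 5 + [(ErrorType.SINGLE_BIT, None)]
scattered = [(ErrorType.SINGLE_BIT, None), (ErrorType.BIT_LINE, None), (ErrorType.OTHER, None)]
print(grade_device(word_line_heavy, second_reference_count=4))  # 1
print(grade_device(scattered, second_reference_count=4))        # 2
```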

To summarize with an example, assume that the errors in the first memory device 1501, classified as a "first memory device", are word-line-unit errors and that the number of errors on the same word line is equal to or greater than the second reference number. The first error analysis unit 1303 then assigns the first error grade to the first memory device 1501 and classifies it as a "second memory device".

Referring to FIGS. 1B, 2, and 3, the second error analysis unit 1304 included in the controller 130 can select some of the memory devices 1501 to 1508 according to the error grade that the first error analysis unit 1303 determined for each of them. For the selected memory devices, the second error analysis unit 1304 determines the error strength by additionally analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the form and number of errors. For the remaining memory devices, it determines the error strength so as to correspond to the error grade determined by the first error analysis unit 1303.

Specifically, the first error analysis unit 1303 has already divided the memory devices 1501 to 1508 into memory devices with no error grade determined, "second memory devices" assigned the first error grade, and "third memory devices" assigned the second error grade.

The second error analysis unit 1304 can then check whether the error grade determined by the first error analysis unit 1303 is the first error grade (K10).

If the check in K10 shows that the first error analysis unit 1303 did not assign the first error grade to a memory device (NO in K10), that is, for memory devices with no error grade determined and for "third memory devices" assigned the second error grade, the second error analysis unit 1304 can assign the second error strength and classify the device as a "fifth memory device" (K70). The corresponding operation unit 1305 can then perform the second error handling operation on a memory device assigned the second error strength and classified as a "fifth memory device" (K80).

If the check in K10 shows that the first error analysis unit 1303 assigned the first error grade to a memory device (YES in K10), that is, for a "second memory device", the second error analysis unit 1304 can determine the error strength after additionally analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the form and number of errors. Specifically, the second error analysis unit 1304 can additionally analyze the log information LOG_INFO and the error correction information ERR_CO_INFO for the "second memory device" to check whether the error that occurred in it spans (lies across) a third reference number or more of codeword units (K30).

The meaning of checking for an error that "spans" (lies across) codeword units was explained above in the description of FIGS. 1A and 4 to 5B, so a more detailed description is omitted here.

Assuming that the third reference number is 2, the operation (K30) in which the second error analysis unit 1304 additionally analyzes the log information LOG_INFO for a "second memory device" assigned the first error grade, and checks whether the error spans the third reference number or more of codeword units, can be an operation of checking whether the error spans two codeword units, as illustrated in FIG. 5A, or is contained in a single codeword unit, as illustrated in FIG. 5B.

If the check in K30 shows that the error in the "second memory device" spans the third reference number or more of codeword units (YES in K30), the second error analysis unit 1304 can check whether the total number of errors spanning those codewords is equal to or greater than a fourth reference number (K40). For example, assuming that the fourth reference number is 8, the total number of error bits spanning the two codeword units (Codeword0, Codeword1) illustrated in FIG. 5A is 16, which is equal to or greater than the fourth reference number of 8.
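The K30 and K40 checks can be illustrated with a small calculation over error-bit positions. The sketch below is not taken from the source: the codeword size of 128 bits, the function name `codeword_span`, and the bit layout (8 error bits at the end of Codeword0 followed by 8 at the start of Codeword1) are assumptions chosen only so that the totals match the 2-codeword, 16-bit example above.

```python
from typing import Iterable, Set, Tuple

CODEWORD_BITS = 128  # hypothetical codeword size; the text does not give one


def codeword_span(error_bit_positions: Iterable[int],
                  codeword_bits: int = CODEWORD_BITS) -> Tuple[Set[int], int]:
    """Return the codeword indices touched by the error and the error-bit total."""
    positions = set(error_bit_positions)
    codewords = {pos // codeword_bits for pos in positions}
    return codewords, len(positions)


THIRD_REFERENCE_COUNT = 2   # value assumed in the text
FOURTH_REFERENCE_COUNT = 8  # value assumed in the text

# 16 consecutive error bits straddling the Codeword0/Codeword1 boundary (made-up layout).
codewords, total_bits = codeword_span(range(120, 136))
spans_enough = len(codewords) >= THIRD_REFERENCE_COUNT  # K30
bits_enough = total_bits >= FOURTH_REFERENCE_COUNT      # K40
print(sorted(codewords), total_bits, spans_enough, bits_enough)  # [0, 1] 16 True True
```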

If the check in K40 shows that the error in the "second memory device" spans the third reference number or more of codeword units (YES in K30) and that the total number of errors spanning those codewords is equal to or greater than the fourth reference number (YES in K40), the second error analysis unit 1304 can assign the first error strength to that "second memory device" and classify it as a "fourth memory device" (K50). The corresponding operation unit 1305 can then perform the first error handling operation on a memory device assigned the first error strength and classified as a "fourth memory device" (K60).

If the check in K30 shows that the error in the "second memory device" is contained in fewer than the third reference number of codeword units (NO in K30), the second error analysis unit 1304 can assign the second error strength to that "second memory device" and classify it as a "fifth memory device" (K70). The corresponding operation unit 1305 can then perform the second error handling operation on a memory device assigned the second error strength and classified as a "fifth memory device" (K80).
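Taken together, steps K10 to K70 form a short decision tree, sketched below. The function and constant names are hypothetical, and the default reference numbers (2 and 8) are the values assumed in the text. Note one assumption that goes beyond this passage: the outcome for NO in K40 is not stated here, so the sketch treats it as the second error strength.

```python
from typing import Optional

FIRST_STRENGTH = 1   # "fourth memory device": first error handling operation (K50, K60)
SECOND_STRENGTH = 2  # "fifth memory device": second error handling operation (K70, K80)


def error_strength(grade: Optional[int], codewords_spanned: int, error_bits: int,
                   third_ref: int = 2, fourth_ref: int = 8) -> int:
    """Map an error grade plus the error's form and size to an error strength."""
    if grade != 1:                     # K10 NO: no grade, or second error grade
        return SECOND_STRENGTH
    if codewords_spanned < third_ref:  # K30 NO: error stays within too few codewords
        return SECOND_STRENGTH
    if error_bits >= fourth_ref:       # K40 YES
        return FIRST_STRENGTH
    return SECOND_STRENGTH             # K40 NO: assumption, not stated in this passage


print(error_strength(grade=1, codewords_spanned=2, error_bits=16))    # 1
print(error_strength(grade=1, codewords_spanned=1, error_bits=16))    # 2
print(error_strength(grade=None, codewords_spanned=0, error_bits=0))  # 2
```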

The operation of the second error analysis unit 1304 is described below with an example.

First, as in the example described with reference to FIG. 2, assume that the first error analysis unit 1303 assigned the first error grade to the first memory device 1501 and classified it as a "second memory device", determined no error grade for the remaining memory devices 1502 to 1508, and classified no memory device as a "third memory device" with the second error grade.

For a memory device that the first error analysis unit 1303 did not assign the first error grade (NO in K10), that is, a memory device with no error grade determined or a "third memory device" assigned the second error grade, the second error analysis unit 1304 can assign the second error strength and classify the device as a "fifth memory device" (K70). Accordingly, the second error analysis unit 1304 can assign the second error strength to the remaining memory devices 1502 to 1508, for which no error grade was determined, and classify them as "fifth memory devices" (K70).

For a memory device that the first error analysis unit 1303 assigned the first error grade (YES in K10), that is, a "second memory device", the second error analysis unit 1304 can determine the error strength after additionally analyzing the log information LOG_INFO and the error correction information ERR_CO_INFO to identify the form and number of errors. Accordingly, it performs this additional analysis for the first memory device 1501, which was assigned the first error grade and classified as a "second memory device", and then determines its error strength.

Specifically, the second error analysis unit 1304 can additionally analyze the log information LOG_INFO and the error correction information ERR_CO_INFO for the first memory device 1501 to check whether the error spans the third reference number or more of codeword units (K30). Assume that the error in the first memory device 1501 does span the third reference number or more of codeword units (YES in K30). The second error analysis unit 1304 can then check whether the total number of error bits in the errors spanning those codeword units in the first memory device 1501 is equal to or greater than the fourth reference number (K40). Assume that this total is equal to or greater than the fourth reference number (YES in K40). The second error analysis unit 1304 can then assign the first error strength to the first memory device 1501 and classify it as a "fourth memory device" (K50).

Meanwhile, the corresponding operation unit 1305 included in the controller 130 can perform a different error handling operation on each of the memory devices 1501 to 1508 according to the error strength that the second error analysis unit 1304 determined for it.

Specifically, the corresponding operation unit 1305 can perform the first error handling operation on those of the memory devices 1501 to 1508 that the second error analysis unit 1304 assigned the first error strength and classified as "fourth memory devices". It can perform the second error handling operation on those that the second error analysis unit 1304 assigned the second error strength and classified as "fifth memory devices".
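The mapping from error strength to error handling operation is a simple dispatch. The sketch below groups devices by the operation they would receive, using the running example in which only device 1501 has the first error strength; the function name and dictionary keys are made up for illustration.

```python
from typing import Dict, List

FIRST_STRENGTH, SECOND_STRENGTH = 1, 2


def plan_operations(strength_by_device: Dict[int, int]) -> Dict[str, List[int]]:
    """Group device IDs by the error handling operation their strength calls for."""
    plan: Dict[str, List[int]] = {"first_operation": [], "second_operation": []}
    for device_id, strength in sorted(strength_by_device.items()):
        key = "first_operation" if strength == FIRST_STRENGTH else "second_operation"
        plan[key].append(device_id)
    return plan


# Running example: 1501 is a "fourth memory device", 1502 to 1508 are "fifth memory devices".
strengths = {1501: FIRST_STRENGTH, **{dev: SECOND_STRENGTH for dev in range(1502, 1509)}}
plan = plan_operations(strengths)
print(plan["first_operation"])   # [1501]
print(plan["second_operation"])  # [1502, 1503, 1504, 1505, 1506, 1507, 1508]
```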

The first operation selects the area in which an error has occurred in a memory device classified as a "fourth memory device" and blocks access to it. For example, the corresponding operation unit 1305 can select a specific block, word line, or bit line in the first memory device 1501, classified as a "fourth memory device", and block access to it. In this case, the corresponding operation unit 1305 can copy the data stored in the specific block, word line, or bit line targeted for access blocking, store it in an "other area", and then perform the access blocking operation. The "other area" can be another normal block, word line, or bit line of the first memory device 1501. It can also be a normal block, word line, or bit line included in one of the other memory devices 1502 to 1508. For reference, the data can be stored in the "other area" normally because the specific block, word line, or bit line was selected for access blocking only in anticipation that an unrecoverable error is likely to occur in the near future; at present it either operates normally or produces only recoverable errors.

The second operation selects and repairs the area in which an error has occurred in a memory device classified as a "fourth memory device". For example, the corresponding operation unit 1305 can repair a specific block, word line, or bit line in the first memory device 1501, classified as a "fourth memory device", by replacing it with a normal redundancy block, redundancy word line, or redundancy bit line. In this case, the controller 130 can suspend access to the first memory device 1501 until the repair operation on that device is complete. The first memory device 1501 copies the data stored in the specific block, word line, or bit line to be repaired into its internal information storage area PA1 and then performs the repair operation. After the repair operation is complete, the first memory device 1501 can restore the data copied to the information storage area PA1 into the repaired redundancy block, redundancy word line, or redundancy bit line. For reference, the copy into the information storage area PA1 can be performed normally because the specific block, word line, or bit line was selected for repair only in anticipation that an unrecoverable error is likely to occur in the near future; at present it either operates normally or produces only recoverable errors.
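The order of steps in the repair operation (suspend access, copy to PA1, repair, restore, resume) can be modeled with a toy class. Everything in this sketch is illustrative: the class, the region names such as `WL3` and `RWL0`, and the dictionary-based remapping are assumptions, and a real device would perform the repair in hardware rather than by updating a table.

```python
from typing import Dict, Optional


class MemoryDevice:
    """Toy model of one memory device; all names here are illustrative."""

    def __init__(self, regions: Dict[str, bytes]) -> None:
        self.regions = dict(regions)      # region name -> stored data
        self.remap: Dict[str, str] = {}   # repaired region -> redundancy region
        self.pa1: Optional[bytes] = None  # internal information storage area PA1
        self.access_suspended = False

    def read(self, region: str) -> bytes:
        if self.access_suspended:
            raise RuntimeError("access suspended during repair")
        return self.regions[self.remap.get(region, region)]

    def repair(self, target: str, redundancy: str) -> None:
        self.access_suspended = True                 # access is suspended first
        try:
            self.pa1 = self.regions.pop(target)      # 1. copy the data to PA1
            self.remap[target] = redundancy          # 2. repair: map target to redundancy
            self.regions[redundancy] = self.pa1      # 3. restore the data from PA1
        finally:
            self.pa1 = None
            self.access_suspended = False            # access resumes after the repair


device = MemoryDevice({"WL3": b"\x12\x34"})
device.repair("WL3", "RWL0")
print(device.read("WL3") == b"\x12\x34")  # True
```

Reads of the repaired region keep using the original name and are redirected to the redundancy region, which mirrors the idea that the repair is transparent to later accesses.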

The third operation selects and disables the area in which an error has occurred in a memory device classified as a "fourth memory device". For example, the corresponding operation unit 1305 can disable a specific block, word line, or bit line in the first memory device 1501, classified as a "fourth memory device". In this case, the first memory device 1501 can copy the data stored in the specific block, word line, or bit line to be disabled, store it in an "other area", and notify the controller 130 that the data has moved to the "other area". The "other area" can be another normal block, word line, or bit line of the first memory device 1501. It can also be a normal block, word line, or bit line included in one of the other memory devices 1502 to 1508. For reference, the data can be stored in the "other area" normally because the specific block, word line, or bit line was selected for disabling only in anticipation that an unrecoverable error is likely to occur in the near future; at present it either operates normally or produces only recoverable errors.

The second error handling operation can include an error recovery operation for the case where an error occurs during an access operation on a memory device classified as a "fifth memory device": the system ECC 1306 uses an error correction code to recover the codeword-unit data in which the error occurred.

FIGS. 1C and 2 show how the log information LOG_INFO is analyzed to determine the error grade of each of the plurality of memory devices 1501 to 1508.

Specifically, when an error occurs in any of the memory devices 1501 to 1508 during an access operation, for example a data read/write operation, and the error is recovered by the internally provided memory ECC 1516, that memory device can generate log information LOG_INFO for the data whose error was recovered by the memory ECC 1516.

The error collecting unit 1511 and the error analysis unit 1513 can collect and analyze the log information LOG_INFO. That is, they can determine the error grade of each of the memory devices 1501 to 1508.

Specifically, the error analysis unit 1513 can analyze the log information LOG_INFO collected by the error collecting unit 1511, identify those of the memory devices 1501 to 1508 in which the number of errors that occurred is equal to or greater than a first reference number, and classify each such memory device as a "first memory device" (S10).

For example, assume that, among the memory devices 1501 to 1508, 12 errors were recovered by the memory ECC 1516 while the first memory device 1501 was being accessed, that fewer than 10 errors were recovered by the memory ECC 1516 while each of the remaining memory devices 1502 to 1508 was being accessed, and that the first reference number is 10. In this case, the error analysis unit 1513 can classify the first memory device 1501 as a "first memory device" and can leave the error grade of each of the remaining memory devices 1502 to 1508 undetermined.

More specifically, the error analysis unit 1513 can analyze the log information LOG_INFO to confirm the types of errors that occurred in a memory device classified as a "first memory device" (YES in S10). The error analysis unit 1513 can divide these errors into errors that occurred in word line units (S20), errors that occurred in single bit units (S30), errors that occurred in bit line units (S40), and other errors (S50).

When the error analysis unit 1513 finds that an error in a memory device classified as a "first memory device" occurred in word line units (S20), it can count the number of errors that occurred on the same word line in that memory device (S60). If the count is equal to or greater than a second reference number (YES in S70), the memory device can be assigned the first error grade and classified as a "second memory device" (S90). If the count is less than the second reference number (NO in S70), the memory device can be assigned the second error grade and classified as a "third memory device" (S80).

When the error analysis unit 1513 finds that a memory device classified as a "first memory device" contains only errors that occurred in single bit units (YES in S30), errors that occurred in bit line units (YES in S40), and other errors (YES in S50), the memory device can be assigned the second error grade and classified as a "third memory device" (S80).

To summarize with an example, assume that the errors in the first memory device 1501, classified as a "first memory device", occurred in word line units and that the number of errors on the same word line is equal to or greater than the second reference number. The error analysis unit 1513 can then assign the first error grade to the first memory device 1501 and classify it as a "second memory device".
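The grading flow of steps S10 to S90 can be sketched as follows. This is a minimal illustration rather than the implementation described in the patent: the `LogEntry` record, the `classify_device` function, and the value 4 used for the second reference number are assumptions made for the example, while the first reference number of 10 and the 12 logged errors come from the example in the text.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class LogEntry:
    """One corrected error taken from LOG_INFO (hypothetical record layout)."""
    kind: str                       # "wordline", "single_bit", "bitline", or "other"
    wordline: Optional[int] = None  # only used when kind == "wordline"


def classify_device(log: Sequence[LogEntry], first_ref: int,
                    second_ref: int) -> Optional[int]:
    """Return None (no grade), 1 (first error grade) or 2 (second error grade)."""
    # S10: only devices with at least first_ref logged errors are graded.
    if len(log) < first_ref:
        return None
    # S20/S60: count word-line-unit errors per word line.
    per_wordline = Counter(e.wordline for e in log if e.kind == "wordline")
    # S70: does any single word line reach the second reference number?
    if per_wordline and max(per_wordline.values()) >= second_ref:
        return 1  # S90: first error grade ("second memory device")
    return 2      # S80: second error grade ("third memory device")


# Device 1501 from the example: 12 corrected errors, 5 of them on one word line.
log_1501 = [LogEntry("wordline", 7)] * 5 + [LogEntry("single_bit")] * 7
print(classify_device(log_1501, first_ref=10, second_ref=4))  # prints 1
```

A device whose log holds only single-bit, bit-line, or other errors falls through to the second error grade, which matches the S30/S40/S50 branches that end at S80.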

The corresponding operation unit 1515 can then perform a different error handling operation on each of the memory devices 1501 to 1508 according to the error grade that the error analysis unit 1513 determined for that memory device.

Specifically, the corresponding operation unit 1515 can perform the first error handling operation on those of the memory devices 1501 to 1508 to which the error analysis unit 1513 assigned the first error grade and which it classified as "second memory devices". The corresponding operation unit 1515 can perform the second error handling operation on those of the memory devices 1501 to 1508 to which the error analysis unit 1513 assigned the second error grade and which it classified as "third memory devices".

The first operation selects the area in which the error occurred in a memory device classified as a "second memory device" and blocks access to it. For example, the corresponding operation unit 1515 can select a specific block, word line, or bit line in the first memory device 1501 classified as a "second memory device" and block access to it. Before blocking access, the corresponding operation unit 1515 can copy the data stored in the specific block, word line, or bit line to be blocked and store it in "another area". The "other area" can be another normal block, word line, or bit line of the first memory device 1501. It can also be a normal block, word line, or bit line in one of the other memory devices 1502 to 1508. For reference, the data stored in the specific block, word line, or bit line to be blocked can be stored normally in the "other area" because that block, word line, or bit line was selected as an access blocking target only in anticipation that an unrecoverable error is likely to occur in the near future; at present it either operates normally or produces only recoverable errors.

The second operation selects and repairs the area in which the error occurred in a memory device classified as a "second memory device". For example, the corresponding operation unit 1515 can repair a specific block, word line, or bit line in the first memory device 1501 classified as a "second memory device" by replacing it with a normal redundancy block, redundancy word line, or redundancy bit line. Access to the first memory device 1501 may be suspended while the repair operation is performed. The first memory device 1501 can perform the repair operation after copying the data stored in the specific block, word line, or bit line to be repaired to the internal information storage area PA1. After the repair operation is completed, the first memory device 1501 can restore the data copied to the information storage area PA1 to the repaired redundancy block, redundancy word line, or redundancy bit line. For reference, the data stored in the specific block, word line, or bit line to be repaired can be copied normally to the internal information storage area PA1 because that block, word line, or bit line was selected for repair only in anticipation that an unrecoverable error is likely to occur in the near future; at present it either operates normally or produces only recoverable errors.
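The repair sequence described above (copy to PA1, remap to a redundancy resource, restore) can be modelled with a small sketch. The `MemoryDevice` class, its dictionaries, and the `busy` flag are hypothetical stand-ins chosen for illustration; the patent does not specify how the device tracks redundancy word lines or suspends access.

```python
class MemoryDevice:
    """Toy model of one memory device with redundancy word lines (illustrative only)."""

    def __init__(self, rows, spare_rows):
        self.cells = dict(rows)         # physical word line -> stored data
        self.spares = list(spare_rows)  # unused redundancy word lines
        self.remap = {}                 # repaired word line -> redundancy word line
        self.pa1 = {}                   # stand-in for information storage area PA1
        self.busy = False               # True while a repair is in progress

    def read(self, wordline):
        # Access is refused while the repair operation is running.
        if self.busy:
            raise RuntimeError("device is being repaired")
        return self.cells[self.remap.get(wordline, wordline)]

    def repair(self, faulty):
        """Replace word line `faulty` with a redundancy word line, keeping its data."""
        if not self.spares:
            raise RuntimeError("no redundancy word line left")
        self.busy = True
        try:
            # 1. Copy the still-readable data to PA1.
            self.pa1[faulty] = self.cells[self.remap.get(faulty, faulty)]
            # 2. Repair: route the faulty word line to a redundancy word line.
            spare = self.spares.pop(0)
            self.remap[faulty] = spare
            # 3. Restore the staged data into the redundancy word line.
            self.cells[spare] = self.pa1.pop(faulty)
        finally:
            self.busy = False
        return spare


dev = MemoryDevice({0: "a", 1: "b"}, spare_rows=[100])
dev.repair(1)
print(dev.read(1))  # prints b
```

Step 1 only works because the targeted word line is still readable, which is the point the text makes about selecting repair targets before an unrecoverable error occurs.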

The third operation selects and disables the area in which the error occurred in a memory device classified as a "second memory device". For example, the corresponding operation unit 1515 can disable a specific block, word line, or bit line in the first memory device 1501 classified as a "second memory device". In this case, the corresponding operation unit 1515 can copy the data stored in the specific block, word line, or bit line to be disabled and store it in "another area". The "other area" can be another normal block, word line, or bit line of the first memory device 1501. It can also be a normal block, word line, or bit line in one of the other memory devices 1502 to 1508. For reference, the data stored in the specific block, word line, or bit line to be disabled can be stored normally in the "other area" because that block, word line, or bit line was selected as a disable target only in anticipation that an unrecoverable error is likely to occur in the near future; at present it either operates normally or produces only recoverable errors.

The second error handling operation can include an error recovery operation for the case where an error occurs during an access operation on a memory device classified as a "third memory device": the system ECC 1516 uses an error correction code to recover the codeword-unit data in which the error occurred.
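How an error grade maps to an error handling operation can be sketched as a small dispatcher. The text says only that one of the three operations is chosen for a "second memory device"; the rule in `example_policy` and the `state` keys below are hypothetical placeholders, not criteria taken from the patent.

```python
from enum import Enum
from typing import Callable, Optional


class Operation(Enum):
    BLOCK_ACCESS = "block access"
    REPAIR = "repair"
    DISABLE = "disable"
    SYSTEM_ECC = "system ECC recovery"


def example_policy(state: dict) -> Operation:
    """Purely illustrative rule for picking one of the three operations."""
    if state.get("spare_available"):           # assumed flag: a redundancy resource is free
        return Operation.REPAIR
    if state.get("permanent_fault_expected"):  # assumed flag, not from the source
        return Operation.DISABLE
    return Operation.BLOCK_ACCESS


def select_operation(grade: Optional[int], state: dict,
                     policy: Callable[[dict], Operation] = example_policy
                     ) -> Optional[Operation]:
    """Map an error grade to an error handling operation."""
    if grade == 1:   # "second memory device": first error handling operation
        return policy(state)
    if grade == 2:   # "third memory device": second error handling operation
        return Operation.SYSTEM_ECC
    return None      # no grade was determined, so no operation is selected


print(select_operation(1, {"spare_available": True}))  # prints Operation.REPAIR
```

Passing the policy as a parameter keeps the unspecified selection rule separate from the grade-to-operation mapping that the text does describe.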

Claims

A data processing system comprising:
a memory system including a plurality of memory devices, each of which includes a first error correction unit and a plurality of cell array regions in which a plurality of memory cells are connected in array form to a plurality of word lines and a plurality of bit lines; and
a host that includes a second error correction unit, corrects errors in data transferred from the memory system through the second error correction unit, generates error correction information on the error correction operation of the second error correction unit, sets an error strength for each of the plurality of memory devices by using the error correction information and log information generated by each of the plurality of memory devices, and performs an error handling operation on each of the plurality of memory devices according to the error strength,
wherein each of the plurality of memory devices
corrects, through the first error correction unit, errors in access data that occur during an access operation on the plurality of cell array regions, and generates the log information on the error correction operation of the first error correction unit.

The data processing system according to claim 1, wherein each of the plurality of memory devices,
when an error occurs in the access data while the access operation including a read/write operation is being performed, operates the internally included first error correction unit to correct the error, accumulates raw data of the data whose error has been corrected by the first error correction unit in an internal information storage area to generate the log information, and outputs the log information to the host via the memory system at the request of the host.

The data processing system according to claim 2, wherein the host includes:
an error collecting unit that collects the error correction information in real time or at a set time point, and collects the log information from the memory system at the set time point;
a first error analysis unit that analyzes the log information and the error correction information to confirm the number and types of errors that have occurred in each of the plurality of memory devices, and determines an error grade for each of the plurality of memory devices according to the confirmation result;
a second error analysis unit that, according to the error grade determined by the first error analysis unit, determines the error strength for some of the plurality of memory devices by confirming the form and number of errors through additional analysis of the log information and the error correction information, and determines the error strength for the remaining memory devices so that it corresponds to the error grade determined by the first error analysis unit; and
a corresponding operation unit that performs the error handling operation on each of the plurality of memory devices according to the error strength determined by the second error analysis unit.

The data processing system according to claim 3, wherein the first error analysis unit:
classifies, among the plurality of memory devices, a memory device in which the number of errors that occurred internally is equal to or greater than a first reference number as a first memory device;
when the type of error that occurred in the first memory device is a first error that occurred on a word line in a number equal to or greater than a second reference number, classifies the corresponding first memory device as a second memory device having a first error grade; and
when the type of error that occurred in the first memory device is an error of a type other than the first error, classifies the corresponding first memory device as a third memory device having a second error grade.

The data processing system according to claim 4, wherein each of the first and second error correction units performs the error correction operation on data input to and output from each of the plurality of memory devices in units of a code word including an error correction code (ECC), and
the second error analysis unit:
classifies, among the second memory devices, a second memory device in which the errors that occurred span a number of code words equal to or greater than a third reference number and the total number of included errors is equal to or greater than a fourth reference number as a fourth memory device having a first error strength;
classifies, among the second memory devices, a second memory device in which the errors that occurred span a number of code words equal to or greater than the third reference number and the total number of included errors is less than the fourth reference number, or in which the errors span fewer code words than the third reference number, as a fifth memory device having a second error strength; and
assigns the second error strength to the third memory device to classify the third memory device as the fifth memory device.

The data processing system according to claim 5, wherein the corresponding operation unit selects, as the error handling operation and according to the state of the fourth memory device, one of:
an operation of selecting the area in which the error occurred in the fourth memory device and blocking access to it;
an operation of selecting and repairing the area in which the error occurred in the fourth memory device; and
an operation of selecting and disabling the area in which the error occurred in the fourth memory device.

The data processing system according to claim 3, wherein the host selects at least one of:
an operation of designating, as the set time point, a time point that recurs at specific time intervals from the time when power is supplied to the memory system;
an operation of counting the number of errors that occur in the access data during the access operation on the memory system, designating the time point at which the count exceeds a fifth reference number as the set time point each time it is exceeded, and then initializing the count; and
an operation of designating, as the set time point, a time point at which the error correction operation for correcting an error that occurred in the access data during the access operation on the memory system takes a specific time or longer.

A memory system comprising:
a plurality of memory devices, each of which includes a first error correction unit and a plurality of cell array regions in which a plurality of memory cells are connected in array form to a plurality of word lines and a plurality of bit lines, corrects, through the first error correction unit, errors in access data that occur during an access operation on the plurality of cell array regions, and generates log information on the error correction operation of the first error correction unit; and
a controller that includes a second error correction unit, corrects errors in data transferred from the plurality of memory devices through the second error correction unit, generates error correction information on the error correction operation of the second error correction unit, sets an error strength for each of the plurality of memory devices by using the log information and the error correction information, and performs an error handling operation on each of the plurality of memory devices according to the error strength.

The memory system according to claim 8, wherein each of the plurality of memory devices,
when an error occurs in the access data while the access operation including a read/write operation is being performed, operates the internally included first error correction unit to correct the error, accumulates raw data of the data whose error has been corrected by the first error correction unit in an internal information storage area to generate the log information, and outputs the log information to the controller at the request of the controller.

The memory system according to claim 9, wherein the controller includes:
an error collecting unit that collects the error correction information in real time or at a set time point, and collects the log information from each of the plurality of memory devices at the set time point;
a first error analysis unit that analyzes the log information and the error correction information to confirm the number and types of errors that have occurred in each of the plurality of memory devices, and determines an error grade for each of the plurality of memory devices according to the confirmation result;
a second error analysis unit that, according to the error grade determined by the first error analysis unit, determines the error strength for some of the plurality of memory devices by confirming the form and number of errors through additional analysis of the log information and the error correction information, and determines the error strength for the remaining memory devices so that it corresponds to the error grade determined by the first error analysis unit; and
a corresponding operation unit that performs the error handling operation on each of the plurality of memory devices according to the error strength determined by the second error analysis unit.

The memory system according to claim 10, wherein the first error analysis unit:
classifies, among the plurality of memory devices, a memory device in which the number of errors that occurred internally is equal to or greater than a first reference number as a first memory device;
when the type of error that occurred in the first memory device is a first error that occurred on a word line in a number equal to or greater than a second reference number, classifies the corresponding first memory device as a second memory device having a first error grade; and
when the type of error that occurred in the first memory device is an error of a type other than the first error, classifies the corresponding first memory device as a third memory device having a second error grade.

The memory system according to claim 11, wherein each of the first and second error correction units performs the error correction operation on data input to and output from each of the plurality of memory devices in units of a code word including an error correction code (ECC), and
the second error analysis unit:
classifies, among the second memory devices, a second memory device in which the errors that occurred span a number of code words equal to or greater than a third reference number and the total number of included errors is equal to or greater than a fourth reference number as a fourth memory device having a first error strength;
classifies, among the second memory devices, a second memory device in which the errors that occurred span a number of code words equal to or greater than the third reference number and the total number of included errors is less than the fourth reference number, or in which the errors span fewer code words than the third reference number, as a fifth memory device having a second error strength; and
assigns the second error strength to the third memory device to classify the third memory device as the fifth memory device.

The memory system according to claim 12, wherein the corresponding operation unit selects, as the error handling operation and according to the state of the fourth memory device, one of:
an operation of selecting the area in which the error occurred in the fourth memory device and blocking access to it;
an operation of selecting and repairing the area in which the error occurred in the fourth memory device; and
an operation of selecting and disabling the area in which the error occurred in the fourth memory device.

The memory system according to claim 10, wherein the controller selects at least one of:
an operation of designating, as the set time point, a time point that recurs at specific time intervals from the time when power is supplied;
an operation of counting the number of errors that occur in the access data during the access operation on the plurality of memory devices, designating the time point at which the count exceeds a fifth reference number as the set time point each time it is exceeded, and then initializing the count; and
an operation of designating, as the set time point, a time point at which the error correction operation for correcting an error that occurred in the access data during the access operation on the plurality of memory devices takes a specific time or longer.

A method of operating a memory system including a plurality of memory devices, each of which includes an error correction unit and a plurality of cell array regions in which a plurality of memory cells are connected in array form to a plurality of word lines and a plurality of bit lines, the method comprising:
a generation step in which the error correction unit corrects errors in access data that occur during an access operation on each of the plurality of memory devices and log information on the error correction operation of the error correction unit is generated;
an analysis step of setting an error grade for each of the plurality of memory devices by using the log information; and
a response step of performing an error handling operation on each of the plurality of memory devices according to the error grade.

The method of operating a memory system according to claim 15, wherein the generation step includes:
an operation step of operating the error correction unit to correct an error when the error occurs in the access data while the access operation on each of the plurality of memory devices is being performed; and
a step of accumulating raw data of the data whose error has been corrected by the error correction unit in the operation step in an information storage area included in each of the plurality of memory devices to generate the log information.

The method of operating a memory system according to claim 16, wherein the analysis step includes:
a collection step of collecting, at each set time point, the log information stored in the information storage area; and
an error analysis step of analyzing the log information collected in the collection step to confirm the number and types of errors that have occurred in each of the plurality of memory devices, and determining the error grade for each of the plurality of memory devices according to the confirmation result.

The method of operating a memory system according to claim 17, wherein the error analysis step includes:
a step of classifying, among the plurality of memory devices, a memory device in which the number of errors that occurred internally is equal to or greater than a first reference number as a first memory device;
a step of classifying, when the type of error that occurred in the first memory device is a first error that occurred on a word line in a number equal to or greater than a second reference number, the corresponding first memory device as a second memory device having a first error grade; and
a step of classifying, when the type of error that occurred in the first memory device is an error of a type other than the first error, the corresponding first memory device as a third memory device having a second error grade.

The method of operating a memory system according to claim 18, wherein the response step selects, according to the state of the second memory device, and performs as the error handling operation, one of:
an operation of selecting the area in which the error occurred in the second memory device and blocking access to it;
an operation of selecting and repairing the area in which the error occurred in the second memory device; and
an operation of selecting and disabling the area in which the error occurred in the second memory device.

The method of operating a memory system according to claim 17, further including at least one of:
a step of designating, as the set time point, a time point that recurs at specific time intervals from the time when power is supplied;
a step of counting the number of errors that occur in the access data during the access operation on the plurality of memory devices, designating the time point at which the count exceeds a fifth reference number as the set time point each time it is exceeded, and then initializing the count; and
a step of designating, as the set time point, a time point at which the error correction operation for correcting an error that occurred in the access data during the access operation on the plurality of memory devices takes a specific time or longer.