WO2012001788A1 - Multicore processor system, restoration program, and method of restoration - Google Patents

Publication number
WO2012001788A1
WO2012001788A1 PCT/JP2010/061192 JP2010061192W WO2012001788A1 WO 2012001788 A1 WO2012001788 A1 WO 2012001788A1 JP 2010061192 W JP2010061192 W JP 2010061192W WO 2012001788 A1 WO2012001788 A1 WO 2012001788A1
Authority
WO
WIPO (PCT)
Prior art keywords
processor
context
stall
restoration
master
Prior art date
Application number
PCT/JP2010/061192
Other languages
French (fr)
Japanese (ja)
Inventor
浩一郎 山下
宏真 山内
鈴木 貴久
康志 栗原
Original Assignee
富士通株式会社
Priority date
Filing date
Publication date
Application filed by 富士通株式会社
Priority to PCT/JP2010/061192 (WO2012001788A1)
Priority to JP2012522393A (JP5454686B2)
Publication of WO2012001788A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1415: Saving, restoring, recovering or retrying at system level
    • G06F11/1441: Resetting or repowering
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/4401: Bootstrapping
    • G06F9/4405: Initialisation of multiprocessor systems

Definitions

  • the present invention relates to a multicore processor system, a restoration program, and a restoration method that cause a plurality of processors (multicore processors) to execute a program.
  • Conventionally, a technique has been disclosed in which OS (Operating System) initialization data is saved to a write-protected area of the main memory at the initial startup of the OS and, on a restart triggered by failure detection, the OS is restarted from a specific address and the main memory is restored from the saved initialization data.
  • Also disclosed is a technique for saving the contents of the main memory to a storage area of the main memory after the initial startup of the OS and restoring the contents of the storage area to the main memory upon restart (see, for example, Patent Documents 1 and 2 below).
  • An object of the present invention is to provide a multi-core processor system, a restoration program, and a restoration method capable of efficiently realizing a warm start during operation in order to solve the above-described problems caused by the prior art.
  • To solve the above problems and achieve the object, a multi-core processor system, a restoration program, and a restoration method are provided as an example. They detect that a first processor, which executes a master process among the multi-core processors accessing a shared memory, has accessed a context related to the master process in the shared memory. Each time access to the context is detected, they save a combination of difference data before and after the detection for the partition of the context that includes the access destination. When the first processor stalls, a second processor selected from the processors other than the first processor restores the pre-stall context based on the combination of difference data for each partition before the stall, generates the master process that the second processor is to execute, and associates the restored pre-stall context with the generated master process.
  • In the embodiment described below, the master function of the OS is protected by using a cryptographic engine, which is standard equipment in embedded terminals and particularly in many portable terminals, together with a snoop controller mounted in the multi-core processor, so that the system continues operating without a cold start when a failure occurs.
  • FIG. 1 is a block diagram illustrating a system configuration example of a multi-core processor system.
  • a multicore processor system 100 includes a multicore processor 101, a snoop controller 102, a watchdog timer 103, a cryptographic engine 104, and a shared memory 105 that are communicably connected via a bus 106.
  • the multi-core processor 101 is composed of a plurality of CPUs (Central Processing Units) (four in the figure as an example). Each of the CPUs # 0 to # 3 is provided with a cache memory (hereinafter simply referred to as “cache”) C. Each of the CPUs # 0 to # 3 executes the privileged mode program H.
  • the privileged mode program H is, for example, a hypervisor, and is a program that can directly access each of the CPUs # 0 to # 3 by privilege.
  • each of the CPUs # 0 to # 3 executes a kernel K, a master process MP, and a slave process SP.
  • the kernel K is a program that is the core of the OS, and executes memory management, file management, and management of the multi-core processor 101. Further, the kernel K generates a master process MP.
  • the master process MP is a program that manages the slave process SP.
  • the master process MP is generated and executed only by one CPU of the multi-core processor 101 (CPU # 0 as an example in FIG. 1).
  • a CPU that executes the master process MP is referred to as a master CPU.
  • the master CPU is set to CPU # 0, but the master process MP may shift to another of CPU # 1 to CPU # 3 through migration according to the present embodiment or through other existing load distribution.
  • the remaining CPUs other than the master CPU in the multi-core processor 101 are referred to as slave CPUs.
  • the slave process SP is a program executed by the multi-core processor 101.
  • Specific examples of the slave process SP include system processes such as thread generation (scheduling), interrupt processing, and system resource access (driver).
  • Information on these slave processes SP is held by the master process MP, which monitors their resource management using the master context written in the system management area on the shared memory 105.
  • The slave process SP notifies the master process MP of a change in its own state. If the notification to the master process MP goes unanswered, a timeout occurs on the assumption that a resource failure has occurred. Conventionally, the multi-core processor system 100 would then hang or be reset through the watchdog timer 103. In this embodiment, the master process MP is restored instead.
  • a kernel K and a process (including at least the master process MP) in which the master process MP is operating are referred to as a master OS.
  • the kernel K and the slave process SP are referred to as a slave OS.
  • the OS includes a library common to application software (hereinafter simply referred to as “application”) AP, but is omitted here for the sake of simplicity.
  • applications AP operate on the master OS and the slave OS.
  • the snoop controller 102 is connected to the multi-core processor 101 and monitors the cache C of each CPU # 0 to CPU # 3. When a value in a certain cache C is updated, the snoop controller 102 performs a coherent process on another cache C. Specifically, the snoop controller 102 detects all memory accesses that pass through the cache C. Therefore, when the detected memory access is a memory access to the coherent corresponding area in the cache C, the snoop controller 102 executes the above-described coherent processing.
  • The watchdog timer 103 periodically checks whether the multi-core processor 101 is alive and issues a reset signal to any non-responsive CPU among CPUs # 0 to # 3 in order to recover from a stopped or runaway program counter, such as one caused by an illegal segment access.
  • the cryptographic engine 104 is hardware that executes encryption processing of target data and decryption processing of encrypted data.
  • the cryptographic engine 104 is hardware that is provided as standard in the portable terminal.
  • the cryptographic engine 104 is a unit that is installed on the bus 106 to which the shared memory 105 is connected for the purpose of concealing personal data, and performs encryption / decryption.
  • The cryptographic engine 104 supports ciphers such as AES (Advanced Encryption Standard) and DES (Data Encryption Standard) and also provides a lossless data compression function such as Huffman coding.
  • In the following, the encryption and decryption processes in the cryptographic engine 104 are omitted from the description. When encryption is used, the compressed data is encrypted, and the encrypted data is decrypted back into the compressed data and then decompressed.
  • Encryption and decryption need not be executed; compression and decompression alone may be performed.
  • Updated data is compressed with the lossless compression function of the cryptographic engine 104 and backed up as compressed data in the protected area in the shared memory 105.
  • the cryptographic engine 104 calculates a hash value of the compressed data. Thereby, when the data is updated and the compressed data and the hash value thereof are obtained, it is understood that the compressed data is the same before and after the update if it matches the stored hash value. On the other hand, if the hash values do not match, the compressed data differs before and after the update.
  • the cryptographic engine 104 stores the old compressed data before the update and its hash value, and the new compressed data after the update and the hash value in the protected area in the shared memory 105. If the compressed data is the same before and after the update, the new compressed data after the update and its hash value are not saved. As a result, the latest two compressed data and their hash values are always stored. Since the cryptographic engine 104 is hardware, there is no influence on performance.
  • the shared memory 105 is a storage area shared by the multi-core processor 101, and includes a system management area 151, a protection area 152, and a general area 153.
  • the system management area 151 is an area where the master OS, slave OS, and privileged mode program H can read and write.
  • a master context MC related to the master process MP is stored in the system management area 151.
  • The master context MC is information unique to the master process MP, such as the stack pointer and register values managed by the master process MP, the value of the program counter, the types of the generated slave processes SP, and the CPU numbers on which the slave processes SP were generated.
  • The master context MC is divided into a plurality of partitions. Specifically, for example, the data is stored separately for each unit area such as a cache line, a page, or a bank.
  • the protected area 152 is an area for storing the compressed data compressed by the privileged mode program H and its hash value.
  • a double buffer method is adopted.
  • the double buffer method is a method of storing two pieces of compressed data and hash values of data at the same location in time series.
  • Since the master context MC is divided into unit areas, when data of a certain unit area in the master context MC is updated, the compressed data before the update (old compressed data) and its hash value (old hash value), together with the compressed data after the update (new compressed data) and its hash value (new hash value), are stored as the compression information 160.
  • General area 153 is an area where the master OS, slave OS, and application AP store data. Specifically, the master OS and the slave OS store data other than data to be stored in the system management area 151 in the general area 153.
  • the application AP stores the calculation result in the general area 153.
  • FIG. 2 is an explanatory diagram showing an example of updating the compression information 160 in the protection area 152.
  • the master context MC is divided into eight and stored in the system management area 151 as partition data MC1 to MC8.
  • (A) shows the state when the system is started.
  • the partition data MC1 to MC8 are stored as the master context MC.
  • the cryptographic engine 104 compresses the divided data MC1 to MC8 to generate compressed data mc1 to mc8.
  • the cryptographic engine 104 generates hash values h1 to h8 of the compressed data mc1 to mc8.
  • the cryptographic engine 104 stores the generated compressed data mc1 to mc8 and their hash values h1 to h8 in the protected area 152.
  • In (B), it is assumed that partition data MC1 in the master context MC is updated from the state of (A) to become partition data MC11. That is, it is assumed that the memory access destination is an address in the partition data MC1.
  • the cryptographic engine 104 compresses the partition data MC11 according to an instruction from the privileged mode program H to generate compressed data mc11. Further, the cryptographic engine 104 generates a hash value h11 of the compressed data mc11.
  • The hash value h1 is compared with the hash value h11.
  • If h1 ≠ h11, there is a difference between the partition data MC1 and the partition data MC11, and the compressed data mc11 and its hash value h11 are stored.
  • In (B), it is assumed that h1 ≠ h11, so the compressed data mc11 and its hash value h11 are stored. For the partition data MC2 to MC8, which have not been updated this time, the cryptographic engine 104 neither compresses the data nor calculates a hash value.
  • In (C), it is assumed that the partition data MC11 in the master context MC is updated from the state of (B) to become partition data MC12. That is, it is assumed that the memory access destination is an address in the partition data MC12.
  • the cryptographic engine 104 compresses the partition data MC12 according to an instruction from the privileged mode program H to generate compressed data mc12. Further, the cryptographic engine 104 generates a hash value h12 of the compressed data mc12.
  • The hash value h11 is compared with the hash value h12.
  • If h11 ≠ h12, there is a difference between the partition data MC11 and the partition data MC12, and the compressed data mc12 and its hash value h12 are stored.
  • In (C), it is assumed that h11 ≠ h12, so the compressed data mc12 and its hash value h12 are stored. For the partition data MC2 to MC8, which have not been updated this time, the cryptographic engine 104 neither compresses the data nor calculates a hash value.
  • In (D), it is assumed that partition data MC5 of the master context MC is updated from the state of (C), due to a failure, to become partition data MC51. That is, it is assumed that the memory access destination is an address in the partition data MC51.
  • the cryptographic engine 104 compresses the partition data MC51 according to an instruction from the privileged mode program H to generate compressed data mc51. Further, the cryptographic engine 104 generates a hash value h51 of the compressed data mc51.
  • The hash value h5 is compared with the hash value h51. Because the failure has changed the partition data MC5 into the partition data MC51, h5 ≠ h51; there is a difference between the partition data MC5 and MC51, so the compressed data mc51 and its hash value h51 are stored.
  • In this case the master CPU stalls. In accordance with an instruction from the privileged mode program H of a slave CPU, the cryptographic engine 104 therefore takes the compressed data mc5 and mc51 stored in the protected area 152 for the failure location (partition data MC51) and expands mc5, the compressed data from before the faulty update, to generate partition data MC5.
  • In (E), the master context MC at the time of the failure, specifically the partition data MC12, MC2 to MC4, MC51, and MC6 to MC8 of (D), has its faulty partition data MC51 replaced with the expanded partition data MC5. This restores the master context MC (partition data MC12, MC2 to MC8).
  • The restored master context MC is in the state before the failure occurred in (D), that is, the state of (C).
  • The compressed data mc51 and its hash value h51 added in (D) are also deleted from the protected area 152, which returns it to the same state as in (C).
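The double-buffered update and rollback of FIG. 2 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: `zlib` stands in for the lossless compression function of the cryptographic engine 104, SHA-256 stands in for the unspecified hash function, and the class name `ProtectedArea`, its methods, and the placeholder byte strings are hypothetical.

```python
import hashlib
import zlib


class ProtectedArea:
    """Holds at most the two latest (compressed data, hash) pairs per partition."""

    def __init__(self, partitions):
        # State (A): every partition is compressed and hashed once at start-up.
        self.slots = [[self._pack(data)] for data in partitions]

    @staticmethod
    def _pack(data):
        # zlib is an assumed stand-in for the engine's lossless compression.
        compressed = zlib.compress(data)
        return compressed, hashlib.sha256(compressed).digest()

    def on_update(self, index, data):
        """Handle an update of one partition; return True if new data was saved."""
        compressed, digest = self._pack(data)
        history = self.slots[index]
        if digest == history[-1][1]:
            return False  # hashes match: no difference, nothing is saved
        history.append((compressed, digest))
        del history[:-2]  # double buffer: keep only the old and the new entry
        return True

    def roll_back(self, index):
        """Drop the newest entry and return the expanded previous partition data."""
        history = self.slots[index]
        if len(history) < 2:
            raise ValueError("no older compressed data for this partition")
        history.pop()
        return zlib.decompress(history[-1][0])


# Walk through states (A) to (E) with placeholder byte strings as partition data.
context = [f"MC{i}".encode() for i in range(1, 9)]
area = ProtectedArea(context)                 # (A)
for value in (b"MC11", b"MC12"):              # (B), then (C)
    context[0] = value
    area.on_update(0, value)
context[4] = b"MC51"                          # (D): faulty write
area.on_update(4, context[4])
context[4] = area.roll_back(4)                # (E): MC51 replaced by expanded MC5
```

After the last line, `context` again holds the (C) state, and the protected-area entry for the fifth partition contains only the original compressed data, matching the description of (E).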
  • FIG. 3 is a flowchart showing a procedure for updating the compression information 160.
  • the snoop controller 102 detects a memory access passing through the cache C from the kernel K, master process MP, and slave process SP (step S301). Note that the memory access is also performed from the application AP via the kernel K, the master process MP, and the slave process SP.
  • The snoop controller 102 determines whether or not the memory access targets a coherent corresponding area in the cache C (step S302). If it does not (step S302: No), the process returns to step S301. If it does (step S302: Yes), the snoop controller 102 performs cache coherency processing (step S303) and returns to step S301.
  • the privileged mode program H monitors memory access by the snoop controller 102.
  • In step S301, when the snoop controller 102 detects a memory access to target data, the privileged mode program H checks whether it is a memory access to the system management area 151 in the shared memory 105 and keeps waiting if it is not (step S304: No). If the memory access to the target data is a memory access to the system management area 151 (step S304: Yes), the privileged mode program H performs a cache flush of the target data (step S305).
  • the target data written to the cache C is also written to the shared memory 105 by write-through.
  • the target data is written in the non-coherent corresponding area of the cache C and is also written in the master context MC. Then, the non-coherent corresponding area of the cache C is cache flushed.
  • the privileged mode program H activates the cryptographic engine 104 and gives an instruction to compress the target data (step S306).
  • the same processing is performed even when access is made from another program, for example, the application AP, instead of access to the system management area 151 by the master process MP. That is, the privileged mode program H executes the processing of steps S304 to S306 regardless of what program the memory access source is.
  • The cryptographic engine 104 compresses the target data written in the access destination (step S307). For example, as shown in FIG. 2, when the partition data MC1 of the master context MC is the target data, the partition data MC1 is updated to the partition data MC11. Therefore, the cryptographic engine 104 compresses the partition data MC11 into the compressed data mc11.
  • the cryptographic engine 104 calculates a hash value of the compressed data (step S308). Then, the cryptographic engine 104 determines whether or not there is a difference between the compressed data compressed last time and the compressed data compressed this time for the partitioned data that is the access destination (step S309). Specifically, it is determined whether or not the hash value of the old compressed data and the hash value of the new compressed data are the same. If they are the same, there is no difference.
  • If there is no difference (step S309: No), the processing of the cryptographic engine 104 ends.
  • If there is a difference (step S309: Yes), the cryptographic engine 104 stores the new compressed data and its hash value in the protected area 152 as shown in FIG. 2 (step S310).
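The check in step S304 and the hand-off in steps S305 and S306 can be sketched as follows. This is a minimal Python illustration under assumptions: the base address, the partition size, and the callback names `flush_cache` and `request_compression` are hypothetical and do not appear in the source, which only states that the master context is divided into unit areas such as cache lines, pages, or banks.

```python
SYS_MGMT_BASE = 0x8000_0000  # hypothetical base address of the system management area
PARTITION_SIZE = 64          # hypothetical unit area in bytes (e.g. one cache line)
PARTITION_COUNT = 8          # partition data MC1 to MC8, as in FIG. 2


def partition_for(address):
    """Return the 0-based partition index hit by an access, or None if outside."""
    offset = address - SYS_MGMT_BASE
    if 0 <= offset < PARTITION_SIZE * PARTITION_COUNT:
        return offset // PARTITION_SIZE
    return None


def on_memory_access(address, flush_cache, request_compression):
    """Steps S304 to S306 for one detected access; return True if handled."""
    index = partition_for(address)
    if index is None:
        return False              # S304: No (not the system management area)
    flush_cache(index)            # S305: flush the cached target data
    request_compression(index)    # S306: instruct the cryptographic engine
    return True


# Example: an access that lands in the fifth partition (index 4).
events = []
handled = on_memory_access(
    SYS_MGMT_BASE + 4 * PARTITION_SIZE + 3,
    lambda i: events.append(("flush", i)),
    lambda i: events.append(("compress", i)),
)
```

The handler does not inspect which program issued the access, which mirrors the statement that steps S304 to S306 run regardless of the memory access source.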
  • FIG. 4 is a flowchart showing the restoration processing procedure.
  • The watchdog timer 103 monitors the master CPU for a stall (step S401: No).
  • When a stall is detected (step S401: Yes), the privileged mode program H, on receiving notification that an abnormality has been detected in the master CPU, notifies the slave CPUs of an instruction to temporarily stop the slave OS (step S402).
  • the slave CPU that has received the temporary stop instruction temporarily stops the slave OS.
  • The privileged mode program H selects, from among the slave CPUs that have received the instruction to pause the slave OS, a restoration source CPU that restores the stalled CPU (step S403). Specifically, for example, the CPU with the lowest CPU number among the slave CPUs is selected as the restoration source CPU. In the example of FIG. 1, since CPU # 0 is the stall CPU, CPU # 1 becomes the restoration source CPU. Alternatively, the CPU having the lowest load among the slave CPUs may be selected as the restoration source CPU. The CPU load can be obtained by monitoring the slave OS or the privileged mode program H of each slave CPU.
  • The privileged mode program H of the restoration source CPU identifies the partition data that is the failure location in the master context MC (step S404). Specifically, the privileged mode program H of the restoration source CPU does not determine what kind of failure caused the stall, nor does it need to.
  • Partition data of the master context MC is compressed when a memory access is detected, and the new compressed data and its hash value are stored.
  • The privileged mode program H of the restoration source CPU identifies, as the failure location, the partition data that is the compression source of the new compressed data saved at that time.
  • The privileged mode program H of the restoration source CPU causes the cryptographic engine 104 to extract the old compressed data from the new and old compressed data of the partition data identified as the failure location (step S405) and to expand the extracted old compressed data (step S406). For example, as shown in FIG. 2 (D), when the failure location is the partition data MC51, the old compressed data mc5 is extracted instead of the new compressed data mc51, and the extracted compressed data mc5 is expanded into partition data MC5.
  • The privileged mode program H of the restoration source CPU restores the master context MC to the state before the stall by overwriting the failure location of the master context MC with the expanded partition data (step S407).
  • the privileged mode program H of the restoration source CPU updates the process table held by the kernel K in the slave OS that is temporarily stopped by the restoration source CPU (step S408). Specifically, an address specifying the restored master context MC is added to the process table.
  • the process table is stored in the cache C in the restoration source CPU or the system management area.
  • The privileged mode program H of the restoration source CPU releases the suspension only for the slave OS of the restoration source CPU and notifies the kernel K of that slave OS of an instruction to generate the master process MP (step S409).
  • Since the master process MP is generated by the restoration source CPU, it is associated with the address in the updated process table that designates the restored master context MC. In this way, the master process MP is migrated to the restoration source CPU.
  • The privileged mode program H of the restoration source CPU notifies the remaining slave CPUs of an instruction to cancel the suspension of their slave OS (step S410).
  • the operation of the OS is resumed by the slave CPU other than the stall CPU. Further, when the operation of the OS is resumed, the operation of the upper application AP is also resumed.
  • the privileged mode program H of the restoration source CPU reboots the stall CPU (step S411). Thereby, the restoration process ends.
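The selection rule of step S403 and the order of steps S402 to S411 can be summarized in a short Python sketch. It is illustrative only: the function names, the `loads` mapping, and the tuple layout of the plan are assumptions introduced here, and the sketch only lists the steps rather than performing them.

```python
def select_restoration_cpu(slave_cpus, loads=None):
    """Pick the restoration source CPU (step S403)."""
    if loads is None:
        return min(slave_cpus)  # lowest CPU number
    # Lowest load; ties are broken by the lower CPU number (an assumption).
    return min(slave_cpus, key=lambda cpu: (loads[cpu], cpu))


def restoration_plan(all_cpus, stalled_cpu, loads=None):
    """Return the ordered steps S402 to S411 as (step, action, target) tuples."""
    slaves = [cpu for cpu in all_cpus if cpu != stalled_cpu]
    source = select_restoration_cpu(slaves, loads)
    others = [cpu for cpu in slaves if cpu != source]
    return [
        ("S402", "pause slave OS", slaves),
        ("S403", "select restoration source CPU", source),
        ("S404-S407", "restore master context from old compressed data", source),
        ("S408", "add restored master context address to process table", source),
        ("S409", "resume slave OS and generate master process", source),
        ("S410", "resume slave OS", others),
        ("S411", "reboot stalled CPU", stalled_cpu),
    ]


# Example of FIG. 1: CPU # 0 stalls, so CPU # 1 becomes the restoration source.
plan = restoration_plan([0, 1, 2, 3], stalled_cpu=0)
```

Passing a `loads` mapping such as `{1: 0.9, 2: 0.2, 3: 0.5}` switches the selection to the lowest-load variant mentioned in the text, which would pick CPU # 2.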
  • FIG. 5 is a block diagram showing the system state after restoration.
  • the master process MP is migrated to the CPU # 1.
  • CPU # 0, which is the stall CPU, does not execute the OS or the application AP immediately after rebooting. When CPU # 1, which is the master CPU after restoration, instructs it to start the slave OS, the kernel K is started on CPU # 0.
  • the load of the multi-core processor 101 can be balanced by allocating the slave process SP and the application AP by a known load balancing technique.
  • As described above, when the master CPU stalls, the master context MC is restored to the state before the stall and the master process MP can be migrated to a slave CPU. Therefore, it is not necessary to cold start the entire system, and a warm start is possible. Further, a warm start is likewise possible not only for a failure of the master process MP but also when the master context MC is rewritten by an illegal or abnormal operation of the application AP.
  • Since the multi-core processor system 100 has a large number of CPUs, the chip area is inevitably large. The chip is therefore more exposed to gamma rays, and bit inversion may occur in a CPU. Even when a value in the master context MC is changed by such bit inversion, the stall of the master CPU is detected by the watchdog timer 103, so a warm start through fast migration of the master process MP can be realized.
  • the master context MC can be restored by storing the old and new compressed data and the hash value thereof only for the updated partition data. Therefore, since the size of the protection area 152 can be reduced, the system management area 151 and the general area 153 can be increased accordingly.
  • When the multi-core processor system 100 is mounted on a mobile terminal, fast startup is more important than for a stationary terminal. Because a warm start follows a master CPU stall, the entire system need not be restarted, which improves the convenience of the mobile terminal.
  • the cryptographic engine 104 is hardware that is provided as standard equipment for communication authentication and data protection in portable terminals such as mobile phones, smartphones, game machines, and tablets. By diverting the encryption engine 104 provided as a standard for restoration, it is not necessary to install hardware dedicated for restoration. Therefore, the number of parts of the mobile terminal can be reduced, and an increase in size due to an increase in functions can be prevented.
  • Reference signs: 100 Multi-core processor system; 101 Multi-core processor; 102 Snoop controller; 103 Watchdog timer; 104 Cryptographic engine; 105 Shared memory; 106 Bus; 151 System management area; 152 Protected area; 153 General area; 160 Compression information; AP Application; C Cache; H Privileged mode program; MC Master context; MP Master process; SP Slave process

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Retry When Errors Occur (AREA)

Abstract

In (D), a cryptographic engine (104) compresses block data (MC51) and generates compressed data (mc51) according to an instruction from a privileged mode program (H). A hash value (h51) is then generated for the compressed data (mc51), and a hash value (h5) is compared with the hash value (h51). Because block data (MC5) has been changed into the block data (MC51) by a fault, the compressed data (mc51) and the hash value (h51) are saved. In this case the master CPU stalls, so, of the compressed data (mc5, mc51), the compressed data (mc5) from before the faulty modification is expanded to generate the block data (MC5). Then, in (E), the faulty block data (MC51) is replaced by the expanded block data (MC5), and a master context MC (block data (MC12), (MC2) to (MC8)) is thereby restored.

Description

マルチコアプロセッサシステム、復元プログラム、および復元方法Multi-core processor system, restoration program, and restoration method
 本発明は、複数のプロセッサ(マルチコアプロセッサ)にプログラムを実行させるマルチコアプロセッサシステム、復元プログラム、および復元方法に関する。 The present invention relates to a multicore processor system, a restoration program, and a restoration method that cause a plurality of processors (multicore processors) to execute a program.
 従来、OS(Operating System)の初期起動時にOSの初期化データを主記憶の書込禁止領域に退避し、障害検出での再起動時にはOSを特定番地から再起動し退避済初期化データを使用して主記憶を復元する技術が開示されている。また、OSの初期起動後、主記憶の内容を主記憶の保存領域に退避し、再起動時に保存領域の内容を主記憶に復元する技術も開示されている(たとえば、下記特許文献1,2を参照。)。 Conventionally, the OS initialization data is saved to the main memory write-protected area at the initial startup of the OS (Operating System), and the OS is restarted from a specific address and the saved initialization data is used when the OS is restarted when a failure is detected. Thus, a technique for restoring the main memory is disclosed. Also disclosed is a technique for saving the contents of the main memory to a storage area of the main memory after the initial startup of the OS and restoring the contents of the storage area to the main memory upon restart (for example, Patent Documents 1 and 2 below). See).
特開平11-24936号公報Japanese Patent Laid-Open No. 11-24936 特開2002-258971号公報JP 2002-258971 A
 しかしながら、上述した従来技術のソフトウェア制御で、マルチコアプロセッサシステムを適用した携帯電話等の組み込みシステムで運用しようとすると、運用中にメモリイメージを時々刻々と保護することとなり、オーバーヘッドの要因になるという問題がある。また、メモリ上に運用状態の多重保存をおこなうと、必要メモリ量の増大化を招いてしまい、大型化により携帯性を損ねてしまうという問題がある。 However, if it is attempted to operate in an embedded system such as a mobile phone to which a multi-core processor system is applied by the above-described conventional software control, the memory image is protected every moment during operation, which causes an overhead factor. There is. In addition, if the operation state is stored on the memory in a multiple manner, the required memory amount increases, and there is a problem that portability is impaired due to an increase in size.
 本発明は、上述した従来技術による問題点を解消するため、運用中のウォームスタートを効率的に実現することができるマルチコアプロセッサシステム、復元プログラム、および復元方法を提供することを目的とする。 An object of the present invention is to provide a multi-core processor system, a restoration program, and a restoration method capable of efficiently realizing a warm start during operation in order to solve the above-described problems caused by the prior art.
 上述した課題を解決し、目的を達成するため、共有メモリにアクセスするマルチコアプロセッサのうちマスタプロセスを実行する第1のプロセッサが前記共有メモリ内の前記マスタプロセスに関するコンテキストにアクセスしたことを検出し、前記コンテキストへのアクセスが検出される都度、前記コンテキストのうちアクセス先を含む区画での検出前後の差分データの組み合わせを保存し、前記第1のプロセッサがストールした場合、前記マルチコアプロセッサのうち前記第1のプロセッサ以外の他のプロセッサ群の中から選ばれた第2のプロセッサにより、前記ストール前の前記区画ごとの前記差分データの組み合わせに基づいて、前記ストール前のコンテキストを復元し、前記第1のプロセッサがストールした場合、前記第2のプロセッサが実行する前記マスタプロセスを、前記第2のプロセッサにより生成し、復元された前記ストール前のコンテキストと生成されたマスタプロセスとを、前記第2のプロセッサにより関連付けるマルチコアプロセッサシステム、復元プログラム、および復元方法を、一例として提供する。 In order to solve the above-described problem and achieve the object, the first processor executing the master process among the multi-core processors accessing the shared memory detects that the context related to the master process in the shared memory is accessed, Each time access to the context is detected, a combination of difference data before and after detection in a section including the access destination in the context is stored, and when the first processor is stalled, the first of the multi-core processors A second processor selected from a group of processors other than one processor restores the context before the stall based on the combination of the difference data for each partition before the stall, and the first processor If the second processor stalls, the second program A multi-core processor system, a restoration program, which creates the master process to be executed by a second processor by the second processor, and associates the restored pre-stall context with the created master process by the second processor; The restoration method is provided as an example.
According to the multi-core processor system, the restoration program, and the restoration method, a warm start during operation can be achieved efficiently.
FIG. 1 is a block diagram showing an example system configuration of the multi-core processor system. FIG. 2 is an explanatory diagram showing an example of updating the compression information in the protected area. FIG. 3 is a flowchart showing the procedure for updating the compression information. FIG. 4 is a flowchart showing the restoration procedure. FIG. 5 is a block diagram showing the system state after restoration.
Embodiments of the multi-core processor system, the restoration program, and the restoration method according to the present invention are described in detail below with reference to the accompanying drawings.
In the embodiment described below, the master function of the OS is protected by using a cryptographic engine, which is standard equipment in embedded terminals and in many portable terminals in particular, and a snoop controller, which is provided in multi-core processors, so that the system continues to operate without a cold start when a failure occurs.
(Example of system configuration)
FIG. 1 is a block diagram showing an example system configuration of the multi-core processor system. In FIG. 1, the multi-core processor system 100 includes a multi-core processor 101, a snoop controller 102, a watchdog timer 103, a cryptographic engine 104, and a shared memory 105, which are communicably connected via a bus 106.
The multi-core processor 101 comprises a plurality of CPUs (Central Processing Units), four in the illustrated example. Each of CPU #0 to CPU #3 is provided with a cache memory (hereinafter simply "cache") C. Each of CPU #0 to CPU #3 also executes a privileged mode program H. The privileged mode program H is, for example, a hypervisor: a program that can, by privilege, directly access each of CPU #0 to CPU #3.
CPU #0 to CPU #3 also execute a kernel K, a master process MP, and slave processes SP. The kernel K is the core program of the OS and performs memory management, file management, and management of the multi-core processor 101. The kernel K also generates the master process MP.
The master process MP is a program that manages the slave processes SP. The master process MP is generated and executed on only one CPU of the multi-core processor 101 (CPU #0 in the example of FIG. 1). The CPU that executes the master process MP is referred to as the master CPU. At initial startup the master CPU is CPU #0, but the master process MP may move to one of the other CPUs, CPU #1 to CPU #3, through the migration of the present embodiment or through other existing load distribution. The CPUs of the multi-core processor 101 other than the master CPU are referred to as slave CPUs.
The slave process SP is a program executed by the multi-core processor 101. Specific examples of the slave process SP include system processes such as thread generation (scheduling), interrupt handling, and system resource access (drivers). As described above, information on these slave processes SP is managed by the master process MP, which monitors the management of these resources using the master context written in the system management area of the shared memory 105.
The slave process SP also notifies the master process MP of changes in its own state. If the master process MP does not respond, a resource failure is assumed to have occurred and a timeout occurs. Conventionally, the multi-core processor system 100 would then be left hung or be reset through the watchdog timer 103; in the present embodiment, the master process MP is restored instead.
The kernel K on which the master process MP runs, together with its processes (including at least the master process MP), is referred to as the master OS. The kernel K together with the slave processes SP is referred to as the slave OS. The OS also includes libraries common to application software (hereinafter simply "applications") AP, but these are omitted here to simplify the description. Various applications AP run on the master OS and the slave OS.
The snoop controller 102 is connected to the multi-core processor 101 and monitors the cache C of each of CPU #0 to CPU #3. When a value in one cache C is updated, the snoop controller 102 performs coherency processing on the other caches C. Specifically, the snoop controller 102 detects every memory access that passes through a cache C. When a detected memory access targets a coherency-managed area in the cache C, the snoop controller 102 executes the coherency processing described above.
The watchdog timer 103 periodically checks that the multi-core processor 101 is alive and issues a reset signal to any of CPU #0 to CPU #3 that does not respond, in order to recover from a program counter halted by an illegal segment access or from a runaway state.
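As a non-limiting illustration, the liveness check might be modeled as follows. The heartbeat timestamps, the timeout parameter, and the function name `unresponsive_cpus` are assumptions made for this sketch; the embodiment does not specify how the watchdog timer 103 determines that a CPU has stopped responding.

```python
def unresponsive_cpus(last_heartbeat, now, timeout):
    """Return the CPU numbers whose last heartbeat is older than `timeout`.

    `last_heartbeat` maps a CPU number to the time (in seconds) at which
    that CPU last responded. This mapping is a hypothetical stand-in for
    whatever signal the watchdog timer actually samples.
    """
    return sorted(cpu for cpu, seen in last_heartbeat.items()
                  if now - seen > timeout)


# CPU #0 last responded 9 seconds ago; the others responded recently.
heartbeats = {0: 1.0, 1: 9.5, 2: 9.0, 3: 9.9}
stalled = unresponsive_cpus(heartbeats, now=10.0, timeout=2.0)
```

In this sketch, each CPU in `stalled` would be a candidate for the reset signal or, in the present embodiment, for the restoration processing.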
The cryptographic engine 104 is hardware that encrypts target data and decrypts encrypted data. The cryptographic engine 104 is standard equipment in portable terminals. It is a unit that is normally placed on the bus 106, to which the shared memory 105 is connected, in order to keep personal data confidential, and that performs encryption and decryption.
Encryption uses AES (Advanced Encryption Standard), DES (Data Encryption Standard), or the like, and the engine normally also provides a lossless data compression function based on, for example, Huffman coding.
In the present embodiment, the encryption and decryption performed by the cryptographic engine 104 are omitted from the description, but it is assumed that compressed data is encrypted and that encrypted data is decrypted into compressed data and then decompressed. Alternatively, only compression and decompression may be performed, without encryption and decryption.
During operation, when the snoop controller 102 detects a memory rewrite, or when the privileged mode program H detects a process change (an update of the system management area by the master process MP), the updated data is compressed using the lossless compression function of the cryptographic engine 104 and backed up as compressed data in the protected area in the shared memory 105.
The cryptographic engine 104 also calculates a hash value of the compressed data. When data is updated and new compressed data and its hash value are obtained, a match with the stored hash value indicates that the compressed data is the same before and after the update. If the hash values do not match, the compressed data differs before and after the update.
If the compressed data differs before and after the update, the cryptographic engine 104 stores the old compressed data from before the update and its hash value, together with the new compressed data from after the update and its hash value, in the protected area in the shared memory 105. If the compressed data is identical before and after the update, the new compressed data and its hash value are not stored. As a result, the two most recent compressed data and their hash values are always retained. Because the cryptographic engine 104 is hardware, this processing does not affect performance.
The shared memory 105 is a storage area shared by the multi-core processor 101 and has a system management area 151, a protected area 152, and a general area 153. The system management area 151 can be read and written by the master OS, the slave OS, and the privileged mode program H. The system management area 151 stores the master context MC related to the master process MP.
The master context MC is information specific to the master process MP, such as the stack pointer and register values managed by the master process MP, the program counter value, the types of the slave processes SP it has generated, and the numbers of the CPUs on which those slave processes SP were generated. The master context MC is divided into a plurality of partitions. Specifically, it is stored divided into unit areas such as cache lines, pages, or banks.
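By way of illustration only, the division into unit areas can be sketched as an address-to-partition mapping. The 64-byte unit size (a cache-line-sized unit), the base address, and the function names below are assumptions chosen for the sketch and do not appear in the embodiment.

```python
from typing import List

UNIT_SIZE = 64  # assumed unit area size in bytes (e.g. one cache line)


def partition_index(base: int, address: int, unit_size: int = UNIT_SIZE) -> int:
    """Return the index of the partition that contains `address`.

    `base` is the (hypothetical) start address of the master context.
    """
    if address < base:
        raise ValueError("address lies below the master context")
    return (address - base) // unit_size


def split_context(context: bytes, unit_size: int = UNIT_SIZE) -> List[bytes]:
    """Split the master context into consecutive unit areas."""
    return [context[i:i + unit_size] for i in range(0, len(context), unit_size)]


# A 512-byte context with 64-byte units yields eight partitions.
partitions = split_context(bytes(512))
index = partition_index(base=0x1000, address=0x1040)
```

With such a mapping, a detected memory access can be attributed to exactly one partition, so only that partition needs to be compressed and compared.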
The protected area 152 stores the compressed data produced at the instruction of the privileged mode program H, together with its hash values. The present embodiment adopts a double-buffer scheme, in which two generations of compressed data and hash values are kept in time series for the data at the same location.
Because the master context MC is divided into unit areas, when the data of a unit area in the master context MC is updated, the compressed data from before the update (old compressed data) and its hash value (old hash value), and the compressed data from after the update (new compressed data) and its hash value (new hash value), are stored as compression information 160.
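A minimal sketch of one double-buffer entry of the compression information 160 is given below. It rests on stated assumptions: `zlib` (DEFLATE) stands in for the lossless compression of the cryptographic engine 104, SHA-256 stands in for the unspecified hash function, and the names `Generation` and `DoubleBuffer` are hypothetical.

```python
import hashlib
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Generation:
    compressed: bytes  # compressed partition data
    digest: bytes      # hash value of the compressed data


def make_generation(data: bytes) -> Generation:
    compressed = zlib.compress(data)  # stand-in for the engine's compression
    return Generation(compressed, hashlib.sha256(compressed).digest())


class DoubleBuffer:
    """Keeps the two most recent distinct generations of one partition."""

    def __init__(self) -> None:
        self.old: Optional[Generation] = None
        self.new: Optional[Generation] = None

    def update(self, data: bytes) -> bool:
        """Store `data` if it differs from the newest generation.

        Returns True when a new generation was stored, False when the hash
        values matched and nothing was stored.
        """
        candidate = make_generation(data)
        if self.new is not None and self.new.digest == candidate.digest:
            return False
        self.old, self.new = self.new, candidate
        return True


entry = DoubleBuffer()
entry.update(b"MC1")    # first generation
entry.update(b"MC1")    # same content: not stored again
entry.update(b"MC11")   # differing content: MC1 becomes the old generation
```

Comparing hash values of the compressed data, rather than the data itself, mirrors the comparison described for the cryptographic engine 104; a real implementation would also have to consider hash collisions, which this sketch ignores.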
The general area 153 is where the master OS, the slave OS, and the applications AP store data. Specifically, the master OS and the slave OS store in the general area 153 any data other than the data to be stored in the system management area 151. The applications AP store their computation results and the like in the general area 153.
(Example of updating the protected area 152)
FIG. 2 is an explanatory diagram showing an example of updating the compression information 160 in the protected area 152. In FIG. 2, as an example, the master context MC is divided into eight parts and stored in the system management area 151 as partition data MC1 to MC8.
(A) shows the state at system startup. At system startup, the entire storage area of the master context MC is accessed, so a difference arises in the entire storage area. The partition data MC1 to MC8 are therefore stored as the master context MC. At the instruction of the privileged mode program H, the cryptographic engine 104 compresses each of the partition data MC1 to MC8 to generate compressed data mc1 to mc8. The cryptographic engine 104 also generates hash values h1 to h8 of the compressed data mc1 to mc8, and stores the compressed data mc1 to mc8 and the hash values h1 to h8 in the protected area 152.
In (B), suppose that, from the state of (A), the partition data MC1 of the master context MC is updated to partition data MC11. That is, the memory access destination is an address within the partition data MC1. In this case, at the instruction of the privileged mode program H, the cryptographic engine 104 compresses the partition data MC11 to generate compressed data mc11. The cryptographic engine 104 also generates a hash value h11 of the compressed data mc11.
The hash value h1 is then compared with the hash value h11. If h1 = h11, the compressed data mc11 and its hash value h11 are not stored. If h1 ≠ h11, the partition data MC1 and MC11 differ, and the compressed data mc11 and its hash value h11 are stored. In (B), h1 ≠ h11 is assumed, so the compressed data mc11 and its hash value h11 are stored. For the partition data MC2 to MC8, which are not updated this time, the cryptographic engine 104 performs neither compression nor hash calculation.
In (C), suppose that, from the state of (B), the partition data MC11 of the master context MC is updated to partition data MC12. That is, the memory access destination is an address within the partition data MC11. In this case, at the instruction of the privileged mode program H, the cryptographic engine 104 compresses the partition data MC12 to generate compressed data mc12. The cryptographic engine 104 also generates a hash value h12 of the compressed data mc12.
The hash value h11 is then compared with the hash value h12. If h11 = h12, the compressed data mc12 and its hash value h12 are not stored. If h11 ≠ h12, the partition data MC11 and MC12 differ, and the compressed data mc12 and its hash value h12 are stored. In (C), h11 ≠ h12 is assumed, so the compressed data mc12 and its hash value h12 are stored. For the partition data MC2 to MC8, which are not updated this time, the cryptographic engine 104 performs neither compression nor hash calculation.
In (D), suppose that, from the state of (C), the partition data MC5 of the master context MC is updated by a failure to partition data MC51. That is, the memory access destination is an address within the partition data MC5. In this case, at the instruction of the privileged mode program H, the cryptographic engine 104 compresses the partition data MC51 to generate compressed data mc51. The cryptographic engine 104 also generates a hash value h51 of the compressed data mc51.
The hash value h5 is then compared with the hash value h51. The failure has changed the partition data MC5 into the partition data MC51. Therefore h5 ≠ h51, the partition data MC5 and MC51 differ, and the compressed data mc51 and its hash value h51 are stored. In this case the master CPU has stalled, so at the instruction of the privileged mode program H on a slave CPU, the cryptographic engine 104 takes, from the compressed data mc5 and mc51 stored in the protected area 152 for the failed partition data MC51, the compressed data mc5 from before the failure and decompresses it to generate the partition data MC5.
Then, in (E), within the master context MC at the time of the failure, that is, the master context MC of (D) consisting of the partition data MC12, MC2 to MC4, MC51, and MC6 to MC8, the failed partition data MC51 is replaced with the decompressed partition data MC5, restoring the master context MC (partition data MC12 and MC2 to MC8). The restored master context MC is in the state before the failure of (D), that is, the state of (C). The compressed data mc51 and its hash value h51 stored in (D) are also deleted from the protected area 152, which is thereby restored to the same state as in (C).
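The sequence (A) to (E) can be traced with the following sketch, offered as an illustration under stated assumptions rather than as the implementation of the embodiment. Byte strings such as `b"MC1"` stand in for partition contents, `zlib` stands in for the compression of the cryptographic engine 104, SHA-256 stands in for the unspecified hash function, and the helper names are hypothetical.

```python
import hashlib
import zlib


def snapshot(data):
    """Compress a partition and hash the compressed bytes."""
    compressed = zlib.compress(data)
    return compressed, hashlib.sha256(compressed).digest()


# (A) startup: partition data MC1..MC8, each with a single stored generation.
context = [f"MC{i}".encode() for i in range(1, 9)]
protected = [[None, snapshot(data)] for data in context]  # [old, new]


def write(index, data):
    """Update one partition and keep its two most recent generations."""
    context[index] = data
    new = snapshot(data)
    if protected[index][1][1] != new[1]:           # hash values differ
        protected[index] = [protected[index][1], new]
    return index


def restore(index):
    """Roll one partition back to its older generation."""
    old, _faulty = protected[index]
    if old is None:
        raise ValueError("no older generation is stored for this partition")
    context[index] = zlib.decompress(old[0])
    protected[index] = [None, old]                 # drop the faulty generation


write(0, b"MC11")              # (B) MC1  -> MC11
write(0, b"MC12")              # (C) MC11 -> MC12
state_c = list(context)        # remember the master context of (C)
failed = write(4, b"MC51")     # (D) corrupting write: MC5 -> MC51
restore(failed)                # (E) MC51 is replaced by the decompressed MC5
```

After `restore`, `context` equals `state_c`, and the entry for partition 5 again holds only mc5 and h5, which corresponds to the statement that the protected area returns to the state of (C).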
(Update processing of compressed data and hash values)
FIG. 3 is a flowchart showing the procedure for updating the compression information 160. First, the snoop controller 102 detects a memory access from the kernel K, the master process MP, or a slave process SP that passes through a cache C (step S301). Memory accesses are also made by the applications AP via the kernel K, the master process MP, and the slave processes SP.
Next, on detecting a memory access, the snoop controller 102 determines whether the memory access targets a coherency-managed area in the cache C (step S302). If it does not (step S302: No), the process returns to step S301. If it does (step S302: Yes), the snoop controller 102 performs cache coherency processing (step S303) and the process returns to step S301.
The privileged mode program H monitors the memory accesses detected by the snoop controller 102. When the snoop controller 102 detects a memory access to target data in step S301, the privileged mode program H waits for a memory access to the system management area 151 in the shared memory 105 (step S304: No). If the memory access to the target data is a memory access to the system management area 151 (step S304: Yes), the privileged mode program H flushes the target data from the cache (step S305).
In this case, the target data written to the cache C is also written to the shared memory 105 by write-through. For example, when the master process MP updates the master context MC in the system management area 151, the target data is written to the area of the cache C that is not coherency-managed and also to the master context MC. That area of the cache C is then flushed. After that, the privileged mode program H starts the cryptographic engine 104 and instructs it to compress the target data (step S306).
The same processing applies when the access to the system management area 151 comes not from the master process MP but from another program, for example an application AP. That is, the privileged mode program H executes steps S304 to S306 regardless of which program is the source of the memory access.
On receiving the compression instruction from the privileged mode program H, the cryptographic engine 104 compresses the target data written at the access destination (step S307). For example, as shown in FIG. 2, when the partition data MC1 of the master context MC is the target data, the partition data MC1 is updated to the partition data MC11, so the cryptographic engine 104 compresses the partition data MC11 into the compressed data mc11.
Next, the cryptographic engine 104 calculates the hash value of the compressed data (step S308). The cryptographic engine 104 then determines whether, for the partition data at the access destination, the previously compressed data and the newly compressed data differ (step S309). Specifically, it determines whether the hash value of the old compressed data and the hash value of the new compressed data are identical. If they are identical, there is no difference.
If they are not identical, there is a difference. If there is no difference (step S309: No), the processing of the cryptographic engine 104 ends. If there is a difference (step S309: Yes), the cryptographic engine 104 stores the new compressed data and its hash value in the protected area 152, as shown in FIG. 2 (step S310).
Thus, in the flowchart of FIG. 3, whenever data in the system management area 151 is updated, the latest compressed data and its hash value and the immediately preceding old compressed data and its hash value are retained for the access destination, regardless of whether the memory access to the system management area 151 is a normal update or an access that corrupts the system management area 151.
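The branch structure of FIG. 3 from step S304 onward can be illustrated by a handler that returns the steps it passed through. This is a sketch under assumptions: the address range of the system management area 151, the 64-byte partition size, and the function name are hypothetical; the cache flush of step S305 is recorded but not modeled; and `zlib` and SHA-256 stand in for the functions of the cryptographic engine 104.

```python
import hashlib
import zlib

SYSTEM_AREA = range(0x1000, 0x1200)  # hypothetical addresses of area 151
UNIT_SIZE = 64                       # hypothetical partition size in bytes

protected = {}  # partition index -> [older, newest] (compressed, digest)


def on_memory_access(address, partition_data):
    """Return the FIG. 3 steps taken for one detected memory access.

    `partition_data` is the content of the accessed partition after the write.
    """
    steps = ["S301"]
    if address not in SYSTEM_AREA:                    # S304: No
        return steps
    steps += ["S304", "S305", "S306"]                 # flush, then instruct
    compressed = zlib.compress(partition_data)        # S307
    digest = hashlib.sha256(compressed).digest()      # S308
    steps += ["S307", "S308", "S309"]
    index = (address - SYSTEM_AREA.start) // UNIT_SIZE
    pair = protected.setdefault(index, [None, None])
    if pair[1] is not None and pair[1][1] == digest:  # S309: No
        return steps
    protected[index] = [pair[1], (compressed, digest)]  # S310
    steps.append("S310")
    return steps


outside = on_memory_access(0x0040, b"application data")
first = on_memory_access(0x1000, b"MC1")
repeat = on_memory_access(0x1010, b"MC1")    # same partition, same content
changed = on_memory_access(0x1000, b"MC11")
```

Note that the handler does not inspect which program issued the access, in line with the statement that steps S304 to S306 are executed regardless of the access source.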
(Restoration processing)
FIG. 4 is a flowchart showing the restoration procedure. First, the watchdog timer 103 monitors the master CPU (step S401: No). When the watchdog timer 103 detects an abnormality caused by a stall or hang of the master CPU (step S401: Yes), the privileged mode program H receives notification of the abnormality and sends the slave CPUs an instruction to pause the slave OS (step S402). Each slave CPU that receives the pause instruction pauses its slave OS.
Because the master CPU has stalled (the stalled master CPU is hereinafter referred to as the "stalled CPU"), the privileged mode program H then selects, from among the slave CPUs that received the instruction to pause the slave OS, a restoration source CPU that performs the restoration for the stalled CPU (step S403). Specifically, for example, the slave CPU with the lowest CPU number is selected as the restoration source CPU. In the example of FIG. 1, CPU #0 is the stalled CPU, so CPU #1 becomes the restoration source CPU. Alternatively, the slave CPU with the lowest load may be selected as the restoration source CPU. The CPU load can be obtained through monitoring by the slave OS or by the privileged mode program H of the slave CPU.
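The two selection policies of step S403 might be expressed as follows. The function name, the `by_load` flag, and the representation of load as a number per CPU are assumptions made for illustration; the embodiment does not define how load is measured.

```python
from typing import Dict


def select_restoration_cpu(slave_loads: Dict[int, float],
                           by_load: bool = False) -> int:
    """Pick the restoration source CPU from the paused slave CPUs.

    `slave_loads` maps each slave CPU number to its (hypothetical) load.
    By default the lowest CPU number wins; with `by_load=True` the least
    loaded CPU wins, with ties broken by the lower CPU number.
    """
    if not slave_loads:
        raise ValueError("no slave CPU is available")
    if by_load:
        return min(slave_loads, key=lambda cpu: (slave_loads[cpu], cpu))
    return min(slave_loads)


# CPU #0 has stalled; CPU #1 to CPU #3 are the slave CPUs.
loads = {1: 0.7, 2: 0.2, 3: 0.5}
by_number = select_restoration_cpu(loads)
by_lowest_load = select_restoration_cpu(loads, by_load=True)
```

Selecting by CPU number needs no load information at the time of the failure, whereas selecting by load may shorten the restoration at the cost of keeping load statistics available to the privileged mode program H.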
Once the restoration source CPU is selected, its privileged mode program H identifies the partition data in the master context MC that is the failure location (step S404). Specifically, the privileged mode program H of the restoration source CPU does not determine whether the data was actually corrupted by the stall, nor does it need to. When the watchdog timer 103 has detected an abnormality, a partition data of the master context MC has, as shown in FIG. 2, been compressed upon detection of the memory access, and the new compressed data and its hash value have been stored.
The privileged mode program H of the restoration source CPU identifies the partition data from which the new compressed data stored at that time was compressed as the failure location. The privileged mode program H of the restoration source CPU causes the cryptographic engine 104 to extract the old compressed data from the old and new compressed data for the identified partition data (step S405) and to decompress the extracted old compressed data (step S406). For example, as shown in (D) of FIG. 2, when the failure location is the partition data MC51, the immediately preceding old compressed data mc5 is extracted rather than the new compressed data mc51.
The privileged mode program H of the restoration source CPU restores the master context MC to its pre-stall state by overwriting the failure location in the master context MC with the decompressed partition data (step S407). For example, as shown in (D) of FIG. 2, the extracted compressed data mc5 is decompressed into the partition data MC5, and the partition data MC51 in the master context MC of (D) of FIG. 2 is overwritten with the decompressed partition data MC5 (see (E) of FIG. 2).
After that, the privileged mode program H of the restoration source CPU updates the process table held by the kernel K of the slave OS that is paused on the restoration source CPU (step S408). Specifically, an address designating the restored master context MC is added to the process table. The process table is stored in the cache C of the restoration source CPU or in the system management area.
The privileged mode program H of the restoration source CPU then releases the pause of the slave OS on the restoration source CPU only, and instructs the kernel K of that slave OS to generate the master process MP (step S409). The master process MP is thereby generated on the restoration source CPU and associated with the address in the updated process table that designates the restored master context MC. In this way, the master process MP is migrated to the restoration source CPU.
After that, the privileged mode program H of the restoration source CPU instructs the remaining live slave CPUs to release the pause of their slave OS (step S410). The OS thereby resumes operation on the slave CPUs other than the stalled CPU, and with the OS running again, the applications AP above it also resume.
 After the migration of the master process MP is complete in this way, the privileged mode program H of the restoration source CPU reboots the stalled CPU (step S411). The restoration process then ends.
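The ordering of steps S408 to S411 can be summarized in a small model. This is a minimal sketch under stated assumptions: the `Cpu` class, its state names, the `"master_context"` key, and the function `migrate_master` are hypothetical names introduced for illustration and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    cpu_id: int
    state: str = "paused"  # "paused", "running", "stalled", or "rebooted"
    process_table: dict = field(default_factory=dict)
    is_master: bool = False


def migrate_master(cpus, restorer_id, stalled_id, context_addr):
    """Apply steps S408 to S411 in order and return the labels of the steps taken."""
    restorer = cpus[restorer_id]
    # S408: record the address of the restored master context in the process table.
    restorer.process_table["master_context"] = context_addr
    # S409: resume only the restoring CPU's OS and create the master process there.
    restorer.state = "running"
    restorer.is_master = True
    # S410: resume the remaining surviving slave CPUs.
    for cpu in cpus.values():
        if cpu.cpu_id not in (restorer_id, stalled_id):
            cpu.state = "running"
    # S411: reboot the stalled CPU last, after the migration is complete.
    stalled = cpus[stalled_id]
    stalled.state = "rebooted"
    stalled.is_master = False
    stalled.process_table.clear()
    return ["S408", "S409", "S410", "S411"]


# Usage: CPU #0 has stalled and CPU #1 takes over as master.
cpus = {0: Cpu(0, state="stalled", is_master=True), 1: Cpu(1), 2: Cpu(2), 3: Cpu(3)}
steps = migrate_master(cpus, restorer_id=1, stalled_id=0, context_addr=0x8000)
```

Resuming the restoring CPU before the other slaves means the master process exists before any slave OS resumes, which matches the order given in the description.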
 FIG. 5 is a block diagram showing the system state after restoration. In FIG. 5, CPU #1 is the restoration source CPU, so the master process MP has been migrated to CPU #1. Immediately after the reboot, CPU #0, the stalled CPU, runs neither an OS nor an application AP; the kernel K starts on CPU #0 when CPU #1, the master CPU after restoration, instructs it to start a slave OS. Thereafter, the load on the multi-core processor 101 can be balanced by assigning slave processes SP and applications AP using a well-known load balancing technique.
 Many techniques for protecting the master process MP and the master CPU have been developed, but all of them save only the initial state. During operation, the overhead of continuously saving the dynamically changing state is large, so such techniques cannot restore the state efficiently.
 In contrast, according to the present embodiment, when an abnormality occurs in the master CPU and it stalls while the multi-core processor system 100 is operating, the master context MC can be restored to its pre-stall state and the master process MP can be migrated to a slave CPU. A cold start of the entire system is therefore unnecessary, and a warm start becomes possible. Moreover, a warm start is likewise possible not only for faults of the master process MP but also when the master context MC has been rewritten by illegal or abnormal operation of an application AP.
 In particular, the multi-core processor system 100 has many CPUs, so its chip area is inevitably large. The chip is therefore more readily exposed to gamma rays, which can cause bit flips inside a CPU. Even when such a bit flip corrupts a value in the master context MC, the watchdog timer 103 detects the stall of the master CPU, so a warm start through fast migration of the master process MP can be achieved.
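As a rough illustration of stall detection, the sketch below models a watchdog that flags a CPU as stalled when it has not reported within a timeout. The internal behavior of the watchdog timer 103 is not detailed in this passage, so the heartbeat scheme, the `Watchdog` class, and its method names are assumptions made for the example.

```python
class Watchdog:
    """Hypothetical heartbeat watchdog: a CPU is stalled if it stops reporting."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_kick = {}  # cpu_id -> time of the most recent heartbeat

    def kick(self, cpu_id, now):
        """Record a heartbeat from a CPU at time `now`."""
        self.last_kick[cpu_id] = now

    def stalled(self, now):
        """Return the sorted ids of CPUs whose last heartbeat is older than the timeout."""
        return sorted(
            cpu_id
            for cpu_id, last in self.last_kick.items()
            if now - last > self.timeout
        )


# Usage: CPU #0 stops reporting after time 0, CPU #1 keeps reporting.
wd = Watchdog(timeout=3)
wd.kick(0, now=0)
wd.kick(1, now=0)
wd.kick(1, now=4)
stalled_cpus = wd.stalled(now=4)  # CPU #0 has been silent for longer than the timeout
```

Time is passed in explicitly so that the example is deterministic; a hardware watchdog would use its own counter.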
 In addition, each time the master context MC is updated, the old and new compressed data and their hash values are saved only for the updated partition data, which is sufficient to restore the master context MC. The protected area 152 can therefore be made small, and the system management area 151 and the general area 153 can be made correspondingly larger.
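The save-on-update behavior can be sketched as follows: a partition's old and new contents are compressed and hashed, and the pair is stored only when the hashes differ. This is an illustrative sketch, not the method of the embodiment: zlib, SHA-256, the dictionary standing in for the protected area 152, and the name `record_update` are assumptions.

```python
import hashlib
import zlib


def record_update(protected, index, old, new):
    """Store the compressed old/new pair for one partition only if its content changed.

    `protected` is a dict standing in for the protected area (assumption).
    Returns True when a pair was stored, False when the hashes matched.
    """
    old_c, new_c = zlib.compress(old), zlib.compress(new)
    old_h = hashlib.sha256(old_c).digest()
    new_h = hashlib.sha256(new_c).digest()
    if old_h == new_h:
        return False  # the access did not change the partition, so nothing is saved
    protected[index] = {"old": (old_c, old_h), "new": (new_c, new_h)}
    return True


# Usage: a read-like access leaves the store empty; a write stores one pair.
protected_area = {}
record_update(protected_area, 4, b"a" * 16, b"a" * 16)
record_update(protected_area, 4, b"a" * 16, b"b" * 16)
```

Because only changed partitions are recorded, the store grows with the number of updated partitions rather than with the size of the whole context.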
 When the multi-core processor system 100 is installed in a mobile terminal, fast startup is inevitably required compared with a stationary terminal. Because a warm start is performed after the master CPU stalls, the entire system need not be restarted, which improves the convenience of the mobile terminal.
 The cryptographic engine 104 is hardware that comes as standard in mobile terminals such as mobile phones, smartphones, game consoles, and tablets for communication authentication and data protection. Reusing this standard cryptographic engine 104 for restoration eliminates the need for restoration-specific hardware. The component count of the mobile terminal can thus be reduced, and the increase in size that accompanies added functions can be prevented.
100 Multi-core processor system
101 Multi-core processor
102 Snoop controller
103 Watchdog timer
104 Cryptographic engine
105 Shared memory
106 Bus
151 System management area
152 Protected area
153 General area
160 Compressed information
AP Application
C Cache
H Privileged mode program
MC Master context
MP Master process
SP Slave process

Claims (7)

  1.  A multi-core processor system comprising:
     detection means for detecting that a first processor, which executes a master process among the processors of a multi-core processor that accesses a shared memory, has accessed a context related to the master process in the shared memory;
     storage means for storing, each time the detection means detects an access to the context, a combination of difference data from before and after the detection for the partition of the context that contains the access destination;
     restoration means for restoring, when the first processor stalls, the pre-stall context based on the pre-stall combination of difference data for each partition, by means of a second processor selected from the group of processors of the multi-core processor other than the first processor;
     generation means for generating, when the first processor stalls, by means of the second processor, the master process to be executed by the second processor; and
     association means for associating, by means of the second processor, the pre-stall context restored by the restoration means with the master process generated by the generation means.
  2.  The multi-core processor system according to claim 1, wherein
     the detection means detects the access to the context related to the master process based on a result of detecting memory accesses to the shared memory by a cache coherency mechanism that performs coherency processing for the caches in the multi-core processor.
  3.  The multi-core processor system according to claim 1 or 2, further comprising:
     compression means for compressing the difference data;
     calculation means for calculating a hash value of the compressed difference data compressed by the compression means;
     determination means for determining whether the hash values, calculated by the calculation means, of the compressed difference data at the same location before and after the access to the context match; and
     decompression means for decompressing the compressed difference data, wherein
     the storage means stores, when the determination means determines a mismatch, the combination of compressed difference data determined to be mismatched, and
     the restoration means restores the pre-stall context based on difference data decompressed by the decompression means from the pre-stall compressed difference data.
  4.  The multi-core processor system according to claim 1 or 2, further comprising
     decision means for deciding the second processor from among the group of other processors based on the load of each processor in the group, wherein
     the restoration means restores, when the first processor stalls, the pre-stall context based on the pre-stall difference data by means of the second processor decided by the decision means.
  5.  The multi-core processor system according to claim 1 or 2, further comprising reboot means for rebooting the first processor, by means of the second processor, after the first processor stalls.
  6.  A restoration program that causes a multi-core processor that accesses a shared memory to execute:
     a detection step of detecting that a first processor, which executes a master process among the processors of the multi-core processor, has accessed a context related to the master process in the shared memory;
     a storage step of storing, each time the detection step detects an access to the context, a combination of difference data from before and after the detection for the partition of the context that contains the access destination;
     a restoration step of restoring, when the first processor stalls, the pre-stall context based on the pre-stall combination of difference data for each partition, by means of a second processor selected from the group of processors of the multi-core processor other than the first processor;
     a generation step of generating, when the first processor stalls, by means of the second processor, the master process to be executed by the second processor; and
     an association step of associating, by means of the second processor, the pre-stall context restored by the restoration step with the master process generated by the generation step.
  7.  A restoration method in which a multi-core processor that accesses a shared memory executes:
     a detection step of detecting that a first processor, which executes a master process among the processors of the multi-core processor, has accessed a context related to the master process in the shared memory;
     a storage step of storing, each time the detection step detects an access to the context, a combination of difference data from before and after the detection for the partition of the context that contains the access destination;
     a restoration step of restoring, when the first processor stalls, the pre-stall context based on the pre-stall combination of difference data for each partition, by means of a second processor selected from the group of processors of the multi-core processor other than the first processor;
     a generation step of generating, when the first processor stalls, by means of the second processor, the master process to be executed by the second processor; and
     an association step of associating, by means of the second processor, the pre-stall context restored by the restoration step with the master process generated by the generation step.
PCT/JP2010/061192 2010-06-30 2010-06-30 Multicore processor system, restoration program, and method of restoration WO2012001788A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/JP2010/061192 WO2012001788A1 (en) 2010-06-30 2010-06-30 Multicore processor system, restoration program, and method of restoration
JP2012522393A JP5454686B2 (en) 2010-06-30 2010-06-30 Multi-core processor system, restoration program, and restoration method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2010/061192 WO2012001788A1 (en) 2010-06-30 2010-06-30 Multicore processor system, restoration program, and method of restoration

Publications (1)

Publication Number Publication Date
WO2012001788A1 (en) 2012-01-05

Family

ID=45401543

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/061192 WO2012001788A1 (en) 2010-06-30 2010-06-30 Multicore processor system, restoration program, and method of restoration

Country Status (2)

Country Link
JP (1) JP5454686B2 (en)
WO (1) WO2012001788A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006268204A (en) * 2005-03-23 2006-10-05 Fujitsu Ltd Shared memory device and processing system
JP2008192028A (en) * 2007-02-07 2008-08-21 Hitachi Ltd Storage control device and data management method
JP2009116699A (en) * 2007-11-07 2009-05-28 Toyota Motor Corp Information processing system
JP2010061518A (en) * 2008-09-05 2010-03-18 Nec Corp Apparatus and method for storing data and program

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113467996A (en) * 2021-07-08 2021-10-01 咪咕音乐有限公司 Database backup method and device, computer equipment and storage medium
CN113467996B (en) * 2021-07-08 2024-04-19 咪咕音乐有限公司 Database backup method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
JPWO2012001788A1 (en) 2013-08-22
JP5454686B2 (en) 2014-03-26

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 10854085; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2012522393; Country of ref document: JP)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 10854085; Country of ref document: EP; Kind code of ref document: A1)