CN103020003A - Multi-core program determinacy replay-facing memory competition recording device and control method thereof - Google Patents
Multi-core program determinacy replay-facing memory competition recording device and control method thereof
- Publication number: CN103020003A
- Application number: CN2012105900267A
- Authority: CN (China)
- Prior art keywords: instruction, memory, memory contention, time stamp, contention
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The invention discloses a memory contention recording device for deterministic replay of multi-core programs and a control method thereof. It relates to memory contention recording devices and aims to solve the high cost of existing memory contention recording methods. The device is a multi-core processor system based on a Cache coherence protocol and records the memory contention that occurs while a multi-core program runs. Instead of directly recording the dependency that corresponds to a memory contention, the recording method records an indirect dependency expressed by the current instructions of the two contending processor cores at the moment the contention occurs, so that a memory contention log composed of indirect dependencies is recorded for each thread. Because indirect dependencies are recorded, only a smaller segment timestamp has to be stored for each instruction, and the instruction count value of the corresponding memory operation instruction does not have to be stored for each memory block. In addition, a segmentation method reduces the memory contention log, and the consumption of hardware resources is greatly reduced. The invention is applicable to multi-core program debugging, intrusion detection and fault tolerance.
Description
Technical field
The present invention relates to a memory contention recording device, and in particular to a memory contention recording device for deterministic replay of multi-core programs and a control method thereof.
Background art
With the popularity of multi-core processors, multi-core programs are applied more and more widely. However, the results of running a multi-core program are nondeterministic, which poses numerous challenges for applications such as program debugging, fault-tolerant processing and intrusion detection, and also restricts the development of parallel computing. Deterministic replay of multi-core programs resolves this nondeterminism by recording the nondeterministic information produced while the multi-core program runs. Memory contention recording is the key technology for achieving deterministic replay of multi-core programs. The memory contention recording methods implemented so far suffer from high performance overhead.
Summary of the invention
The object of the present invention is to solve the high cost of existing methods for recording memory contention. To this end, the invention provides a memory contention recording device for deterministic replay of multi-core programs and a control method thereof.
The memory contention recording device for deterministic replay of multi-core programs according to the present invention comprises a plurality of processor cores and a shared L2 Data Cache; the processor cores and the shared L2 Data Cache exchange data through an interconnection network,
Each processor core further comprises a memory contention recording module MRR, a private L1 Data Cache, a private L1 Instruction Cache, a Cache coherence protocol controller and an instruction pipeline;
The memory contention recording module MRR is used to detect and record memory contention;
The private L1 Data Cache is used to hold the data most recently accessed by the processor core;
The private L1 Instruction Cache is used to hold the instructions most recently accessed by the processor core;
The Cache coherence protocol controller is used to ensure that the copies of shared data held in the Data Caches of all processor cores are consistent;
The instruction pipeline is used to control the order in which the registers process data when the registers of the processor core work simultaneously;
The memory contention recording module comprises a 64-bit instruction counter, a 56-bit segment counter, a segment timestamp vector SCV of (number of processor cores - 1) × 56 bits, and a control logic module; the 56-bit segment counter is used to record the segment timestamp;
The 64-bit instruction counter is used to record the number of instructions;
The segment timestamp vector SCV is used to hold the segment timestamps corresponding to the other processor cores; the number of these segment timestamps is (number of processor cores - 1);
Each Cache block in the private L1 Data Cache further comprises one field, the segment timestamp SC, which is used to record the latest memory contention;
The control logic module is used to control the flow by which the memory contention recording module detects and records memory contention.
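As a rough illustration of the state listed above, the following Python sketch models the registers of one recording module and the extra field in each cache block. The names `CacheBlock`, `MRRState` and `mrr_register_bits`, and the zero initial values, are assumptions made for illustration; only the field widths and the length of the vector come from the text.

```python
from dataclasses import dataclass, field

IC_BITS = 64  # width of the instruction counter, as stated in the text
SC_BITS = 56  # width of the segment counter and of each segment timestamp


@dataclass
class CacheBlock:
    """One private L1 data cache block with the added SC field (hypothetical model)."""
    data: int = 0
    sc: int = 0  # segment timestamp SC; the initial value 0 is an assumption


@dataclass
class MRRState:
    """Registers of one core's recording module (hypothetical model)."""
    core_id: int
    num_cores: int
    ic: int = 0       # instruction counter
    segment: int = 0  # segment counter holding the current segment timestamp
    scv: dict = field(default_factory=dict, init=False)  # other core id -> timestamp

    def __post_init__(self):
        # One SCV entry per other core, i.e. (number of processor cores - 1) entries.
        self.scv = {c: 0 for c in range(self.num_cores) if c != self.core_id}


def mrr_register_bits(num_cores: int) -> int:
    """Counter and vector bits per core, excluding control logic and per-block SC fields."""
    return IC_BITS + SC_BITS + (num_cores - 1) * SC_BITS


state = MRRState(core_id=0, num_cores=4)
print(sorted(state.scv))     # [1, 2, 3]
print(mrr_register_bits(4))  # 288
```

Under these assumptions a four-core system needs 64 + 56 + 3 × 56 = 288 bits of counter and vector storage per core, plus one 56-bit SC field per private L1 data cache block.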
In the control method of the above device, the working process of the control logic module comprises the following steps:
When the committed instruction is a memory operation instruction, updating the value of the 64-bit instruction counter IC and setting the segment timestamp of the memory block accessed by that memory operation;
When a coherence request from a requester is received, detecting through the Cache coherence protocol controller whether a memory contention has occurred;
When the Cache coherence protocol controller detects a memory contention, using the segmentation method to judge whether the memory contention needs to be recorded;
When the memory contention needs to be recorded, adding a conflict record flag bit and the current instruction count value CIC to the coherence response message sent to the requester, ending the old segment, creating a new segment, and updating the segment timestamp corresponding to the requester;
When the requester's Cache coherence protocol controller receives the coherence response message, judging whether the conflict record flag bit is true;
When the conflict record flag bit is true, recording the indirect dependency of the memory contention in the requester's memory contention log.
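The six steps can be sketched as a small software model. This is one possible reading under stated assumptions, not the hardware itself: the class `Core`, its method names and the tuple layout of the log are hypothetical, a forwarded coherence request for a block the responder has accessed is treated as a memory contention, and the segmentation test is assumed to compare the block's segment timestamp with the entry held for the requester in the segment timestamp vector.

```python
class Core:
    """Hypothetical software model of one core's recorder."""

    def __init__(self, cid, num_cores):
        self.cid = cid
        self.ic = 0         # instruction counter IC
        self.segment = 0    # segment counter
        self.scv = {c: 0 for c in range(num_cores) if c != cid}
        self.block_sc = {}  # address -> SC field of the cached block
        self.log = []       # entries: (source core, source CIC, this core, this IC)

    def commit(self, addr=None):
        """Step 1: count the committed instruction; stamp the block on a memory op."""
        self.ic += 1
        if addr is not None:
            self.block_sc[addr] = self.segment

    def respond(self, requester_id, addr):
        """Steps 2 to 4, run by the responder when a coherence request arrives."""
        response = {"record": False, "cic": None}
        if addr not in self.block_sc:
            return response  # step 2: block never accessed here, no contention
        if self.block_sc[addr] < self.scv[requester_id]:
            return response  # step 3: assumed to be covered by an earlier record
        response = {"record": True, "cic": self.ic}  # step 4: flag bit plus CIC
        self.segment += 1                            # end old segment, create new one
        self.scv[requester_id] = self.segment        # segment timestamp for requester
        return response

    def access(self, addr, others):
        """Requester side: issue the memory op, then steps 5 and 6 per response."""
        self.ic += 1
        for other in others:
            response = other.respond(self.cid, addr)
            if response["record"]:  # step 5: test the conflict record flag bit
                self.log.append((other.cid, response["cic"], self.cid, self.ic))  # step 6
        self.block_sc[addr] = self.segment


c0, c1 = Core(0, 2), Core(1, 2)
c0.commit("a")          # core 0 writes a (IC=1)
c0.commit("b")          # core 0 writes b (IC=2)
c0.commit()             # core 0 commits a non-memory instruction (IC=3)
c1.access("a", [c0])    # recorded as the indirect dependency 0:3 -> 1:1
c1.access("b", [c0])    # not recorded: b was stamped in the segment already closed
print(c1.log)           # [(0, 3, 1, 1)]
```

In this run the second contention (core 0 at IC=2, core 1 at IC=2) is not logged, because enforcing the recorded dependency 0:3 → 1:1 during replay already orders it.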
The advantage of the invention is that detection and recording of memory contention are built on the existing Cache coherence protocol of the multi-core processor system. Detection of memory contention is performed at the responder, and recording of memory contention is performed at the requester. A conflict record flag bit and the current instruction count value CIC are added to the response message only when a memory contention is detected, which greatly reduces the impact on the original bandwidth. Compared with previous memory contention recording methods, this method has the following notable advantages:
1. Memory contention is recorded in hardware, so the performance impact on the original system is small.
2. Memory contention is represented by indirect dependencies, so more memory contentions can be pruned and the memory contention log shrinks.
3. Memory operation instructions are timestamped with segment timestamps, so hardware resource consumption is low.
4. The original Cache coherence protocol does not need to be modified.
The present invention can be applied to fields such as multi-core program debugging, intrusion detection and fault tolerance.
Description of drawings
Fig. 1 is a schematic structural diagram of the memory contention recording device for deterministic replay of multi-core programs according to the present invention.
Fig. 2 is a schematic structural diagram of a processor core in the memory contention recording device for deterministic replay of multi-core programs according to the present invention.
Fig. 3 is a schematic diagram of representing memory contention with indirect dependencies in Embodiment three.
Fig. 4 is a schematic diagram of representing memory contention with indirect dependencies in Embodiment five.
Fig. 5 is a schematic flow chart of recording the memory contention of Fig. 4.
Embodiment
Embodiment one: This embodiment is described with reference to Fig. 1 and Fig. 2. The memory contention recording device for deterministic replay of multi-core programs according to this embodiment comprises a plurality of processor cores and a shared L2 Data Cache; the processor cores and the shared L2 Data Cache exchange data through an interconnection network,
Each processor core further comprises a memory contention recording module MRR, a private L1 Data Cache, a private L1 Instruction Cache, a Cache coherence protocol controller and an instruction pipeline;
The memory contention recording module MRR is used to detect and record memory contention;
The private L1 Data Cache is used to hold the data most recently accessed by the processor core;
The private L1 Instruction Cache is used to hold the instructions most recently accessed by the processor core;
The Cache coherence protocol controller is used to ensure that the copies of shared data held in the private L1 Data Caches of all processor cores are consistent;
The instruction pipeline is used to control the order in which the registers process data when the registers of the processor core work simultaneously;
The memory contention recording module comprises a 64-bit instruction counter, a 56-bit segment counter, a segment timestamp vector SCV of (number of processor cores - 1) × 56 bits, and a control logic module; the 56-bit segment counter is used to record the segment timestamp;
The 64-bit instruction counter is used to record the number of instructions;
The segment timestamp vector SCV is used to hold the segment timestamps corresponding to the other processor cores; the number of these segment timestamps is (number of processor cores - 1);
Each Cache block in the private L1 Data Cache further comprises one field, the segment timestamp SC, which is used to record the latest memory contention;
The control logic module is used to control the flow by which the memory contention recording module detects and records memory contention.
Cache means cache memory. MRR stands for Memory Race Recorder.
Embodiment two: This embodiment is the control method of the memory contention recording device for deterministic replay of multi-core programs described in Embodiment one. The working process of the control logic module comprises the following steps:
When the committed instruction is a memory operation instruction, updating the value of the 64-bit instruction counter IC and setting the segment timestamp of the memory block accessed by that memory operation;
When a coherence request from a requester is received, detecting through the Cache coherence protocol controller whether a memory contention has occurred;
When the Cache coherence protocol controller detects a memory contention, using the segmentation method to judge whether the memory contention needs to be recorded;
When the memory contention needs to be recorded, adding a conflict record flag bit and the current instruction count value CIC to the coherence response message sent to the requester, ending the old segment, creating a new segment, and updating the segment timestamp corresponding to the requester;
When the requester's Cache coherence protocol controller receives the coherence response message, judging whether the conflict record flag bit is true;
When the conflict record flag bit is true, recording the indirect dependency of the memory contention in the requester's memory contention log.
Embodiment three: This embodiment further limits the control method of the memory contention recording device for deterministic replay of multi-core programs described in Embodiment two. The indirect dependency is:
i:w→j:v;
w and v respectively denote the current instruction count values CIC of thread i and thread j when the memory contention i:x→j:y occurs;
x and y respectively denote the values of the 64-bit instruction counter IC at which thread i and thread j operated on the memory block when the memory contention occurs.
The indirect dependency is written i:w→j:v, where w and v denote the current instruction count values CIC (Current dynamic Instruction Count) of threads i and j when the conflict i:x→j:y occurs. As shown in Fig. 3, two threads i and j both write to z. Thread j writes to z first, and the instruction count value IC (dynamic Instruction Count) of this operation is 1. Thread i then writes to z (IC=3), at which point the memory conflict j:1→i:3 is detected. By this time thread j has finished executing the instruction with IC=2, so its CIC is 2. When the memory contention is recorded, the exact dependency j:1→i:3 is no longer recorded; the indirect dependency j:2→i:3 is recorded instead. Other memory contentions between threads i and j can be represented by indirect dependencies in the same way, as shown by the solid arrowed lines in the figure. The instruction count value IC is stored in the 64-bit instruction counter.
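The Fig. 3 example can be reproduced with a few lines of Python. The function names `to_indirect` and `show` and the tuple layout are hypothetical, and the soundness check inside `to_indirect` (x ≤ w and v ≤ y) is an assumption about why the substitution is safe rather than a condition stated in the text.

```python
def to_indirect(exact, src_cic, dst_cic):
    """Replace the exact counts (x, y) of a contention by the current counts (w, v)."""
    src, x, dst, y = exact
    # Assumed soundness condition: the source thread has already passed
    # instruction x, and the destination thread has not yet passed instruction y.
    if src_cic < x or dst_cic > y:
        raise ValueError("indirect dependency would not cover the exact one")
    return (src, src_cic, dst, dst_cic)


def show(dep):
    """Format a dependency in the i:w→j:v notation used in the text."""
    src, a, dst, b = dep
    return f"{src}:{a}→{dst}:{b}"


exact = ("j", 1, "i", 3)             # thread j wrote z at IC=1, thread i writes z at IC=3
indirect = to_indirect(exact, 2, 3)  # thread j has CIC=2 when the conflict is detected
print(show(exact), "is recorded as", show(indirect))
# j:1→i:3 is recorded as j:2→i:3
```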
Embodiment four: This embodiment further limits the control method of the memory contention recording device for deterministic replay of multi-core programs described in Embodiment two. The step of using the segmentation method to judge whether the memory contention needs to be recorded, when the Cache coherence protocol controller detects a memory contention, is:
When the segment timestamp corresponding to the requester thread involved in the memory contention is not less than the requester's latest segment timestamp, the memory contention is judged to need recording.
When memory contentions are represented by indirect dependencies, some memory contentions can be derived from indirect dependencies that have already been recorded. These memory contentions can be pruned and need not be written to the memory contention log. As shown in Fig. 3, the memory contention i:1→j:4 can be derived from the indirect dependency i:3→j:3, and the memory contention i:3→j:5 can also be derived from the indirect dependency i:3→j:3. By the same reasoning, many memory contentions can be pruned. The present invention uses the segmentation method to prune memory contentions and introduces segment timestamps to timestamp each instruction. The concrete operation is as follows: whether the indirect dependency of a conflict needs to be recorded is decided by checking whether the timestamp of the segment containing the first party of the conflict is not less than the latest segment timestamp. In this way, many memory contentions that do not need to be recorded can be pruned.
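The derivation in the Fig. 3 example can be checked mechanically. The sketch below assumes that a contention i:x→j:y is derivable from a recorded indirect dependency i:w→j:v whenever x ≤ w and v ≤ y (program order on each thread supplies the rest); the function names `implied_by` and `prune` are hypothetical, and this log-level check illustrates the reasoning rather than the hardware segment test itself.

```python
def implied_by(contention, recorded):
    """True if enforcing `recorded` during replay also enforces `contention`."""
    src, x, dst, y = contention
    r_src, w, r_dst, v = recorded
    same_threads = (src, dst) == (r_src, r_dst)
    # x happens no later than w on the source thread, and v no later than y
    # on the destination thread, so w-before-v implies x-before-y.
    return same_threads and x <= w and v <= y


def prune(contentions, log):
    """Keep only the contentions that no entry of the log already implies."""
    return [c for c in contentions if not any(implied_by(c, r) for r in log)]


log = [("i", 3, "j", 3)]  # indirect dependency already recorded (Fig. 3)
contentions = [("i", 1, "j", 4), ("i", 3, "j", 5), ("i", 4, "j", 6)]
print(prune(contentions, log))
# [('i', 4, 'j', 6)]
```

The third contention is a made-up example of one that still has to be recorded, because its source instruction (count 4) comes after the recorded count 3.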
Embodiment five: This embodiment further limits the control method of the memory contention recording device for deterministic replay of multi-core programs described in Embodiment two. In the step of updating the value of the 64-bit instruction counter IC and setting the segment timestamp of the memory block accessed by the memory operation when the committed instruction is a memory operation instruction, the memory operation instruction is a store instruction or a load instruction,
The store instruction is an instruction for writing data to memory;
The load instruction is an instruction for reading data from memory.
Fig. 4 is a schematic diagram of representing memory contention with an indirect dependency, and Fig. 5 is a schematic diagram of the process of recording that memory contention. Two threads P1 and P2 run on processor cores i and j respectively. Thread i first writes to x, and thread j then writes to x. After P1 writes x, the Cache block holding x is in the M state. When P2 writes x, it sends the coherence request GETX to the coherence protocol mechanism. After the directory receives this message, it forwards the request message to P1. After P1 receives the request message, it combines the message type with the state of the variable x that it holds and detects a memory contention (a write after a write). At this point P1 updates the segment timestamp of thread i and the latest segment timestamp of thread j that it records, and sends the conflict record flag and P1's current instruction count value CIC (here CIC=3) to the requester P2 together with the content of x. After P2 receives the response message, it first checks whether the record flag is true; if so, it records the indirect dependency 3→2 of this memory contention in the memory contention log of thread j.
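The message exchange of Fig. 4 and Fig. 5 can be traced with a minimal sketch. Everything beyond the values CIC=3, the requester count 2, the GETX request and the M state is an assumption for illustration: the dictionary layout, the function names, the stored value 10, the invalidation of P1's copy, and the simplification that the segmentation test passes for this first contention.

```python
def directory_forward(directory, request):
    """The directory looks up the owner of the block and forwards the request."""
    return dict(request, owner=directory[request["addr"]])


def owner_respond(owner, request):
    """P1 side: detect the contention and build the coherence response."""
    block = owner["cache"][request["addr"]]
    # A GETX that hits a block held in the M state is a write after a write.
    conflict = request["type"] == "GETX" and block["state"] == "M"
    response = {"data": block["value"], "record": conflict,
                "cic": owner["cic"] if conflict else None}
    if conflict:
        owner["segment"] += 1                                   # new segment for thread i
        owner["scv"][request["requester"]] = owner["segment"]   # latest stamp for thread j
    block["state"] = "I"  # assumed: the owner's copy is invalidated by the GETX
    return response


def requester_receive(requester, response):
    """P2 side: check the record flag and log the indirect dependency."""
    if response["record"]:
        requester["log"].append((response["cic"], requester["ic"]))


p1 = {"name": "P1", "cic": 3, "segment": 0, "scv": {"P2": 0},
      "cache": {"x": {"state": "M", "value": 10, "sc": 0}}}
p2 = {"name": "P2", "ic": 2, "log": []}
directory = {"x": "P1"}

getx = {"type": "GETX", "addr": "x", "requester": "P2"}
forwarded = directory_forward(directory, getx)
requester_receive(p2, owner_respond(p1, forwarded))
print(p2["log"])  # [(3, 2)]
```

The single log entry (3, 2) corresponds to the indirect dependency 3→2 that the text says is written to the memory contention log of thread j.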
Claims (5)
1. A memory contention recording device for deterministic replay of multi-core programs, comprising a plurality of processor cores and a shared L2 Data Cache, the processor cores and the shared L2 Data Cache exchanging data through an interconnection network,
Each processor core further comprises a memory contention recording module MRR, a private L1 Data Cache, a private L1 Instruction Cache, a Cache coherence protocol controller and an instruction pipeline;
The memory contention recording module MRR is used to detect and record memory contention;
The private L1 Data Cache is used to hold the data most recently accessed by the processor core;
The private L1 Instruction Cache is used to hold the instructions most recently accessed by the processor core;
The Cache coherence protocol controller is used to ensure that the copies of shared data held in the Data Caches of all processor cores are consistent;
The instruction pipeline is used to control the order in which the registers process data when the registers of the processor core work simultaneously;
The memory contention recording module comprises a 64-bit instruction counter, a 56-bit segment counter, a segment timestamp vector SCV of (number of processor cores - 1) × 56 bits, and a control logic module; the 56-bit segment counter is used to record the segment timestamp;
The 64-bit instruction counter is used to record the number of instructions;
The segment timestamp vector SCV is used to hold the segment timestamps corresponding to the other processor cores; the number of these segment timestamps is (number of processor cores - 1);
Each Cache block in the private L1 Data Cache further comprises one field, the segment timestamp SC, which is used to record the latest memory contention;
The control logic module is used to control the flow by which the memory contention recording module detects and records memory contention.
2. The control method of the memory contention recording device for deterministic replay of multi-core programs according to claim 1, characterized in that the working process of the control logic module comprises the following steps:
When the committed instruction is a memory operation instruction, updating the value of the 64-bit instruction counter IC and setting the segment timestamp of the memory block accessed by that memory operation;
When a coherence request from a requester is received, detecting through the Cache coherence protocol controller whether a memory contention has occurred;
When the Cache coherence protocol controller detects a memory contention, using the segmentation method to judge whether the memory contention needs to be recorded;
When the memory contention needs to be recorded, adding a conflict record flag bit and the current instruction count value CIC to the coherence response message sent to the requester, ending the old segment, creating a new segment, and updating the segment timestamp corresponding to the requester;
When the requester's Cache coherence protocol controller receives the coherence response message, judging whether the conflict record flag bit is true;
When the conflict record flag bit is true, recording the indirect dependency of the memory contention in the requester's memory contention log.
3. The control method of the memory contention recording device for deterministic replay of multi-core programs according to claim 2, characterized in that the indirect dependency is:
i:w→j:v;
w and v respectively denote the current instruction count values CIC of processor core i and processor core j when the memory contention i:x→j:y occurs;
x and y respectively denote the values of the 64-bit instruction counter IC at which processor core i and processor core j operated on the memory block when the memory contention occurs.
4. The control method of the memory contention recording device for deterministic replay of multi-core programs according to claim 2, characterized in that the step of using the segmentation method to judge whether the memory contention needs to be recorded, when the Cache coherence protocol controller detects a memory contention, is:
When the segment timestamp corresponding to the requester processor core involved in the memory contention is not less than the requester's latest segment timestamp, the memory contention is judged to need recording.
5. The control method of the memory contention recording device for deterministic replay of multi-core programs according to claim 2, characterized in that,
In the step of updating the value of the 64-bit instruction counter IC and setting the segment timestamp of the memory block accessed by the memory operation when the committed instruction is a memory operation instruction, the memory operation instruction is a store instruction or a load instruction,
The store instruction is an instruction for writing data to memory;
The load instruction is an instruction for reading data from memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012105900267A CN103020003A (en) | 2012-12-31 | 2012-12-31 | Multi-core program determinacy replay-facing memory competition recording device and control method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012105900267A CN103020003A (en) | 2012-12-31 | 2012-12-31 | Multi-core program determinacy replay-facing memory competition recording device and control method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103020003A (en) | 2013-04-03 |
Family
ID=47968625
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2012105900267A Pending CN103020003A (en) | 2012-12-31 | 2012-12-31 | Multi-core program determinacy replay-facing memory competition recording device and control method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103020003A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104252425A (en) * | 2013-06-28 | 2014-12-31 | 华为技术有限公司 | Management method for instruction cache and processor |
CN104915180A (en) * | 2014-03-10 | 2015-09-16 | 华为技术有限公司 | Data operation method and device |
CN105095110A (en) * | 2014-02-18 | 2015-11-25 | 新加坡国立大学 | Fusible and reconfigurable cache architecture |
WO2016045059A1 (en) * | 2014-09-25 | 2016-03-31 | Intel Corporation | Multicore memory data recorder for kernel module |
CN106815174A (en) * | 2015-11-30 | 2017-06-09 | 大唐移动通信设备有限公司 | Data access control method and node controller |
CN108021563A (en) * | 2016-10-31 | 2018-05-11 | 华为技术有限公司 | The detection method and device that a kind of inter-instruction data relies on |
CN110209509A (en) * | 2019-05-28 | 2019-09-06 | 北京星网锐捷网络技术有限公司 | Method of data synchronization and device between multi-core processor |
CN112347065A (en) * | 2019-08-07 | 2021-02-09 | 中国船舶工业系统工程研究院 | Record replay method and system for police preplan making process |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101297270A (en) * | 2005-08-23 | 2008-10-29 | 先进微装置公司 | Method for proactive synchronization within a computer system |
Non-Patent Citations (3)
Title |
---|
SUXIA ZHU ET AL.: "CTR:An efficient point-to-point memory race recorder implemented in chunks", 《MICROPROCESSORS AND MICROSYSTEMS》 * |
ZHU SUXIA ET AL.: "An Efficient Point-to-Point Deterministic Record-Replay Enhanced with Signatures", 《2012 13TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED COMPUTING,APPLICATIONS AND TECHNOLOGIES》 * |
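Zhu Suxia (朱素霞) et al.: "Research on a memory race recording mechanism for deterministic replay of multi-core programs" (面向多核程序确定性重演的内存竞争记录机制研究), Acta Electronica Sinica (《电子学报》) * |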
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104252425B (en) * | 2013-06-28 | 2017-07-28 | 华为技术有限公司 | The management method and processor of a kind of instruction buffer |
WO2014206217A1 (en) * | 2013-06-28 | 2014-12-31 | 华为技术有限公司 | Management method for instruction cache, and processor |
CN104252425A (en) * | 2013-06-28 | 2014-12-31 | 华为技术有限公司 | Management method for instruction cache and processor |
CN105095110A (en) * | 2014-02-18 | 2015-11-25 | 新加坡国立大学 | Fusible and reconfigurable cache architecture |
US9977741B2 (en) | 2014-02-18 | 2018-05-22 | Huawei Technologies Co., Ltd. | Fusible and reconfigurable cache architecture |
CN104915180A (en) * | 2014-03-10 | 2015-09-16 | 华为技术有限公司 | Data operation method and device |
CN104915180B (en) * | 2014-03-10 | 2017-12-22 | 华为技术有限公司 | A kind of method and apparatus of data manipulation |
WO2016045059A1 (en) * | 2014-09-25 | 2016-03-31 | Intel Corporation | Multicore memory data recorder for kernel module |
CN106575284A (en) * | 2014-09-25 | 2017-04-19 | 英特尔公司 | Multicore memory data recorder for kernel module |
US10649899B2 (en) | 2014-09-25 | 2020-05-12 | Intel Corporation | Multicore memory data recorder for kernel module |
CN106815174A (en) * | 2015-11-30 | 2017-06-09 | 大唐移动通信设备有限公司 | Data access control method and node controller |
CN106815174B (en) * | 2015-11-30 | 2019-07-30 | 大唐移动通信设备有限公司 | Data access control method and Node Controller |
CN108021563A (en) * | 2016-10-31 | 2018-05-11 | 华为技术有限公司 | The detection method and device that a kind of inter-instruction data relies on |
CN108021563B (en) * | 2016-10-31 | 2021-09-07 | 华为技术有限公司 | Method and device for detecting data dependence between instructions |
CN110209509A (en) * | 2019-05-28 | 2019-09-06 | 北京星网锐捷网络技术有限公司 | Method of data synchronization and device between multi-core processor |
CN112347065A (en) * | 2019-08-07 | 2021-02-09 | 中国船舶工业系统工程研究院 | Record replay method and system for police preplan making process |
CN112347065B (en) * | 2019-08-07 | 2023-08-18 | 中国船舶工业系统工程研究院 | Recording and replay method and system for police plan making process |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103020003A (en) | Multi-core program determinacy replay-facing memory competition recording device and control method thereof | |
US10474471B2 (en) | Methods and systems for performing a replay execution | |
US9479395B2 (en) | Model framework to facilitate robust programming of distributed workflows | |
US20130232495A1 (en) | Scheduling accelerator tasks on accelerators using graphs | |
CN114580344B (en) | Test excitation generation method, verification system and related equipment | |
US20100131720A1 (en) | Management of ownership control and data movement in shared-memory systems | |
CN110457261B (en) | Data access method, device and server | |
CN114925084B (en) | Distributed transaction processing method, system, equipment and readable storage medium | |
CN112860777B (en) | Data processing method, device and equipment | |
US9274875B2 (en) | Detecting memory hazards in parallel computing | |
US20120059997A1 (en) | Apparatus and method for detecting data race | |
CN103729166A (en) | Method, device and system for determining thread relation of program | |
US20120233410A1 (en) | Shared-Variable-Based (SVB) Synchronization Approach for Multi-Core Simulation | |
CN110196680B (en) | Data processing method, device and storage medium | |
US10198784B2 (en) | Capturing commands in a multi-engine graphics processing unit | |
CN117112522A (en) | Concurrent process log management method, device, equipment and storage medium | |
CN116561091A (en) | Log storage method, device, equipment and readable storage medium | |
CN103019829A (en) | Multi-core program memory competition recording and replaying method realized by signature | |
CN116167310A (en) | Method and device for verifying cache consistency of multi-core processor | |
CN114547206A (en) | Data synchronization method and data synchronization system | |
US11341159B2 (en) | In-stream data load in a replication environment | |
US20140257736A1 (en) | Implementing automated memory address recording in constrained random test generation for verification of processor hardware designs | |
CN109344136A (en) | A kind of access method of shared-file system, device and equipment | |
CN115658351B (en) | 2D copying method, device, electronic equipment and computer readable storage medium | |
CN114706715B (en) | Control method, device, equipment and medium for distributed RAID based on BMC |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20130403 |