CN109582474A - A novel cache-optimized deterministic multithreading method - Google Patents

A novel cache-optimized deterministic multithreading method

Info

Publication number
CN109582474A
Authority
CN
China
Prior art keywords
thread
certainty
multithreading
parallel
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811298378.9A
Other languages
Chinese (zh)
Inventor
王开宇
季振洲
吴倩倩
张源悍
王楷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN201811298378.9A priority Critical patent/CN109582474A/en
Publication of CN109582474A publication Critical patent/CN109582474A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention proposes a deterministic multithreading method based on cache optimization, comprising a thread determinism construction module, a multi-thread communication isolation module, an atomic-transaction phase division module, a thread synchronization policy module, and LIRS cache optimization. The invention can be used to support deterministic thread execution in multithreaded systems and to reduce the runtime overhead introduced by determinism methods, preventing the synchronization contention and data races caused by nondeterministic thread operations. Thread execution is divided into phases in units of transactions: in the parallel phase, threads execute concurrently with inter-thread communication isolated, and a fence (barrier) provides global synchronization; in the serial phase, threads acquire a token in a deterministic order and commit their results to memory one by one, yielding a deterministic schedule. Because communication between threads is isolated, the cache becomes the last level of shared storage, so an LIRS cache replacement algorithm better suited to multithreading is used to optimize system performance. In this way deterministic multithreaded execution is guaranteed while runtime overhead is reduced.

Description

A novel cache-optimized deterministic multithreading method
Technical field
The present invention relates to guaranteeing deterministic thread execution in a multithreaded environment.
Background technique
With the development of microelectronics, chip multiprocessors have become the mainstream computing platform and a focus of current research. Compared with earlier single-core processors, multi-core processors have achieved explosive gains in hardware performance, but traditional serial programs cannot exploit them. Parallel programming is the key to fully exploiting multi-core performance and is the only programming model that allows mainstream applications to benefit from multi-core CPU performance.
Although standard libraries provide support, concurrent programs, compared with conventional serial programs, bring challenges to program development and maintenance along with their gains in computational performance. A concurrent program usually completes a task through the cooperation of multiple concurrently executing entities, so relationships of competition and interference exist among them. This leads to the nondeterminism of concurrent programs: the same program, run multiple times on identical input, may produce different results. This nondeterminism poses new challenges to concurrent programs in many respects, and determinism techniques are currently regarded as the key to addressing it. There are two main forms of parallelism: multithreaded parallelism, in which the parallel entities share memory; and multi-process parallelism, in which the entities do not share memory but communicate by other means. The purpose of determinism techniques is to eliminate the nondeterminism introduced by parallelism, reduce the development and maintenance cost of concurrent programs, and improve their reliability.
Summary of the invention
To address the technical problems described in the background, the present invention proposes a cache-optimized deterministic multithreading method.
In the cache-optimized deterministic multithreading method proposed by the present invention, the system comprises a thread determinism construction module, a multi-thread communication isolation module, an atomic-transaction phase division module, a thread synchronization policy module, and LIRS cache replacement algorithm optimization.
Preferably, the thread determinism construction module is used to set thread execution rules that guarantee deterministic thread execution.
Preferably, the multi-thread communication isolation module is used to isolate the communication and interaction of threads during the parallel phase, preventing data races.
Preferably, the atomic-transaction phase division module is used to divide thread execution into phases.
Preferably, the thread synchronization policy module ensures that, when threads switch between execution phases, they acquire the token in a deterministic order, avoiding synchronization contention.
Preferably, the LIRS cache replacement algorithm is used to optimize the performance overhead of the deterministic system.
In the present invention, the thread synchronization policy module connects the serial phase and the parallel phase of thread execution. The invention organizes thread execution around the concept of a transaction: within one round of a transaction, thread execution is divided into a serial part and a parallel part. After the parallel phase ends and all threads reach the synchronization point, they enter the serial phase in the order in which they acquire the token. After a thread finishes its serial phase it blocks at the synchronization point, and once all threads have finished the serial phase a new round of transactions begins. By separating a thread's execution from the commit of its data, and by guaranteeing that threads pass the synchronization point in a deterministic order, deterministic multithreaded execution is ensured. System performance is further improved by using a cache replacement algorithm better suited to multithreaded environments.
Description of the drawings
Fig. 1 is a schematic diagram of the thread execution phases of the present invention.
Fig. 2 is a schematic diagram of deterministic in-order commit in the present invention.
Fig. 3 is a schematic diagram of the overall execution flow of the present invention.
Specific embodiments
The present invention is further explained below with reference to specific embodiments.
Embodiment
Referring to Fig. 1, a fence (barrier) is set for thread execution in the parallel phase. In each parallel phase a thread is allowed to execute only a bounded number of instructions; once it has executed them it is blocked by the fence and waits for the other threads to enter the synchronization operation.
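The parallel-phase structure can be illustrated with a minimal POSIX-threads sketch. This is a sketch only, not the patented implementation; names such as run_quantum and phase_fence are illustrative, and the fixed loop count stands in for the bounded instruction quantum.

```c
#include <pthread.h>

#define NTHREADS 4

/* The "fence" that ends every parallel phase. */
pthread_barrier_t phase_fence;

/* Bounded thread-local work standing in for one instruction quantum; no
 * shared memory is touched, modeling the isolation of the parallel phase. */
void run_quantum(int tid)
{
    volatile long local = 0;
    for (long i = 0; i < 100000; i++)
        local += i ^ tid;
}

static void *worker(void *arg)
{
    int tid = (int)(long)arg;

    run_quantum(tid);                    /* parallel phase: private work only */
    pthread_barrier_wait(&phase_fence);  /* blocked by the fence until all
                                            threads reach the sync point      */
    /* ... the serial phase (token-ordered commit) would follow here ... */
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];

    pthread_barrier_init(&phase_fence, NULL, NTHREADS);
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    pthread_barrier_destroy(&phase_fence);
    return 0;
}
```

Compile with -pthread; the barrier count equals the number of worker threads, so no thread can run ahead into the next phase.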
In the serial phase, threads acquire the token at the synchronization point according to the token-passing topology, request the lock on memory, and then commit their execution results. Acquiring the token and requesting the lock are both mutually exclusive operations, and each thread may perform them only once per round. After a thread has performed its commit operation it is blocked by the fence and waits for the remaining threads to enter the serial phase. Once all threads have finished the serial phase, each thread's private page is committed to the shared page and compared against it, producing the newest shared data after this round of execution and preparing for the next parallel phase. Token passing uses a circular queue: all threads enter the queue in thread-ID order and acquire the token in turn, obtaining the lock that grants permission to read and write the shared data, and then begin their read/write operations on memory.
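The serial phase can be sketched in the same spirit: a token that travels through the threads in thread-ID order. The circular queue is approximated here by a turn counter protected by a mutex and condition variable; serial_commit and shared_memory are illustrative names, and this fragment would stand in for the serial-phase placeholder in the previous sketch.

```c
#include <pthread.h>

#define NTHREADS 4

static pthread_mutex_t token_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  token_cv   = PTHREAD_COND_INITIALIZER;
static int token_owner;                  /* circular queue in thread-ID order */

static long shared_memory;               /* stands in for the shared page     */

/* Wait until the token reaches this thread (deterministic, ID-ordered). */
void token_acquire(int tid)
{
    pthread_mutex_lock(&token_lock);
    while (token_owner != tid)
        pthread_cond_wait(&token_cv, &token_lock);
}

/* Pass the token to the next thread ID and wake the waiting threads. */
void token_release(int tid)
{
    token_owner = (tid + 1) % NTHREADS;
    pthread_cond_broadcast(&token_cv);
    pthread_mutex_unlock(&token_lock);
}

/* Serial phase: each thread commits exactly once per round, in token order. */
void serial_commit(int tid, long private_result)
{
    token_acquire(tid);
    shared_memory += private_result;     /* ordered, deterministic commit */
    token_release(tid);
}
```

Because the token advances strictly by thread ID, the commit order is the same in every execution regardless of how the threads are scheduled.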
Referring to Fig. 2, because exchange between threads is blocked, shared copies are used to guarantee that every thread holds identical data in its private page at the start of each transaction phase, ensuring data consistency. In the parallel phase, each thread executes against the shared data copied in the previous round; the threads run in parallel with no data exchange between them. After all threads have been blocked at the synchronization point by the fence, they commit to the shared page one by one in the order in which they acquire the token. Since weak determinism guarantees the thread synchronization order, the contents of the shared page are deterministic. When the serial phase ends and all threads have performed their commit operations, each thread's private page is compared byte by byte against the shared page to obtain the modification information. The private page of each thread is in effect a local replica of the shared page: a thread's writes go first to the local replica and are committed to the shared page in the deterministic synchronization order when the synchronization point is reached, so the data in the shared page is deterministic.
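The private-page/shared-page commit can be sketched as follows. The description says the private page is compared byte by byte to obtain the modifications; one common way to realize this (an assumption, not stated explicitly in the patent) is to keep an untouched twin of the page from the start of the round and copy only the bytes the thread itself changed into the shared page while holding the token. Function names are illustrative.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/* At the start of a round every thread copies the shared page into its
 * private page and keeps an untouched twin for later comparison. */
void begin_round(unsigned char *priv, unsigned char *twin,
                 const unsigned char *shared)
{
    memcpy(priv, shared, PAGE_SIZE);
    memcpy(twin, shared, PAGE_SIZE);
}

/* Serial-phase commit: compare the private page byte by byte against the
 * round-start twin and write only the bytes this thread modified into the
 * shared page.  Called while holding the token, so commits are ordered. */
void commit_page(const unsigned char *priv, const unsigned char *twin,
                 unsigned char *shared)
{
    for (size_t i = 0; i < PAGE_SIZE; i++)
        if (priv[i] != twin[i])
            shared[i] = priv[i];
}
```

Comparing against the round-start twin rather than the live shared page ensures that a later-committing thread does not overwrite bytes that an earlier thread legitimately modified in the same round.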
Referring to Fig. 3, in each round of transactions the multithreaded program first passes through the parallel phase and then the serial phase, which guarantees deterministic program execution. When read/write requests are issued to memory in the serial phase, LIRS cache optimization is used to eliminate the high cache-miss rate caused by the repeated reads and writes of multithreaded programs and to improve system performance, so the method incurs a smaller runtime overhead than general determinism methods.
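Tying the pieces together, the per-round flow of Fig. 3 could look like the worker loop below, assuming it is compiled together with the earlier fragments (worker_rounds and NROUNDS are illustrative names; the LIRS replacement policy operates in the cache layer and is not modeled here).

```c
#include <pthread.h>

/* Pieces defined in the earlier sketches (assumed compiled together). */
extern pthread_barrier_t phase_fence;
void run_quantum(int tid);
void serial_commit(int tid, long private_result);

#define NROUNDS 10

/* Per-round flow of Fig. 3: a bounded parallel quantum, a fence, a
 * token-ordered serial commit, and a second fence that ends the round. */
void *worker_rounds(void *arg)
{
    int tid = (int)(long)arg;

    for (int round = 0; round < NROUNDS; round++) {
        run_quantum(tid);                    /* parallel phase, isolated work    */
        pthread_barrier_wait(&phase_fence);  /* all threads reach the sync point */

        serial_commit(tid, tid + round);     /* serial phase, deterministic order */
        pthread_barrier_wait(&phase_fence);  /* round ends; next round begins     */
    }
    return NULL;
}
```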
The foregoing is merely a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any equivalent substitution or change made by a person skilled in the art, within the technical scope disclosed by the present invention and based on its technical solution and inventive concept, shall fall within the protection scope of the present invention.

Claims (6)

1. A cache-optimized deterministic multithreading system, characterized in that: it guarantees that, in a multithreaded environment, threads execute in a deterministic order, avoiding the data races caused by conflicting memory accesses and the synchronization contention caused by competing for the synchronization-point order, while using an LIRS cache replacement algorithm better suited to multithreaded environments to optimize system performance and reduce the overhead introduced by determinism; the system comprises a thread determinism construction module, a multi-thread communication isolation module, an atomic-transaction phase division module, a thread synchronization policy module, and LIRS cache optimization;
the thread determinism construction module is used to set thread execution rules for deterministic multithreading, placing a fence between the serial phase and the parallel phase to force the threads to synchronize;
the multi-thread communication isolation module is used to prevent, during the parallel phase, data communication between threads and communication between threads and memory, preventing threads from accessing memory nondeterministically in the parallel phase and thereby producing data races;
the atomic-transaction phase division module is used to divide thread execution into phases, splitting it into a parallel phase and a serial phase to increase the parallelism of the system; in the parallel phase threads execute in parallel and block at the fence to wait for synchronization, and then acquire the token in a deterministic order and enter the serial phase one by one to perform the interaction with memory;
the thread synchronization policy module is used to guarantee that, at the synchronization point, threads acquire the token one by one according to the token-passing order of the synchronization policy and then begin serial-phase execution, avoiding the synchronization contention that arises when threads compete to pass the synchronization point;
the LIRS cache replacement algorithm optimization is used to optimize system performance, reducing the cache misses caused by frequent multithreaded reads and writes and improving the runtime overhead the system incurs to guarantee determinism.
2. The cache-optimized deterministic multithreading system according to claim 1, characterized in that the thread determinism construction module operates as follows:
since, in a POSIX-threads environment, the thread-related structures in the system can be divided into thread control, lock structures, and the threads within the program, and there is no structure that controls the thread execution state, a fence must be placed at the transitions between thread execution phases to force the threads to synchronize, and thread creation and termination must both be completed only after the token has been obtained; the method comprises the following steps (illustrated in the sketch following this claim):
Step 1: set a fence at the transition points between thread execution phases;
Step 2: guarantee the consistency with which threads acquire the token and obtain the lock resource;
Step 3: a thread creates child threads and performs termination operations only after obtaining the token;
Step 4: after the parallel phase, threads enter the serial phase in the order in which they acquire the token;
Step 5: after the serial phase, all threads reach the synchronization point and the next round of transactions begins to execute.
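A minimal sketch of Step 3, reusing the illustrative token helpers from the earlier serial-phase fragment (det_spawn is likewise an illustrative name, not taken from the patent; termination would be guarded in the same way):

```c
#include <pthread.h>

/* Illustrative helpers defined in the earlier token-passing sketch. */
void token_acquire(int tid);
void token_release(int tid);

/* Deterministic thread creation: a thread may create (or terminate) other
 * threads only while it holds the token, so the set of live threads changes
 * in the same order in every execution. */
int det_spawn(int tid, pthread_t *child, void *(*fn)(void *), void *arg)
{
    int rc;

    token_acquire(tid);                 /* mutual exclusion, once per round */
    rc = pthread_create(child, NULL, fn, arg);
    token_release(tid);
    return rc;
}
```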
3. The cache-optimized deterministic multithreading system according to claim 1, characterized in that inter-thread communication is isolated during the parallel phase:
the main reason multithreading produces data races is the order in which threads compete to access the same memory address, so data races can be avoided if the order in which threads access memory is controlled; the method isolates the communication between threads in the parallel phase, avoiding data exchange both among threads and between threads and memory; the data a thread uses in this phase comes from its own memory copy taken after the serial phase of the previous round of transactions, and the data exchange between threads and memory is postponed to the serial phase, in which threads access memory in the deterministic order given by the token, avoiding data races and guaranteeing deterministic thread execution.
4. The cache-optimized deterministic multithreading system according to claim 1, characterized in that thread execution is divided into phases: thread execution is split into a serial part and a parallel part, and a fence is set for thread execution in the parallel phase; in each parallel phase a thread may execute only a bounded number of instructions, after which it is blocked by the fence and waits for the other threads to enter the synchronization operation;
in the serial phase, threads acquire the token at the synchronization point according to the token-passing topology, request the lock on memory, and then commit their execution results; acquiring the token and requesting the lock are both mutually exclusive operations, and each thread may perform them only once per round; after performing its commit operation a thread is blocked by the fence and waits for the remaining threads to enter the serial phase; once all threads have finished the serial phase, each thread's private page is committed to the shared page and compared against it, producing the newest shared data after this round of execution and preparing for the next parallel phase.
5. The cache-optimized deterministic multithreading system according to claim 1, characterized in that a token-passing mechanism guarantees that threads enter the serial phase in a deterministic order:
to prevent data races, threads are not allowed to read or write memory during the parallel phase, so the order in which threads enter the serial phase determines the order in which they read and write memory, and the determinism of this order ensures that no data races occur among the threads; threads acquire the token at the synchronization point in thread-ID order, which ensures that the order in which threads pass the fence is not decided by racing at the synchronization point, avoiding synchronization contention and deadlock and guaranteeing deterministic multithreaded execution.
6. The cache-optimized deterministic multithreading system according to claim 1, characterized in that an LIRS cache replacement algorithm is used in place of the original LRU algorithm: because of structural shortcomings, the LRU algorithm cannot adapt well to workloads with repeated reads and writes and therefore performs poorly in multithreaded environments; the method replaces LRU with the LIRS algorithm, which is better suited to multithreaded environments, alleviating the low cache hit rate caused by the repeated reads and writes of multithreaded programs and improving the runtime overhead incurred by the determinism rules, so that the system achieves better performance and a wider range of application environments.
CN201811298378.9A 2018-11-02 2018-11-02 A novel cache-optimized deterministic multithreading method Pending CN109582474A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811298378.9A CN109582474A (en) 2018-11-02 2018-11-02 A kind of novel cache optimization multithreading Deterministic Methods

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811298378.9A CN109582474A (en) 2018-11-02 2018-11-02 A kind of novel cache optimization multithreading Deterministic Methods

Publications (1)

Publication Number Publication Date
CN109582474A true CN109582474A (en) 2019-04-05

Family

ID=65921167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811298378.9A Pending CN109582474A (en) 2018-11-02 2018-11-02 A kind of novel cache optimization multithreading Deterministic Methods

Country Status (1)

Country Link
CN (1) CN109582474A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6854108B1 (en) * 2000-05-11 2005-02-08 International Business Machines Corporation Method and apparatus for deterministic replay of java multithreaded programs on multiprocessors
US20090235262A1 (en) * 2008-03-11 2009-09-17 University Of Washington Efficient deterministic multiprocessing
US20100023674A1 (en) * 2008-07-28 2010-01-28 Aviles Joaquin J Flash DIMM in a Standalone Cache Appliance System and Methodology
CN106104481A (en) * 2014-02-06 2016-11-09 优创半导体科技有限公司 Certainty and opportunistic multithreading
CN107704324A (en) * 2017-07-20 2018-02-16 哈尔滨工业大学(威海) It is a kind of towards the deterministic hardware based internal memory partition method of multinuclear
CN109471734A (en) * 2018-10-27 2019-03-15 哈尔滨工业大学(威海) A kind of novel cache optimization multithreading Deterministic Methods

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083445A (en) * 2019-04-21 2019-08-02 哈尔滨工业大学 A kind of multithreading certainty execution method based on weak memory consistency
CN110083445B (en) * 2019-04-21 2023-04-25 哈尔滨工业大学 Multithreading deterministic execution method based on weak memory consistency
CN110096378A (en) * 2019-04-29 2019-08-06 杭州涂鸦信息技术有限公司 A kind of inter-thread communication method and relevant apparatus
CN112528583A (en) * 2020-12-18 2021-03-19 广东高云半导体科技股份有限公司 Multithreading comprehensive method and comprehensive system for FPGA development

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20190405)