CN103345451A - Data buffering method in multi-core processor - Google Patents

Data buffering method in multi-core processor

Info

Publication number
CN103345451A
Authority
CN
China
Prior art keywords
buffer
memory
cached data
buffer memory
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013103010373A
Other languages
Chinese (zh)
Other versions
CN103345451B (en)
Inventor
毛力
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan qianhang Technology Co., Ltd
Original Assignee
SICHUAN JIUCHENG INFORMATION TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SICHUAN JIUCHENG INFORMATION TECHNOLOGY Co Ltd filed Critical SICHUAN JIUCHENG INFORMATION TECHNOLOGY Co Ltd
Priority to CN201310301037.3A priority Critical patent/CN103345451B/en
Publication of CN103345451A publication Critical patent/CN103345451A/en
Application granted granted Critical
Publication of CN103345451B publication Critical patent/CN103345451B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a data buffering method in a multi-core processor, comprising the steps of: receiving an instruction to concurrently execute multiple threads; independently assigning each of the multiple threads to one of the processor's multiple cores, wherein each core is assigned at most one thread; for each core with an assigned thread, responding to caching requests during thread execution by storing the cached data in a dedicated buffer memory coupled to that core; and, when a number of dedicated buffer memories greater than or equal to a threshold t store the same cached data, storing that same cached data in a general buffer memory. Through this method, cache access and replacement speed are improved and the false sharing problem is overcome.

Description

A method for buffering data in a multi-core processor
Technical field
The present invention relates to the field of data storage, and in particular to a method for buffering data in a multi-core processor, and further to a method for multi-level buffering of data in a multi-core processor.
Background technology
The growing speed gap between processor and main memory is an acute problem for multi-core processors, so multi-level caches must be used to alleviate it. Current designs include multi-core processors with a shared level-1 cache, with a shared level-2 cache, and with shared main memory. Typically, a multi-core processor adopts the shared level-2 cache structure: each processor core has a private level-1 cache, and all cores share the level-2 cache. The architectural design of the cache itself directly affects overall system performance. In a multi-core structure, whether shared or exclusive caches are preferable, how many cache levels to build on chip, and how many caches to provide all have a large influence on the size, power consumption, layout, performance and operating efficiency of the chip, and thus all require careful study. On the other hand, multi-level caches introduce consistency problems, and whichever cache consistency model and mechanism is adopted has a material impact on overall multi-core performance. Consistency models widely adopted in traditional multiprocessor architectures include the sequential consistency model, the weak consistency model and the release consistency model; the associated consistency mechanisms are mainly bus snooping protocols and directory-based protocols. Most current multi-core processor systems adopt bus snooping protocols.
The programs executed on the cores of a multi-core processor sometimes need to share data and synchronize, so the hardware must support inter-core communication. An efficient communication mechanism is an important guarantee of high multi-core performance. Two on-chip communication mechanisms are currently mainstream: one based on a bus-shared cache structure, the other based on an on-chip interconnect. In the bus-shared cache structure, the processor cores share a level-2 or level-3 cache that holds commonly used data, and communicate over a bus connecting the cores; its advantages are structural simplicity and high communication speed, while its drawback is the poor scalability of bus-based structures. In the on-chip interconnect structure, each processor core has an independent processing unit and cache, the cores are linked by crossbar switches, a network-on-chip or similar means, and the cores communicate by messages; its advantages are good scalability and guaranteed data bandwidth, while its drawbacks are complex hardware and larger software changes. The outcome of the competition between the two may be not mutual replacement but cooperation, for example adopting a network-on-chip at global scope and buses locally, reaching a balance of performance and complexity.
In a conventional microprocessor, cache misses and memory-access events both have a negative impact on the execution efficiency of the processor, and the working efficiency of the bus interface unit (BIU) determines the extent of this effect. When several processor cores request access to memory simultaneously, or cache misses occur simultaneously in the private caches of several cores, the efficiency of the BIU's arbitration among these access requests and of its external memory-access mechanism determines the overall performance of the multi-core system. Finding an efficient multi-port BIU structure that converts the cores' individual main-memory accesses into more efficient burst accesses, a quantitative model of the burst access word count that optimizes the overall efficiency of the multi-core processor, and an efficient arbitration mechanism for multi-port BIU access will therefore be important topics of multi-core processor research.
In current multi-core processor systems, whether the cache is a level-2 or level-3 cache, and whether it is shared or private, its read and replacement algorithms suffer from technical problems such as high algorithmic complexity and long hit latency.
In addition, in existing level-2 cache schemes, the shared data in the level-2 cache usually also has a backup in the private level-1 caches. As a result, after the cached data in a level-1 cache is changed by a different processor core, the false sharing problem of the cache appears: frequent reloading is required, access latency increases, and system performance degrades.
Summary of the invention
To solve the technical problems of high algorithmic complexity, long hit latency and false sharing in the cache read and replacement algorithms of existing multi-core processor systems, the invention provides a method for buffering data in a multi-core processor, wherein the multi-core processor comprises a plurality of processor cores, a plurality of dedicated buffer memories each coupled one-to-one with one of the processor cores, and one general buffer memory coupled to all of the processor cores, the method comprising:
receiving an instruction to concurrently execute a plurality of threads;
independently assigning each of the plurality of threads to one of the plurality of processor cores, wherein each processor core is assigned at most one thread;
for each processor core with an assigned thread, in response to caching requests during thread execution, storing the data to be cached in the dedicated buffer memory coupled to that core;
when dedicated buffer memories whose number is not less than a threshold t all store the same cached data, storing that same cached data in the general buffer memory.
Preferably, after the same cached data is stored in the general buffer memory, the copies of it stored in the dedicated buffer memories are cleared, and the storage space those copies occupied in the dedicated buffer memories is released.
Preferably, when any of the plurality of processor cores needs to read cached data, it reads the cached data from the general buffer memory or a dedicated buffer memory by querying the cache mapping table.
Preferably, t = s, or t = ⌈s/2⌉, or t = 2, where s is the total number of processor cores in the active state and ⌈·⌉ denotes rounding up.
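A minimal sketch of the claimed method in Python may help make the flow concrete (all class and method names are illustrative; the patent does not specify an implementation): each core writes to its own dedicated buffer, and data held by at least t dedicated buffers is promoted to the one general buffer.

```python
class MultiCoreBuffer:
    """Sketch of the first method: each core has a dedicated buffer;
    data cached by at least t cores is promoted to one shared general
    buffer (names here are illustrative, not from the patent)."""

    def __init__(self, num_cores, threshold=2):
        self.dedicated = [dict() for _ in range(num_cores)]  # per-core buffers
        self.general = {}                                    # shared buffer
        self.t = threshold  # patent suggests t = s, ceil(s/2), or 2

    def cache(self, core, key, value):
        """Store on behalf of `core`; promote once >= t dedicated copies exist."""
        if key in self.general:            # already shared: nothing to store
            return "general"
        self.dedicated[core][key] = value
        holders = [d for d in self.dedicated if key in d]
        if len(holders) >= self.t:         # promote to the general buffer
            self.general[key] = value
            for d in holders:              # clear dedicated copies, free space
                del d[key]
            return "promoted"
        return "dedicated"

    def read(self, core, key):
        """Read from the general buffer first, then the core's dedicated one."""
        if key in self.general:
            return self.general[key]
        return self.dedicated[core].get(key)
```

With t = 2, the second core to cache the same datum triggers promotion, after which neither dedicated buffer retains a copy.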
The invention also discloses a method for multi-level buffering of data in a multi-core processor, the multi-core processor comprising 2^n processor cores and n+1 levels of cache, where level m comprises 2^(n+1-m) buffer memories; buffer memory i of level 1 is used only to store the cached data required by the thread executed by processor core i; buffer memory j of level s is used only to store the cached data held in common by buffer memories 2j-1 and 2j of level s-1, where n is an integer greater than 1, 1<=m<=n+1, 2<=s<=n+1, 1<=i<=2^n and 1<=j<=2^(n+1-s). The method comprises:
in response to an instruction to concurrently execute a plurality of threads of the same process, assigning each thread to a different idle processor core and simultaneously activating the idle processor cores that have been assigned threads, so that they change from the idle state to the busy state;
after processor core i is activated, in response to a cache instruction, first checking whether the data to be cached is already stored in buffer memory ⌈i/k⌉ of level p, where k = 2^(p-1), 1<=i<=2^n, 2<=p<=n+1 and ⌈·⌉ denotes rounding up; if so, sending a confirmation message indicating a successful cache;
if not, storing the data to be cached in buffer memory i of level 1, and then judging level by level from level 1 to level n: if buffer memories 2r-1 and 2r of level t, where r = ⌈i/2^t⌉ and 1<=t<=n, both store the data to be cached, dumping the data into buffer memory r of level t+1, clearing the data to be cached stored in buffer memories 2r-1 and 2r of level t, and simultaneously releasing the storage space it occupied in those buffer memories, where ⌈·⌉ denotes rounding up;
sending a confirmation message indicating a successful cache.
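The index arithmetic of this hierarchy can be checked with a short sketch (function names are illustrative): level m holds 2^(n+1-m) buffer memories, core i's data can appear at level p only in buffer ⌈i/2^(p-1)⌉, and sibling buffers 2r-1 and 2r of one level feed buffer r of the next level.

```python
import math

def buffers_at_level(n, m):
    """Number of buffer memories at level m for 2**n cores, n+1 levels."""
    return 2 ** (n + 1 - m)

def buffer_for_core(i, p):
    """Level-p buffer that can hold core i's data: ceil(i / 2**(p-1))."""
    return math.ceil(i / 2 ** (p - 1))

def parent_buffer(c):
    """Next-level buffer fed by the sibling pair (2r-1, 2r) containing buffer c."""
    return math.ceil(c / 2)
```

For n = 2 this reproduces the 4/2/1 layout of the worked example: core 3's data sits in buffer 3 of level 1, buffer 2 of level 2, and buffer 1 of level 3.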
The present invention uses a cache mapping table to read and replace the data to be cached in real time based on the executing threads, in particular replacing cached data between the dedicated buffer memories and the general buffer memory in real time, and uses multi-level caches to read and replace cached data in real time. This achieves lower algorithmic complexity, reduces hit latency, and improves the overall efficiency of a computer system with a multi-core processor.
Description of drawings
The accompanying drawings are included to provide a further understanding of the invention; they form part of the specification and, together with the description, explain the principles of the invention. In the drawings:
Fig. 1 is a structural block diagram of the multi-core processor of the first embodiment of the invention;
Fig. 2 is a flow chart of the method for buffering data in a multi-core processor of the first embodiment of the invention;
Fig. 3 is a structural block diagram of the multi-core processor of the second embodiment of the invention;
Fig. 4 is a flow chart of the method for buffering data in a multi-core processor of the second embodiment of the invention.
Detailed description of the embodiments
Fig. 1 shows a structural block diagram of the multi-core processor involved in the present invention. As shown in Fig. 1, the multi-core processor comprises a plurality of processor cores, a plurality of dedicated buffer memories each coupled one-to-one with one of the processor cores, and one general buffer memory coupled to all of the processor cores. Each dedicated buffer memory is used only to store cached data related to the thread executed by the processor core to which it is coupled, while the general buffer memory stores cached data related to the threads executed by several processor cores. The multi-core processor further comprises a mapping buffer for storing the cache mapping table, which records at least the storage relations between cached data and each buffer memory (both the dedicated buffer memories and the general buffer memory); a storage relation records which buffer memory a piece of cached data is stored in and which thread on which processor core it is associated with. The multi-core processor further comprises a cache controller for controlling the dedicated buffer memories, the general buffer memory and the mapping buffer, implementing operations on them such as writing, reading, replacement and query.
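The cache mapping table just described can be sketched as a small record type (the field names are an assumption for illustration, not from the patent): each entry records where a piece of cached data is stored and which thread on which core it is associated with.

```python
from dataclasses import dataclass, field

@dataclass
class MappingEntry:
    """One row of the cache mapping table: storage locations plus
    (core, thread) associations for a piece of cached data.
    Field names are illustrative, not taken from the patent."""
    stores: set = field(default_factory=set)   # "general" or ("dedicated", core)
    threads: set = field(default_factory=set)  # (core_index, thread_id) pairs

# The table itself maps a cache key to its entry.
mapping_table: dict = {}
```

The cache controller would update such entries on every store, promotion and clear operation so the table always reflects the latest storage relations.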
In the present invention, the total number of threads of a given process is not greater than the total number of processor cores, and each thread of the process is assigned to a different processor core, so as to guarantee the concurrent execution of the threads of that process.
The method for buffering data in a multi-core processor proposed by the present invention is next described in detail with reference to Fig. 2. The method comprises: the multi-core processor receives an instruction to concurrently execute a plurality of threads of the same process. Subsequent steps are carried out only if the following conditions are met: the number of idle processor cores is not less than the number of threads to be executed concurrently, and there exists an allocation that assigns each thread to a different idle processor core such that the total resources of the idle cores assigned threads are not less than the total processor resources those threads require (that is, the processing capacity of every idle core assigned a thread can satisfy the needs of the thread assigned to it); otherwise, a message indicating insufficient system resources is returned. In response to the instruction, each of the threads is assigned to a different idle processor core whose processing capacity fully satisfies the needs of the thread assigned to it, and the idle cores assigned threads are activated simultaneously, changing them from the idle state to the busy state.
After an idle processor core in the multi-core processor is activated, it enters the busy state of executing its thread and reads and writes data as the thread requires, so data caching is needed; a core in the busy state therefore frequently sends cache instructions to the cache controller. In response to a cache instruction, the cache controller first queries the cache mapping table to check whether the data to be cached is already stored in the general buffer memory. If so, it updates the cache mapping table and then sends a confirmation message indicating a successful cache to the processor core that sent the instruction, where updating the table includes adding the correspondence between the data to be cached, the activated processor core and the thread it executes. Otherwise, the controller stores the data to be cached in the dedicated buffer memory coupled to the processor core that sent the cache instruction, updates the cache mapping table with the storage relation between the data and that dedicated buffer memory, and then queries the table with the data to be cached as the query condition. If the query result shows that dedicated buffer memories whose number is not less than a threshold t all store the data to be cached, where t = s, or t = ⌈s/2⌉, or t = 2, s being the total number of processor cores currently executing threads and ⌈·⌉ denoting rounding up, then the cache controller stores the data in the general buffer memory, clears the copies stored in the t dedicated buffer memories, simultaneously releases the storage space the data occupied in those t dedicated buffer memories, and finally updates the cache mapping table, adding the storage relation between the data and the general buffer memory and deleting the storage relations between the data and the t dedicated buffer memories. When the above steps finish, the cache controller sends a confirmation message indicating a successful cache to the processor core that sent the cache instruction. The cache mapping table is stored in the mapping buffer.
In the present invention, after a processor core finishes executing a thread, it first sends a clear instruction to the cache controller. In response, the cache controller first queries the cache mapping table to check whether the general buffer memory holds cached data associated with the finished thread. If so, for each such piece of cached data it continues querying the table to check whether more than a threshold t of threads executed by processor cores are associated with that data. If not more than t are associated, and if all the dedicated buffer memories coupled to the processor cores associated with the data have the capacity to store it, the data is dumped, according to the records of the mapping table, into all the dedicated buffer memories coupled to the associated processor cores; the data in the general buffer memory is then deleted, the storage space it occupied there is released, and the cache mapping table is updated to reflect the latest storage relations. Those skilled in the art will appreciate that after a thread finishes, the related data in main memory also needs to be cleared. After the cache controller completes the above steps, it sends a confirmation of successful clearing to the processor core that sent the clear instruction; in response to this confirmation, that core changes from the busy state to the idle state.
In the present invention, the general buffer memory may reach a full state because cached data is continually written into it, in which case the caching policy needs to be adjusted. To effectively simplify the highly complex algorithms of the prior art, the present invention proposes the following method: when cached data needs to be written into the general buffer memory, first judge whether the total free storage space of the general buffer memory is not less than the amount of data to be written. If it is, write the cached data directly into the general buffer memory; if the free space is less than the amount to be written, first dump the cached data in the general buffer memory to main memory, then release the storage space of the general buffer memory, and finally write the new cached data into the general buffer memory.
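The capacity rule above can be sketched as follows (a simplification under stated assumptions: capacity is counted in entries rather than bytes, and the whole general buffer is dumped to main memory on overflow, as the paragraph describes; the function name is illustrative):

```python
def write_to_general(general, main_memory, capacity, key, value):
    """Write `key`/`value` into the general buffer `general`; if it lacks
    free space, first dump its contents to `main_memory` and release it.
    Capacity is measured in entries here, a simplification for the sketch."""
    if len(general) >= capacity:      # not enough free space for a new entry
        main_memory.update(general)   # dump general buffer to main memory
        general.clear()               # release the general buffer's space
    general[key] = value              # finally write the new cached data
```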
Also in response to a clear instruction, the cache controller queries the cache mapping table to check whether the dedicated buffer memory coupled to the processor core holds cached data associated with the finished thread; if so, it clears the cached data found, releases the storage space that data occupied, and updates the cache mapping table so that it reflects the latest storage relations.
When cached data needs to be read, the cache mapping table is queried to determine whether to read the data from the general buffer memory or from a dedicated buffer memory. For cached data that has been dumped to main memory, if the query in the cache mapping table misses, the data can further be read from main memory. The specific read method is not elaborated here.
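A sketch of this read path (the mapping-table format below is an assumption for illustration only): the table decides which store to read from, and a miss in the table falls back to main memory.

```python
def read_cached(key, mapping_table, general, dedicated, main_memory):
    """Read `key` via the cache mapping table. Illustrative table format:
    key -> ("general",) or ("dedicated", core_index); a table miss falls
    back to main memory, matching the read path described above."""
    loc = mapping_table.get(key)
    if loc is None:                   # miss in the mapping table
        return main_memory.get(key)
    if loc[0] == "general":
        return general[key]
    return dedicated[loc[1]][key]     # loc[1] is the core's buffer index
```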
The method for multi-level buffering of data in a multi-core processor of the second embodiment of the invention is described in detail with reference to Figs. 3-4. The multi-core processor comprises 2^n processor cores (where n = 2, 3, 4, 5, 6, 7, 8 or 9) and n+1 levels of cache, where level m comprises 2^(n+1-m) buffer memories (1<=m<=n+1). Buffer memory i of level 1 (1<=i<=2^n) is used, and used only, to store the cached data required by the thread executed by processor core i. Buffer memory i of level 2 (1<=i<=2^(n-1)) is used, and used only, to store the cached data held in common by buffer memories 2i-1 and 2i of level 1; further, when buffer memories 2i-1 and 2i of level 1 both store the same cached data, the cache controller dumps that data into buffer memory i of level 2, clears the same cached data from buffer memories 2i-1 and 2i of level 1, and releases the storage space it occupied there. In general, buffer memory j of level s (2<=s<=n+1, 1<=j<=2^(n+1-s)) is used, and used only, to store the cached data held in common by buffer memories 2j-1 and 2j of level s-1. For example, when n = 2, the multi-core processor comprises 2^2 = 4 processor cores and 3 levels of cache: level 1 comprises 2^2 = 4 buffer memories, level 2 comprises 2^1 = 2 buffer memories, and level 3 comprises 2^0 = 1 buffer memory. The multi-core processor further comprises a mapping buffer for storing the cache mapping table, which records at least the storage relations between cached data and each buffer memory of each level; a storage relation records which buffer memory a piece of cached data is stored in and which thread on which processor core it is associated with. The multi-core processor further comprises a cache controller for controlling each buffer memory of each level and the mapping buffer, implementing operations on them such as writing, reading, replacement and query. The processor cores, the buffer memories of each level, the mapping buffer and the cache controller are interconnected and communicate via a bus.
After the multi-core processor receives an instruction to concurrently execute a plurality of threads of the same process, subsequent steps are carried out only if the following conditions are met: the number of idle processor cores is not less than the number of threads to be executed concurrently, and there exists an allocation that assigns each thread to a different idle processor core such that the total resources of the idle cores assigned threads are not less than the total processor resources those threads require (that is, the processing capacity of every idle core assigned a thread can satisfy the needs of the thread assigned to it); otherwise, a message indicating insufficient system resources is returned. In response to the instruction, each thread is assigned to a different idle processor core whose processing capacity fully satisfies the needs of the thread assigned to it, and the idle cores assigned threads are activated simultaneously, changing them from the idle state to the busy state.
After processor core i (1<=i<=2^n) of the multi-core processor (which has 2^n processor cores, n being an integer greater than 1, for example n = 2, 3, 4, 5, 6, 7, 8 or 9) is activated, it enters the busy state of executing its thread and reads and writes data as the thread requires, so data caching is needed; a core in the busy state therefore frequently sends cache instructions to the cache controller. In response to a cache instruction, the cache controller first queries the cache mapping table to check whether the data to be cached is already stored in buffer memory ⌈i/k⌉ of level m (2<=m<=n+1), where k = 2^(m-1) and ⌈·⌉ denotes rounding up. If so, it updates the cache mapping table and then sends a confirmation message indicating a successful cache to processor core i, where updating the table includes adding the correspondence between the data to be cached, the activated processor core and the thread it executes. Otherwise, the controller stores the data to be cached in buffer memory i of level 1, updates the cache mapping table with the storage relation between the data and buffer memory i of level 1, and then queries the table with the data to be cached as the query condition. If the query result shows that buffer memories ⌈i/2⌉*2-1 and ⌈i/2⌉*2 of level 1 both store the same data to be cached, the cache controller dumps the data into buffer memory ⌈i/2⌉ of level 2, clears the copies stored in buffer memories ⌈i/2⌉*2-1 and ⌈i/2⌉*2 of level 1, simultaneously releases the storage space the data occupied in those buffer memories, and finally updates the cache mapping table, adding the storage relations between the data and the buffer memories of level 2 and deleting the storage relations between the data and the buffer memories of level 1. The process continues in this way, judging and handling level by level. For example, when n = 2, that is, when the multi-core processor comprises 4 processor cores, in addition to performing the above steps the controller also queries the table with the data to be cached as the query condition; if the query result shows that buffer memories 1 and 2 of level 2 both store the same data, the cache controller dumps the data into buffer memory 1 (the only one) of level 3, clears the copies stored in buffer memories 1 and 2 of level 2, simultaneously releases the storage space the data occupied there, and finally updates the cache mapping table, adding the storage relation between the data and the first buffer memory of level 3 and deleting the storage relations between the data and the buffer memories of level 2. In general, judging level by level from level 1 to level n: if buffer memories 2r-1 and 2r of level t (1<=r<=2^(n-t), 1<=t<=n) both store the data to be cached, then the cache controller dumps the data to be cached into buffer memory r of level t+1.
Figure BDA00003527760600121
In the individual memory buffer, and with of t level buffer memory
Figure BDA00003527760600122
* 2-1 and
Figure BDA00003527760600123
That * stores in 2 memory buffer treats data cached removing, discharges of t level buffer memory simultaneously * 2-1 and
Figure BDA00003527760600125
* described in 2 memory buffer treated data cached shared storage space, and final updating buffer memory mapping table comprises increasing and describedly treats in data cached and the t+1 level buffer memory the
Figure BDA00003527760600126
Storage between individual memory buffer relation and delete described the of data cached and the t level buffer memory for the treatment of
Figure BDA00003527760600127
* 2-1 and * the relation of the storage between 2 memory buffer.When finishing, above-mentioned steps sent the affirmation message of indication buffer memory success to i processor core by buffer control unit.Wherein said buffer memory mapping table is stored in the mapping buffer.All parameters that those skilled in the art use in should clear and definite the present invention all are integers, parameter m for example, n, p, q, r, s, k, t, i, j.
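The level-by-level storage and promotion procedure described above can be sketched in ordinary code. The sketch below is an illustrative model only, not the patented implementation: the names (CacheHierarchy, lookup, write), the use of Python sets as buffer memories, and the simplified mapping table are assumptions introduced for illustration.

```python
# Illustrative model (assumed names, not the patented implementation) of an
# (n+1)-level buffer hierarchy for 2**n cores with level-by-level promotion.
from math import ceil

class CacheHierarchy:
    def __init__(self, n):
        self.n = n
        # Level m (1-indexed) has 2**(n+1-m) buffers; each buffer is a set.
        self.levels = {m: [set() for _ in range(2 ** (n + 1 - m))]
                       for m in range(1, n + 2)}
        # Simplified cache mapping table: data -> (level, buffer) entries.
        # The described table also records core/thread correspondences.
        self.mapping = {}

    def lookup(self, i, data):
        """Check whether `data` already sits in the ancestor buffer of
        core i at some level m >= 2, i.e. buffer ceil(i / 2**(m-1))."""
        for m in range(2, self.n + 2):
            k = 2 ** (m - 1)
            if data in self.levels[m][ceil(i / k) - 1]:
                return True
        return False

    def write(self, i, data):
        """Cache `data` for core i (1-indexed), promoting level by level."""
        if self.lookup(i, data):
            return "hit"                        # already cached upstream
        self.levels[1][i - 1].add(data)         # store in buffer i of level 1
        self.mapping.setdefault(data, []).append((1, i))
        r = i
        for t in range(1, self.n + 1):
            # If sibling buffers ceil(r/2)*2-1 and ceil(r/2)*2 of level t both
            # hold the data, dump it into buffer ceil(r/2) of level t+1.
            pair = (ceil(r / 2) * 2 - 1, ceil(r / 2) * 2)
            if all(data in self.levels[t][b - 1] for b in pair):
                r = ceil(r / 2)
                self.levels[t + 1][r - 1].add(data)
                for b in pair:                  # clear siblings, release space
                    self.levels[t][b - 1].discard(data)
                self.mapping[data] = [(t + 1, r)]
            else:
                break
        return "stored"
```

For n = 2 (four cores), writing the same data from cores 1 and 2 promotes it to the 1st buffer of the 2nd level, and further writes from cores 3 and 4 promote it to the single buffer of the 3rd level, after which any core's lookup hits.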
In the second embodiment, the processing performed after a processor core finishes its thread is similar to the processing method of the first embodiment, and this embodiment is implemented under the condition that every cache level has enough free cache space to store the data to be cached. The method of reading cached data in this embodiment is likewise similar to that of the first embodiment. In addition, although Fig. 4 does not show in detail the loop that judges, from the 1st cache level to the n-th cache level, whether the (⌈r/2⌉·2-1)-th and (⌈r/2⌉·2)-th buffer memories both store the data to be cached, those skilled in the art should understand from the detailed description above that the judgment is performed level by level; the concrete loop processing is not shown in order to highlight the key points of the inventive design.
In the present invention, cached data in the general buffer memory and in the non-first-level buffer memories can only be read by the processor cores, not modified. If a modification is needed, the data must be written to the cache anew: when a processor core needs to modify cached data in the general buffer memory or in a non-first-level buffer memory, it takes the modified data as new data to be cached and carries out the inventive method of buffering data in a multi-core processor for it, i.e. the original cached data and the modified cached data undergo write operations as two different pieces of data to be cached.
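The read-only rule for shared cached data can be illustrated with a minimal sketch. The names here (SharedBuffer, modify_shared) are hypothetical, and the structure is greatly simplified relative to the described hierarchy: the point is only that a modification never mutates the stored copy in place.

```python
# Minimal sketch (hypothetical names): shared buffers are read-only; a
# modification is issued as an ordinary cache write of a new data item.
class SharedBuffer:
    """Toy model of the general buffer: reads allowed, no in-place writes."""
    def __init__(self):
        self._items = set()

    def read(self, data):
        return data in self._items

    def write(self, data):
        # The ordinary cache-write path of the method.
        self._items.add(data)

def modify_shared(buf, old_data, new_data):
    # The original stays untouched (until a replacement algorithm evicts it);
    # the modified value is cached as a distinct new item.
    buf.write(new_data)
```

A modified value and its original thus coexist as two separate cached items, which is how the method avoids in-place updates of shared buffers.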
The above description of the invention is merely exemplary and concentrates on the essential features related to the technical problem to be solved by the invention. Other details of the invention that those skilled in the art can readily conceive are not elaborated here. For example, when the free storage space of a dedicated buffer memory or of the general buffer memory is insufficient to store the data to be cached, previously stored cached data must be replaced according to a replacement algorithm, which is not detailed here.
It should be appreciated that the embodiments above are detailed descriptions of specific implementations, but the invention is not limited to these embodiments; various improvements and modifications can be made without departing from the spirit and scope of the invention. For example, when weight information indicates that the weight of the data to be cached is a low, middle or high weight, the inventive method of caching data in the buffer memories can be further refined without departing from the spirit and scope of the invention.

Claims (5)

1. A method of buffering data in a multi-core processor, wherein the multi-core processor comprises a plurality of processor cores, a plurality of dedicated buffer memories respectively coupled one-to-one with the plurality of processor cores, and a general buffer memory coupled to each of the plurality of processor cores, the method comprising:
receiving an instruction to execute a plurality of threads concurrently;
independently assigning each of the plurality of threads to the plurality of processor cores, wherein each of the plurality of processor cores is assigned at most one thread;
for each processor core that has been assigned a thread, in response to a cache request during execution of the thread, storing the data to be cached in the dedicated buffer memory coupled to that core; and
when dedicated buffer memories whose number is not less than a threshold t all store the same cached data, storing the same cached data in the general buffer memory.
2. The method according to claim 1, wherein after the same cached data is stored in the general buffer memory, the same cached data is cleared from the dedicated buffer memories that store it, and the storage space the same cached data occupied in those dedicated buffer memories is released.
3. The method according to claim 1, wherein when any of the plurality of processor cores needs to read cached data, the cached data is read from the general buffer memory or a dedicated buffer memory by querying a cache mapping table.
4. according to the method for claim 1-3, wherein t=s or
Figure FDA00003527760500011
Or t=2, wherein s is the total amount that is in the processor core under the state of activation, wherein
Figure FDA00003527760500012
Expression rounds up.
5. the method for multi-buffer data in polycaryon processor, described polycaryon processor comprises 2 nIndividual processor core, wherein, and n+1 level buffer memory, wherein m level buffer memory comprises 2 N+1-mIndividual memory buffer; Wherein, i memory buffer of the 1st grade of buffer memory only is used for the data cached of the performed required buffering of thread of i processor core of storage; J memory buffer of s level buffer memory only is used for the 2nd of storage s-1 level buffer memory jThe-1 and the 2nd jWhat have in the individual memory buffer is data cached, and wherein, n is the integer greater than 1,1<=m<=n+1,2<=s<=n+1,1<=i<=2 n, 1<=j<=2 N+1-mDescribed method comprises:
Instruction in response to a plurality of threads of the same process of concurrent execution, each thread is distributed to different idle processor cores, activate the idle processor core that has been assigned with thread simultaneously, make the processor core that has been assigned with thread become busy condition from idle condition;
After i processor core is activated, in response to buffer memory instruction, at first check whether treat data cached the of the p level buffer memory that has been stored in
Figure FDA00003527760500021
In the individual memory buffer, if exist, send the affirmation message of indication buffer memory success, wherein 1<=i<=2 n, 2<=p<=n+1,
Figure FDA00003527760500029
Expression rounds operation, k=2 downwards P-1
If there is no, then will treat in data cached i the memory buffer that stores the 1st grade of buffer memory into; Judge step by step since the 1st grade of buffer memory to the n level buffer memory then, if t level buffer memory
Figure FDA00003527760500022
* 2-1 and
Figure FDA00003527760500023
* all store the described buffered data for the treatment of in 2 memory buffer, then the described buffered data for the treatment of is dumped to of t+1 level buffer memory
Figure FDA00003527760500024
In the individual memory buffer, and with of t level buffer memory * 2-1 and
Figure FDA00003527760500026
That * stores in 2 memory buffer treats data cached removing, discharges of t level buffer memory simultaneously
Figure FDA00003527760500027
* 2-1 and * described in 2 memory buffer treated data cached shared storage space, wherein
Figure FDA000035277605000210
1<=t<=n,
Figure FDA000035277605000211
Expression rounds up;
Send the affirmation message of indication buffer memory success.
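Claims 1-3 together describe a threshold-based promotion from dedicated buffers to the general buffer, which can be sketched as follows. This is a hedged illustration under assumed names (MultiCoreCache, cache, read); it models buffer memories as Python sets and is not the claimed hardware mechanism.

```python
# Sketch of the threshold mechanism of claims 1-3 (assumed names): each core
# writes to its own dedicated buffer; once at least `t` dedicated buffers hold
# the same data, it moves to the general buffer and the copies are cleared.
class MultiCoreCache:
    def __init__(self, num_cores, t):
        self.dedicated = [set() for _ in range(num_cores)]  # one per core
        self.general = set()
        self.t = t

    def cache(self, core, data):
        self.dedicated[core].add(data)
        holders = [d for d in self.dedicated if data in d]
        if len(holders) >= self.t:
            self.general.add(data)      # claim 1: promote to general buffer
            for d in holders:           # claim 2: clear copies, release space
                d.discard(data)

    def read(self, core, data):
        # Claim 3: consult the general buffer or the core's dedicated buffer.
        return data in self.general or data in self.dedicated[core]
```

With four cores and t = 2, a value cached by two cores becomes readable by every core from the general buffer, which is how the sharing (and false-sharing avoidance) described in the abstract would play out in this toy model.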
CN201310301037.3A 2013-07-18 2013-07-18 Data buffering method in multi-core processor Active CN103345451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310301037.3A CN103345451B (en) 2013-07-18 2013-07-18 Data buffering method in multi-core processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310301037.3A CN103345451B (en) 2013-07-18 2013-07-18 Data buffering method in multi-core processor

Publications (2)

Publication Number Publication Date
CN103345451A true CN103345451A (en) 2013-10-09
CN103345451B CN103345451B (en) 2015-05-13

Family

ID=49280249

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310301037.3A Active CN103345451B (en) 2013-07-18 2013-07-18 Data buffering method in multi-core processor

Country Status (1)

Country Link
CN (1) CN103345451B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1517872A (en) * 2003-01-16 2004-08-04 国际商业机器公司 Method and device for dynamic allocation of computer resource
CN101088076A (en) * 2004-12-27 2007-12-12 英特尔公司 Predictive early write-back of owned cache blocks in a shared memory computer system
CN103198025A (en) * 2012-01-04 2013-07-10 国际商业机器公司 Method and system form near neighbor data cache sharing


Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106598724A (en) * 2015-10-14 2017-04-26 三星电子株式会社 Computing system memory management method
CN106598724B (en) * 2015-10-14 2022-01-14 三星电子株式会社 Method for managing memory in a computing system
CN105740170A (en) * 2016-01-22 2016-07-06 浪潮(北京)电子信息产业有限公司 Cache dirty page flashing method and apparatus
CN107025064B (en) * 2016-01-30 2019-12-03 北京忆恒创源科技有限公司 A kind of data access method of the high IOPS of low latency
CN107025064A (en) * 2016-01-30 2017-08-08 北京忆恒创源科技有限公司 A kind of high IOPS of low latency data access method
CN108121597A (en) * 2016-11-29 2018-06-05 迈普通信技术股份有限公司 A kind of big data guiding device and method
CN107729057B (en) * 2017-06-28 2020-09-22 西安微电子技术研究所 Data block multi-buffer pipeline processing method under multi-core DSP
CN107729057A (en) * 2017-06-28 2018-02-23 西安微电子技术研究所 Flow processing method being buffered a kind of data block under multi-core DSP more
CN108132834A (en) * 2017-12-08 2018-06-08 西安交通大学 Method for allocating tasks and system under multi-level sharing cache memory framework
CN108132834B (en) * 2017-12-08 2020-08-18 西安交通大学 Task allocation method and system under multi-level shared cache architecture
CN110609807A (en) * 2018-06-15 2019-12-24 伊姆西Ip控股有限责任公司 Method, apparatus, and computer-readable storage medium for deleting snapshot data
CN110609807B (en) * 2018-06-15 2023-06-23 伊姆西Ip控股有限责任公司 Method, apparatus and computer readable storage medium for deleting snapshot data
CN109117291A (en) * 2018-08-27 2019-01-01 惠州Tcl移动通信有限公司 Data dispatch processing method, device and computer equipment based on multi-core processor
CN110096455A (en) * 2019-04-26 2019-08-06 海光信息技术有限公司 The exclusive initial method and relevant apparatus of spatial cache
CN111858046A (en) * 2020-07-13 2020-10-30 海尔优家智能科技(北京)有限公司 Service request processing method and device, storage medium and electronic device
CN111858046B (en) * 2020-07-13 2024-05-24 海尔优家智能科技(北京)有限公司 Service request processing method and device, storage medium and electronic device
CN114138685A (en) * 2021-12-06 2022-03-04 海光信息技术股份有限公司 Cache resource allocation method and device, electronic device and storage medium

Also Published As

Publication number Publication date
CN103345451B (en) 2015-05-13

Similar Documents

Publication Publication Date Title
CN103345451B (en) Data buffering method in multi-core processor
US10817201B2 (en) Multi-level memory with direct access
Aila et al. Architecture considerations for tracing incoherent rays
US7941591B2 (en) Flash DIMM in a standalone cache appliance system and methodology
US20150127691A1 (en) Efficient implementations for mapreduce systems
CN104090847B (en) Address distribution method of solid-state storage device
CN100421088C (en) Digital data processing device and method for managing cache data
CN104699631A (en) Storage device and fetching method for multilayered cooperation and sharing in GPDSP (General-Purpose Digital Signal Processor)
US11093410B2 (en) Cache management method, storage system and computer program product
CN105183662A (en) Cache consistency protocol-free distributed sharing on-chip storage framework
JP2009205335A (en) Storage system using two kinds of memory devices for cache and method for controlling the storage system
CN110188108A (en) Date storage method, device, system, computer equipment and storage medium
WO2024045585A1 (en) Method for dynamically sharing storage space in parallel processor, and corresponding processor
CN103345368B (en) Data caching method in buffer storage
US8914571B2 (en) Scheduler for memory
US10042773B2 (en) Advance cache allocator
CN107463510A (en) It is a kind of towards high performance heterogeneous polynuclear cache sharing amortization management method
CN110187832A (en) A kind of method, apparatus and system of data manipulation
CN109359063A (en) Caching replacement method, storage equipment and storage medium towards storage system software
US11157212B2 (en) Virtual controller memory buffer
US20200341764A1 (en) Scatter Gather Using Key-Value Store
EP4372563A1 (en) Systems, methods, and apparatus for operating computational devices
CN116775560B (en) Write distribution method, cache system, system on chip, electronic component and electronic equipment
CN104484136B (en) A kind of method of sustainable high concurrent internal storage data
CN104182281A (en) Method for implementing register caches of GPGPU (general purpose graphics processing units)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C53 Correction of patent of invention or patent application
CB03 Change of inventor or designer information

Inventor after: Mao Li

Inventor after: Rong Qiang

Inventor before: Mao Li

COR Change of bibliographic data

Free format text: CORRECT: INVENTOR; FROM: MAO LI TO: MAO LI RONG QIANG

C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20151223

Address after: 610000, No. 2, unit 1, 57 community street, prosperous town, Tianfu New District, Sichuan, Chengdu, 8

Patentee after: Sichuan thousands of lines you and I Technology Co., Ltd.

Address before: 610041 A, building, No. two, Science Park, high tech Zone, Sichuan, Chengdu, China 103B

Patentee before: Sichuan Jiucheng Information Technology Co., Ltd.

CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 610000, No. 2, unit 1, 57 community street, prosperous town, Tianfu New District, Sichuan, Chengdu, 8

Patentee after: Sichuan qianhang Technology Co., Ltd

Address before: 610000, No. 2, unit 1, 57 community street, prosperous town, Tianfu New District, Sichuan, Chengdu, 8

Patentee before: Sichuan thousands of lines you and I Technology Co., Ltd.