CN104252392A - Method for accessing data cache and processor - Google Patents

Method for accessing data cache and processor

Info

Publication number
CN104252392A
Authority
CN
China
Prior art keywords
data
data buffer
thread
buffer memory
shared
Prior art date
Legal status
Granted
Application number
CN201310269618.3A
Other languages
Chinese (zh)
Other versions
CN104252392B (en
Inventor
徐远超
范东睿
张浩
叶笑春
Current Assignee
Huawei Technologies Co Ltd
Institute of Computing Technology of CAS
Original Assignee
Huawei Technologies Co Ltd
Institute of Computing Technology of CAS
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd, Institute of Computing Technology of CAS filed Critical Huawei Technologies Co Ltd
Priority to CN201310269618.3A priority Critical patent/CN104252392B/en
Priority to PCT/CN2014/080063 priority patent/WO2014206218A1/en
Publication of CN104252392A publication Critical patent/CN104252392A/en
Application granted granted Critical
Publication of CN104252392B publication Critical patent/CN104252392B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0842Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of the invention provide a method for accessing a data cache, and a processor, and relate to the field of computers. The method and the processor can narrow the range of a data search, reduce access latency, and improve system performance. The data cache of the processor is a level-1 cache comprising a private data cache and a shared data cache; the private data cache comprises multiple private caches and stores the private data of threads, while the shared data cache stores the data shared among the threads. When data in the data cache of the processor are accessed, the data type of the data is determined according to a flag bit appended to the physical address corresponding to the data, the data types comprising private data and shared data; the thread corresponding to the data is determined from the access, and the data cache corresponding to that thread is then accessed according to the thread and the data type, so as to obtain the data from that data cache. The embodiments of the invention are used for distinguishing between, and accessing, the data caches.

Description

Method for accessing a data cache, and processor
Technical field
The present invention relates to the field of computers, and in particular to a method for accessing a data cache, and a processor.
Background art
After processors entered the multi-core era, memory access has been the persistent bottleneck of system performance: the performance of the memory system grows far more slowly than processor performance, and memory-access speed severely limits computation speed. Current multi-core caches typically use a multi-level hierarchy in which the L1 cache is private and the other levels are shared.
A multi-core processor provides greater parallel computing capability and can run multiple program loads simultaneously, but programs running concurrently on cores that share a cache interfere with one another's performance. This is mainly because the data of one program replaces, in the shared cache, data belonging to another: when the replaced data are reused they must be fetched from memory again, which adds memory-access latency and consumes memory bandwidth, lowers resource utilization, and makes program performance hard to predict. The problem is especially pronounced when a memory-intensive streaming application with a low reuse rate runs together with a program that is not memory-intensive but has a high reuse rate.
The cache therefore needs to be managed sensibly. In the prior art, one implementation divides the shared cache into multiple portions, each associated with an entity, normally the smallest scheduling entity of the operating system, such as a thread. This division, however, does not consider the possibility of data being shared between threads: if shared data exist but there is no shared cache, the data shared between threads end up with multiple copies in the private caches, which requires more cache space and requires maintaining cache coherence among the copies. Another implementation partitions the cache by page coloring, but this limits the physical memory space each thread can use. Page coloring achieves good cache isolation between independent processes, but in the multithreaded programs of streaming-data applications the threads share a great deal of data, which makes complete isolation disadvantageous. In other words, page coloring suits cache isolation between processes, and is less suitable for cache isolation between multiple threads within the same process.
Summary of the invention
Embodiments of the invention provide a method for accessing a data cache, and a processor, which can narrow the range of a data search, reduce access latency, and improve system performance.
To achieve the above objective, the embodiments of the invention adopt the following technical solutions:
According to a first aspect, a processor is provided, comprising a program counter, a register file, an instruction prefetch unit, an instruction decode unit, an instruction issue unit, an address generation unit, an arithmetic logic unit, a shared floating-point unit, a shared instruction cache and an internal bus, and further comprising:
a data cache, where the data cache is a level-1 cache comprising a private data cache and a shared data cache; the private data cache comprises multiple private caches, each private cache stores the private data of a thread, and the shared data cache stores the data shared among the threads.
With reference to the first aspect, in a first possible implementation of the first aspect, the processor has a simultaneous multithreading (SMT) architecture, the private caches are in one-to-one correspondence with hardware threads, and all hardware threads share the shared data cache.
According to a second aspect, a method for accessing a data cache is provided, comprising:
when data in the data cache of a processor are accessed, determining the data type of the data according to a flag bit appended to the physical address corresponding to the data, the data types comprising private data and shared data;
determining, from the accessed data, the thread corresponding to the data, and then accessing, according to the thread and the data type, the data cache corresponding to the thread so as to obtain the data, where the data cache is a private data cache or a shared data cache.
With reference to the second aspect, in a first possible implementation of the second aspect, the method further comprises:
if the data are not present in the private data cache, accessing main memory and backfilling the cache line containing the data, obtained from main memory, into the private data cache corresponding to the thread;
if the data are not present in the shared data cache, accessing main memory and backfilling the cache line containing the data, obtained from main memory, into the shared data cache.
With reference to the first possible implementation of the second aspect, in a second possible implementation, determining the data type of the data according to the flag bit appended to the physical address corresponding to the data comprises:
if the flag bit in the physical address is a first flag, determining that the data type of the data is private data;
if the flag bit in the physical address is a second flag, determining that the data type of the data is shared data.
With reference to the second possible implementation of the second aspect, in a third possible implementation, accessing, according to the thread and the data type, the data cache corresponding to the thread comprises:
if the data type is private data, accessing the private data cache corresponding to the thread;
if the data type is shared data, accessing the shared data cache.
With reference to the third possible implementation of the second aspect, in a fourth possible implementation, the private data cache comprises multiple private caches; the private data cache stores the private data of the threads, and the shared data cache stores the data shared among the threads;
where the private caches are in one-to-one correspondence with hardware threads, and all hardware threads share the shared data cache.
Embodiments of the present invention provide a method for accessing a data cache, and a processor. The processor comprises a program counter, a register file, an instruction prefetch unit, an instruction decode unit, an instruction issue unit, an address generation unit, an arithmetic logic unit, a shared floating-point unit, a shared instruction cache and an internal bus, and further comprises a data cache. The data cache is a level-1 cache comprising a private data cache and a shared data cache; the private data cache comprises multiple private caches that store the private data of threads, and the shared data cache stores the data shared among the threads. When data in the data cache of the processor are accessed, the data type of the data is determined according to a flag bit appended to the corresponding physical address, the data types comprising private data and shared data; the thread corresponding to the data is determined from the access, and the data cache corresponding to that thread, either a private data cache or the shared data cache, is accessed according to the thread and the data type so as to obtain the data. This narrows the range of a data search, reduces access latency, and improves system performance.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of a processor according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a cache partition according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a method for accessing a data cache according to an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are merely some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
An embodiment of the present invention provides a processor 01 which, as shown in Fig. 1, comprises a program counter 011, a register file 012, an instruction prefetch unit 013, an instruction decode unit 014, an instruction issue unit 015, an address generation unit 016, an arithmetic logic unit 017, a shared floating-point unit 018, a shared instruction cache 019 and an internal bus, and further comprises:
a data cache 021, where the data cache 021 is a level-1 cache comprising a private data cache 0211 and a shared data cache 0212; the private data cache 0211 comprises multiple private caches 0211a and stores the private data of threads, and the shared data cache 0212 stores the data shared among the threads.
The processor 01 has a simultaneous multithreading (SMT) architecture: the private caches are in one-to-one correspondence with hardware threads, and all hardware threads share the shared data cache 0212. An SMT architecture allows instructions of multiple threads to be issued to the functional units within one clock cycle, improving the utilization of those units. A private cache is used by a single user, whereas a shared cache is used jointly by multiple users.
There are 16 program counters (PCs), PC0 to PC15; within a processor core, the number of logical processor cores (hardware threads) equals the number of PCs.
Each logical processor core in a processor core corresponds to one general register file (GRF), so the GRFs are equal in number to the PCs.
The instruction prefetch unit (Fetch) fetches instructions; the instruction decode unit (Decoder) decodes them; the instruction issue unit (Issue) issues them. The address generation unit (AGU) is the module that performs all address computation and generates the addresses used to access memory. The arithmetic logic unit (ALU) is the execution unit of the central processing unit (CPU) and can be built from AND gates and OR gates. The shared floating-point unit (Shared Float Point Unit) is the circuit unit dedicated to floating-point arithmetic in the processor; the shared instruction cache stores instructions; and the internal bus connects the components of the processor.
The data cache 021 is the level-1 cache (L1 cache) of the processor 01, and this L1 cache comprises the private data cache 0211 and the shared data cache 0212.
The private data cache 0211 comprises multiple independent private caches (D-caches) 0211a that store the private data of each hardware thread, while the shared data cache 0212 stores the data shared among the threads.
The private caches and the shared cache sit at the same level, L1. When the CPU fills data into the cache, the private data of a thread are stored in its private cache, and the data shared among threads are stored in the shared cache.
A person skilled in the art will understand that existing multi-core caches generally use a multi-level hierarchy in which the L1 cache is private and the other levels, such as L2 and L3, are shared. The present invention does not adopt such a multi-level hierarchy; only the L1 cache is retained. In this way each hardware thread has its own private cache, and all hardware threads share the shared cache.
For example, the processor 01 may be a many-core processor in which each processor core has an SMT architecture and the data cache 021 is a component of each core. The hardware implementation of this cache may be as shown in Fig. 2: inside a processor core with an SMT architecture there are multiple hardware threads, each hardware thread corresponds to one private cache 0211a, and all hardware threads share one shared data cache 0212. The private caches and the shared data cache belong to the same level.
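The cache organization described above can be sketched as a data structure. This is a minimal illustration only: the thread count of 16 mirrors the 16 PCs mentioned earlier, but the cache and line sizes are invented for the example and do not come from the patent.

```c
#include <stddef.h>

#define NUM_HW_THREADS    16   /* mirrors the 16 PCs described above */
#define PRIV_CACHE_LINES  64   /* illustrative sizes, not from the patent */
#define SHARED_CACHE_LINES 256
#define LINE_SIZE         64

/* One cache line. */
typedef struct {
    unsigned long tag;
    int valid;
    unsigned char data[LINE_SIZE];
} cache_line_t;

/* Per-thread private data cache (one D-cache 0211a). */
typedef struct {
    cache_line_t lines[PRIV_CACHE_LINES];
} private_dcache_t;

/* The L1 data cache of one SMT core: one private D-cache per hardware
 * thread, plus a single shared data cache at the same level. */
typedef struct {
    private_dcache_t priv[NUM_HW_THREADS];   /* one per hardware thread */
    cache_line_t shared[SHARED_CACHE_LINES]; /* shared by all threads   */
} l1_dcache_t;
```

The key structural point the sketch captures is that the private caches and the shared cache are peers within a single L1 structure, not different levels of a hierarchy.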
Thus, an embodiment of the present invention provides a processor comprising a program counter, a register file, an instruction prefetch unit, an instruction decode unit, an instruction issue unit, an address generation unit, an arithmetic logic unit, a shared floating-point unit, a shared instruction cache and an internal bus, and further comprising a data cache. The data cache is a level-1 cache comprising a private data cache and a shared data cache at the same level; the private data cache comprises multiple private caches that store the private data of threads, and the shared data cache stores the data shared among the threads. In this way the range of a data search can be narrowed, access latency reduced, and system performance improved.
An embodiment of the present invention provides a method for accessing cached data which, as shown in Fig. 3, comprises:
101. When data in the data cache of the processor are accessed, the processor determines the data type of the data according to a flag bit appended to the physical address corresponding to the data, the data types comprising private data and shared data.
Exemplarily, the data type of the data can be determined by modifying the page table entry (PTE) flags in the operating system. With the paging mechanism supported through compilation, thread-private data are stored in page frames exclusive to each thread, and data shared by the threads belonging to one process are stored in page frames shared by those threads.
Specifically, the operating system allocates memory in units of pages and writes the base address of a page frame into a page table entry. To indicate whether the page frame pointed to is a private region or a shared region, a one-bit flag is defined in the reserved bits of the page table entry; this flag distinguishes whether the physical page frame corresponding to the entry is a private region. Exemplarily, the flag is set to 1 for a private region and to 0 for a shared region; the flag values for the private and shared regions are not limited here. As shown in Table 1, taking the page table entry structure for a 4 KB page as an example, a one-bit flag can be defined among bits 9 to 14 to distinguish whether the physical page frame corresponding to the entry is a private region or a shared region.
Table 1: Page table entry structure for a 4 KB page
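The flag described above can be sketched as follows. This is an illustration, not the patent's implementation: the text only says the flag lives somewhere in bits 9 to 14 of the PTE, so bit 9 is an arbitrary choice here, and the helper names are invented. The encoding (1 = private, 0 = shared) follows the example in the text.

```c
#include <stdint.h>

/* The patent reserves one bit among PTE bits 9-14 as share_flag;
 * bit 9 is chosen here purely for illustration.
 * 1 = private region, 0 = shared region, per the example in the text. */
#define PTE_SHARE_FLAG_SHIFT 9
#define PTE_SHARE_FLAG_MASK  (UINT64_C(1) << PTE_SHARE_FLAG_SHIFT)

/* Mark the page frame behind this PTE as thread-private. */
static uint64_t pte_mark_private(uint64_t pte) {
    return pte | PTE_SHARE_FLAG_MASK;
}

/* Mark the page frame behind this PTE as shared between threads. */
static uint64_t pte_mark_shared(uint64_t pte) {
    return pte & ~PTE_SHARE_FLAG_MASK;
}

/* Read the flag back: nonzero means private. */
static int pte_is_private(uint64_t pte) {
    return (pte & PTE_SHARE_FLAG_MASK) != 0;
}
```

On a real OS this bit would have to be one the hardware ignores (software-available), which is why the patent confines it to the PTE's reserved bits.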
The CPU uses a virtual address when accessing cached data. It first looks up the translation lookaside buffer (TLB), which caches virtual-to-physical address translations, to obtain the physical address for the virtual address. If the TLB does not contain the virtual address, the paging process is entered to obtain the physical address together with the share_flag bit stored in the page table entry, and the flag is placed in the physical address held in the TLB entry. If the TLB does contain the virtual address, the physical address and the share_flag are read directly from the TLB entry, and the flag is appended to the physical address. Table 2 shows the composition of the physical address: share_flag, tag, set index, block offset and byte offset.
Table 2: Composition of the physical address

| share_flag | tag | set index | block offset | byte offset |
In this way, by defining a flag in a certain reserved bit of the page table entry and passing that flag to the CPU as an additional bit of the physical address, the CPU can determine from the appended flag the data type of the data to be accessed. If the flag bit in the physical address is a first flag, the data type of the data is determined to be private data; if it is a second flag, the data type is determined to be shared data. For example, the first flag may be 1 and the second flag 0.
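Decomposing such a physical address can be sketched as below. The field widths are assumptions invented for the example — the patent gives the field order in Table 2 but not the widths — and the function names are illustrative.

```c
#include <stdint.h>

/* Illustrative field widths for the Table 2 layout
 * (share_flag | tag | set index | block offset | byte offset);
 * the actual widths are not specified in the patent. */
#define BYTE_OFF_BITS   3
#define BLOCK_OFF_BITS  3
#define SET_IDX_BITS    6
#define TAG_BITS        20
#define SHARE_FLAG_BIT  (BYTE_OFF_BITS + BLOCK_OFF_BITS + SET_IDX_BITS + TAG_BITS)

/* Extract the appended share_flag: 1 = private data, 0 = shared data. */
static int paddr_share_flag(uint64_t paddr) {
    return (int)((paddr >> SHARE_FLAG_BIT) & 1);
}

/* Extract the set index used to pick a cache set. */
static uint64_t paddr_set_index(uint64_t paddr) {
    return (paddr >> (BYTE_OFF_BITS + BLOCK_OFF_BITS)) &
           ((1u << SET_IDX_BITS) - 1);
}

/* Extract the tag compared against the stored line tags. */
static uint64_t paddr_tag(uint64_t paddr) {
    return (paddr >> (BYTE_OFF_BITS + BLOCK_OFF_BITS + SET_IDX_BITS)) &
           ((UINT64_C(1) << TAG_BITS) - 1);
}
```

Because the flag sits above the tag, the normal tag/index comparison logic is untouched; the flag only steers which cache structure is probed.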
102. The processor determines, from the accessed data, the thread corresponding to the data, and then accesses, according to the thread and the data type, the cache corresponding to the thread so as to obtain the data, the cache being a private cache or the shared cache.
Specifically, after determining the data type of the data to be accessed, the CPU can determine from the access which hardware thread initiated it, and then determine the cache region to access from that hardware thread and the data type: if share_flag is 1, the data type is private data and the private data cache corresponding to the hardware thread is accessed; if share_flag is 0, the data type is shared data and the shared data cache in the data cache is accessed, so as to obtain the data. Accesses to the private data cache and the shared data cache are completed by hardware.
Each hardware thread corresponds to one private data cache, all hardware threads share one shared data cache, and the private data cache and the shared data cache belong to the same level, the L1 cache.
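The routing decision just described depends only on two inputs, the requesting hardware thread and the share_flag. A minimal sketch, with invented type and function names:

```c
/* Which kind of L1 data cache to probe, per the scheme above. */
typedef enum { PRIVATE_DCACHE, SHARED_DCACHE } dcache_kind_t;

typedef struct {
    dcache_kind_t kind;
    int thread_id;   /* meaningful only for PRIVATE_DCACHE */
} dcache_sel_t;

/* share_flag = 1 -> the requesting thread's own private D-cache;
 * share_flag = 0 -> the single shared data cache. */
static dcache_sel_t select_dcache(int hw_thread, int share_flag) {
    dcache_sel_t sel;
    if (share_flag) {
        sel.kind = PRIVATE_DCACHE;
        sel.thread_id = hw_thread;
    } else {
        sel.kind = SHARED_DCACHE;
        sel.thread_id = -1;   /* shared cache belongs to no one thread */
    }
    return sel;
}
```

This is the source of the speedup claimed later: the search is confined to one small private cache or the one shared cache, never both.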
In addition, if the access to the private data cache corresponding to the hardware thread misses, i.e. the requested data are not in the private data cache, the CPU accesses the main memory in the storage system, obtains from it the cache line containing the requested data, and backfills that line into the private data cache of the hardware thread. If the access to the shared data cache misses, i.e. the requested data are not in the shared data cache, the CPU accesses main memory, obtains the cache line containing the requested data, and backfills it into the shared data cache used by all hardware threads. When a cache line is backfilled into a data cache that is full, the least recently used (LRU) line can be replaced by the backfilled line; if the line is not yet in the cache and space is available, the line is backfilled directly. The replacement policy used during backfill is the same as in the prior art and is not repeated here.
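The backfill-with-LRU-eviction step can be sketched as follows. This is a simplified, fully associative model with invented names and sizes, using an age counter as the recency metric; a real cache would track recency per set in hardware.

```c
#include <stdint.h>

#define NLINES 4   /* tiny cache, for illustration only */

typedef struct {
    uint64_t tag;
    int valid;
    unsigned age;   /* higher = used longer ago */
} line_t;

/* Backfill `tag` into the cache: reuse an invalid slot if one exists,
 * otherwise evict the least recently used line. Returns the slot used. */
static int backfill(line_t cache[NLINES], uint64_t tag) {
    int victim = 0;
    for (int i = 0; i < NLINES; i++) {
        if (!cache[i].valid) { victim = i; break; }  /* free slot wins */
        if (cache[i].age > cache[victim].age) victim = i; /* track LRU */
    }
    for (int i = 0; i < NLINES; i++)
        cache[i].age++;          /* every line grows one tick older */
    cache[victim].tag = tag;
    cache[victim].valid = 1;
    cache[victim].age = 0;       /* the backfilled line is newest */
    return victim;
}
```

The same routine serves both caches in the scheme above; only the target structure (the thread's private cache or the shared cache) differs.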
In this way, because the threads of high-throughput application programs are highly similar and their degree of data sharing is low, repartitioning the data cache stores each thread's private data separately in its own private cache, free of any interference, while shared data are stored in the shared data cache. When the CPU looks up data in the cache, the flag bit of the physical address directly indicates whether the target of the search is the private data cache or the shared data cache, which narrows the search range, reduces access latency, and improves system performance.
Therefore, an embodiment of the present invention provides a method for accessing a data cache: when data in the data cache of a processor are accessed, the data type of the data is determined from the flag bit in the corresponding physical address, the data types comprising private data and shared data; the thread corresponding to the data is determined from the access, and the data cache corresponding to that thread, either a private data cache or the shared data cache, is accessed according to the thread and the data type so as to obtain the data. This narrows the range of a data search, reduces access latency, and improves system performance.
In the several embodiments provided in this application, it should be understood that the disclosed processor and method may be implemented in other ways. For example, the described apparatus embodiments are merely schematic: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. Furthermore, the mutual couplings, direct couplings or communication connections shown or discussed may be implemented through interfaces, and the indirect couplings or communication connections between devices or units may be electrical, mechanical or of another form.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may exist separately as physical units, or two or more units may be integrated into one unit. The units may be implemented in the form of hardware, or in the form of hardware plus software functional units.
All or part of the steps of the method embodiments above may be carried out by program instructions controlling the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The foregoing is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (7)

1. A processor, comprising a program counter, a register file, an instruction prefetch unit, an instruction decode unit, an instruction issue unit, an address generation unit, an arithmetic logic unit, a shared floating-point unit, a shared instruction cache and an internal bus, characterized by further comprising:
a data cache, where the data cache is a level-1 cache comprising a private data cache and a shared data cache; the private data cache comprises multiple private caches and stores the private data of threads, and the shared data cache stores the data shared among the threads.
2. The processor according to claim 1, characterized in that the processor has a simultaneous multithreading architecture, the private caches are in one-to-one correspondence with hardware threads, and all hardware threads share the shared data cache.
3. A method for accessing a data cache, characterized by comprising:
when data in the data cache of a processor are accessed, determining the data type of the data according to a flag bit appended to the physical address corresponding to the data, the data types comprising private data and shared data;
determining, from the accessed data, the thread corresponding to the data, and then accessing, according to the thread and the data type, the data cache corresponding to the thread so as to obtain the data, where the data cache is a private data cache or a shared data cache.
4. The method according to claim 3, characterized in that the method further comprises:
if the data are not present in the private data cache, accessing main memory and backfilling the cache line containing the data, obtained from main memory, into the private data cache corresponding to the thread;
if the data are not present in the shared data cache, accessing main memory and backfilling the cache line containing the data, obtained from main memory, into the shared data cache.
5. The method according to claim 4, characterized in that determining the data type of the data according to the flag bit appended to the physical address corresponding to the data comprises:
if the flag bit in the physical address is a first flag, determining that the data type of the data is private data;
if the flag bit in the physical address is a second flag, determining that the data type of the data is shared data.
6. The method according to claim 5, characterized in that accessing, according to the thread and the data type, the data cache corresponding to the thread comprises:
if the data type is private data, accessing the private data cache corresponding to the thread;
if the data type is shared data, accessing the shared data cache.
7. The method according to claim 6, characterized in that the private data cache comprises multiple private caches, each private cache stores the private data of a thread, and the shared data cache stores the data shared among the threads;
where the private caches are in one-to-one correspondence with hardware threads, and all hardware threads share the shared data cache.
CN201310269618.3A 2013-06-28 2013-06-28 A kind of method and processor accessing data buffer storage Active CN104252392B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201310269618.3A CN104252392B (en) 2013-06-28 2013-06-28 A kind of method and processor accessing data buffer storage
PCT/CN2014/080063 WO2014206218A1 (en) 2013-06-28 2014-06-17 Method and processor for accessing data cache

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310269618.3A CN104252392B (en) 2013-06-28 2013-06-28 A kind of method and processor accessing data buffer storage

Publications (2)

Publication Number Publication Date
CN104252392A true CN104252392A (en) 2014-12-31
CN104252392B CN104252392B (en) 2019-06-18

Family

ID=52141029

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310269618.3A Active CN104252392B (en) 2013-06-28 2013-06-28 Method for accessing data cache and processor

Country Status (2)

Country Link
CN (1) CN104252392B (en)
WO (1) WO2014206218A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5895487A (en) * 1996-11-13 1999-04-20 International Business Machines Corporation Integrated processing and L2 DRAM cache
CN101510191A (en) * 2009-03-26 2009-08-19 浙江大学 Multi-core system structure with buffer window and implementing method thereof
CN101571843A (en) * 2008-04-29 2009-11-04 国际商业机器公司 Method, apparatus and system for dynamically sharing a cache in a multi-core processor
CN102270180A (en) * 2011-08-09 2011-12-07 清华大学 Multicore processor cache and management method thereof
CN103092788A (en) * 2012-12-24 2013-05-08 华为技术有限公司 Multi-core processor and data access method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7827357B2 (en) * 2007-07-31 2010-11-02 Intel Corporation Providing an inclusive shared cache among multiple core-cache clusters


Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106815174A (en) * 2015-11-30 2017-06-09 大唐移动通信设备有限公司 Data access control method and node controller
CN106815174B (en) * 2015-11-30 2019-07-30 大唐移动通信设备有限公司 Data access control method and Node Controller
CN105677581A (en) * 2016-01-05 2016-06-15 上海斐讯数据通信技术有限公司 Internal storage access device and method
CN105743803A (en) * 2016-01-21 2016-07-06 华为技术有限公司 Data processing device for shared caches
CN105743803B (en) * 2016-01-21 2019-01-25 华为技术有限公司 Data processing device for a shared cache
CN107037260A (en) * 2016-11-24 2017-08-11 国网河南省电力公司周口供电公司 Telecommunication microgrid electric energy meter
CN107943743A (en) * 2017-11-17 2018-04-20 江苏微锐超算科技有限公司 Information storage and reading method for a computing device, and shared virtual medium carrier chip
CN109840410A (en) * 2017-12-28 2019-06-04 中国科学院计算技术研究所 Method and system for isolating and protecting data in a process
CN109840410B (en) * 2017-12-28 2021-09-21 中国科学院计算技术研究所 Method and system for isolating and protecting data in process
CN110865968A (en) * 2019-04-17 2020-03-06 成都海光集成电路设计有限公司 Multi-core processing device and data transmission method between cores thereof
CN110865968B (en) * 2019-04-17 2022-05-17 成都海光集成电路设计有限公司 Multi-core processing device and data transmission method between cores thereof
CN110046053A (en) * 2019-04-19 2019-07-23 上海兆芯集成电路有限公司 Processing system for dispatching tasks and memory access method thereof
US11294716B2 (en) 2019-04-19 2022-04-05 Shanghai Zhaoxin Semiconductor Co., Ltd. Processing system for managing process and its acceleration method
US10929187B2 (en) 2019-04-19 2021-02-23 Shanghai Zhaoxin Semiconductor Co., Ltd. Processing system and heterogeneous processor acceleration method
CN110083388A (en) * 2019-04-19 2019-08-02 上海兆芯集成电路有限公司 Processing system for scheduling and access method thereof
CN110083388B (en) * 2019-04-19 2021-11-12 上海兆芯集成电路有限公司 Processing system for scheduling and access method thereof
US11216304B2 (en) 2019-04-19 2022-01-04 Shanghai Zhaoxin Semiconductor Co., Ltd. Processing system for scheduling and distributing tasks and its acceleration method
CN110083387A (en) * 2019-04-19 2019-08-02 上海兆芯集成电路有限公司 Processing system using a polling mechanism and its memory access method
US11301297B2 (en) 2019-04-19 2022-04-12 Shanghai Zhaoxin Semiconductor Co., Ltd. Processing system for dispatching tasks and memory access method thereof
US11256633B2 (en) 2019-04-19 2022-02-22 Shanghai Zhaoxin Semiconductor Co., Ltd. Processing system with round-robin mechanism and its memory access method
CN112199217A (en) * 2020-10-23 2021-01-08 无锡江南计算技术研究所 Software and hardware cooperative thread private data access optimization method
CN112199217B (en) * 2020-10-23 2022-07-12 无锡江南计算技术研究所 Software and hardware cooperative thread private data access optimization method
WO2022199357A1 (en) * 2021-03-23 2022-09-29 北京灵汐科技有限公司 Data processing method and apparatus, electronic device, and computer-readable storage medium
CN114035847B (en) * 2021-11-08 2023-08-29 海飞科(南京)信息技术有限公司 Method and apparatus for parallel execution of kernel programs
CN114035847A (en) * 2021-11-08 2022-02-11 海飞科(南京)信息技术有限公司 Method and apparatus for parallel execution of kernel programs
CN114036084A (en) * 2021-11-17 2022-02-11 海光信息技术股份有限公司 Data access method, shared cache, chip system and electronic equipment
CN114265812A (en) * 2021-11-29 2022-04-01 山东云海国创云计算装备产业创新中心有限公司 Method, device, equipment and medium for reducing access delay of RISC-V vector processor
CN114265812B (en) * 2021-11-29 2024-02-02 山东云海国创云计算装备产业创新中心有限公司 Method, device, equipment and medium for reducing access delay of RISC-V vector processor
CN114217861A (en) * 2021-12-06 2022-03-22 海光信息技术股份有限公司 Data processing method and device, electronic device and storage medium
CN114327777A (en) * 2021-12-30 2022-04-12 元心信息科技集团有限公司 Method and device for determining global page directory, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN104252392B (en) 2019-06-18
WO2014206218A1 (en) 2014-12-31

Similar Documents

Publication Publication Date Title
CN104252392A (en) Method for accessing data cache and processor
US11645135B2 (en) Hardware apparatuses and methods for memory corruption detection
US9921972B2 (en) Method and apparatus for implementing a heterogeneous memory subsystem
US10802987B2 (en) Computer processor employing cache memory storing backless cache lines
US8732711B2 (en) Two-level scheduler for multi-threaded processing
US8560781B2 (en) Technique for using memory attributes
EP2831749B1 (en) Hardware profiling mechanism to enable page level automatic binary translation
US6513107B1 (en) Vector transfer system generating address error exception when vector to be transferred does not start and end on same memory page
WO2017172354A1 (en) Hardware apparatuses and methods for memory performance monitoring
EP2542973A1 (en) Gpu support for garbage collection
JP2007293839A (en) Method for managing replacement of sets in locked cache, computer program, caching system and processor
US6553486B1 (en) Context switching for vector transfer unit
US20140189192A1 (en) Apparatus and method for a multiple page size translation lookaside buffer (tlb)
US11531562B2 (en) Systems, methods, and apparatuses for resource monitoring
CN112148641A (en) System and method for tracking physical address accesses by a CPU or device
CN112948285A (en) Priority-based cache line eviction algorithm for flexible cache allocation techniques
US10013352B2 (en) Partner-aware virtual microsectoring for sectored cache architectures
US6625720B1 (en) System for posting vector synchronization instructions to vector instruction queue to separate vector instructions from different application programs
EP4020233B1 (en) Automated translation lookaside buffer set rebalancing
Gupta et al. A comparative study of cache optimization techniques and cache mapping techniques
CN115934584A (en) Memory access tracker in device private memory

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant