WO2018036364A1 - TLB device supporting multiple data streams and method for updating a TLB module - Google Patents
TLB device supporting multiple data streams and method for updating a TLB module
- Publication number
- WO2018036364A1 (PCT/CN2017/095845)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- page
- tlb
- address
- module
- frame
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/10—Address translation
- G06F12/1027—Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0844—Multiple simultaneous or quasi-simultaneous cache accessing
- G06F12/0846—Cache with multiple tag or data arrays being simultaneously accessible
- G06F12/0848—Partitioned cache, e.g. separate instruction and operand caches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/10—Address translation
- G06F12/1009—Address translation using page tables, e.g. page table structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/10—Address translation
- G06F12/1027—Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
- G06F12/1036—Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] for multiple virtual address spaces, e.g. segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/10—Providing a specific technical effect
- G06F2212/1016—Performance improvement
- G06F2212/1021—Hit rate improvement
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/65—Details of virtual memory and virtual address translation
- G06F2212/654—Look-ahead translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/65—Details of virtual memory and virtual address translation
- G06F2212/657—Virtual address space management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/68—Details of translation look-aside buffer [TLB]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/68—Details of translation look-aside buffer [TLB]
- G06F2212/684—TLB miss handling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2212/00—Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
- G06F2212/72—Details relating to flash memory management
- G06F2212/7201—Logical to physical mapping or translation of blocks or pages
Definitions
- the present disclosure relates to the field of artificial intelligence, and more particularly to a TLB device supporting multiple data streams and a method for updating a TLB module.
- the page table of the MMU, that is, the translation table between logical addresses and physical addresses, is stored in memory. Because translating a logical address into a physical address requires multiple accesses to memory, the performance of data access is greatly reduced; this is why the Translation Lookaside Buffer (TLB) module appeared.
- TLB: Translation Lookaside Buffer
- the TLB module stores a subset of the page entries in the page table. When the data processing device issues a logical address, the MMU first accesses the TLB module.
- if the TLB module contains a page entry that can translate the logical address, i.e. a TLB hit, that entry is used directly for address translation; otherwise the case is called a TLB miss.
- on a TLB miss, the MMU accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, this entry is updated into the TLB module.
- an object of the present disclosure is to provide a TLB device supporting multiple data streams, and another object is to provide a method for updating a TLB module, so as to remedy the shortcomings of conventional TLB modules in streaming applications.
- the present disclosure provides a TLB device supporting multiple data streams, including:
- a control unit that, corresponding to the k data streams of the streaming application to be processed, sets up k TLB modules; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number.
- the present disclosure further provides a method for updating a TLB module, including the following steps:
- corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number;
- when a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it updates this entry into the corresponding TLB module, replacing one page entry in that TLB.
- the present disclosure also provides a method for updating a TLB page, including the following steps:
- corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number;
- each time a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it replaces all original page entries in the TLB module with that entry and the consecutive entries that follow it.
- the present disclosure also provides a further method for updating a TLB page, including the following steps:
- corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping;
- k is a natural number;
- the page entries stored in each of the k TLB modules are consecutive, i.e. the logical addresses represented by the stored entries are contiguous;
- let the entries currently stored in the TLB be <P1, F1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, where <P, F> denotes the mapping between page number P and page frame number F, and the page numbers P1, P2, P3, ..., Pn correspond to a contiguous range of logical addresses;
- after each TLB hit, the entry whose address translation has just completed is replaced, i.e. <P1, F1> is replaced, and the replacing entry is the next consecutive entry after those currently stored, i.e. <Pn+1, Fn+1>; after replacement the TLB stores <Pn+1, Fn+1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, which are still consecutive.
- the device and methods of the present disclosure have the following beneficial effects: because different TLB modules serve different data streams, each TLB module can store page entries with contiguous logical addresses, which benefits subsequent update operations. Tailored to the characteristics of streaming applications, TLB modules with contiguous logical addresses and the update methods above can greatly reduce the occurrence of TLB misses, i.e. reduce how often the MMU must search the page table in memory, greatly improving the performance of data access.
- FIG. 1 is an example block diagram of an overall flow of an apparatus in accordance with an embodiment of the present disclosure
- FIG. 2 is a block diagram showing an example of a page table structure of a memory management unit (MMU) according to an embodiment of the present disclosure
- FIG. 3 is a diagram showing a method of representing a logical address according to an embodiment of the present disclosure
- FIGS. 4 through 6 are example block diagrams of a method of updating a TLB module according to an embodiment of the present disclosure.
- the present disclosure discloses a TLB device supporting multiple data streams, including:
- a control unit that, corresponding to the k data streams of the streaming application to be processed, sets up k TLB modules; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number.
- a stream ID field is included in the logical address; the number of bits occupied by the stream ID field is ⌈log2(k)⌉, where k is the number of TLB modules and ⌈ ⌉ denotes the round-up (ceiling) operation.
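The stream ID width can be checked with a short sketch; this is an illustration of the ⌈log2(k)⌉ rule above, not code from the patent:

```python
import math

def stream_id_bits(k: int) -> int:
    """Bits needed in the stream ID field to distinguish k data streams: ceil(log2(k))."""
    return math.ceil(math.log2(k))

# For the four neural network streams (weights, inputs, outputs, partial sums):
# stream_id_bits(4) == 2
```

Four streams need 2 bits, five streams already need 3, and eight streams still fit in 3.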
- for an artificial neural network, the control unit sets up four TLB modules corresponding to the four data streams of weights, inputs, outputs, and partial sums.
- the present disclosure also discloses several methods for updating TLB modules, in which:
- corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number;
- Method 1 further includes the following step:
- when a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it updates this entry into the corresponding TLB module, replacing one page entry in that TLB.
- the step of replacing a page entry may use a random replacement algorithm or the LRU algorithm.
- Method 2 further includes the following step:
- each time a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it replaces all original page entries in the TLB module with that entry and the consecutive entries that follow it.
- Method 3 further includes the following steps:
- the page entries stored in each of the k TLB modules are set to be consecutive, i.e. the logical addresses represented by the stored entries are contiguous; let the entries currently stored in the TLB be <P1, F1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, where <P, F> denotes the mapping between page number P and page frame number F, and the page numbers P1, P2, P3, ..., Pn correspond to a contiguous range of logical addresses;
- after each TLB hit, the entry whose address translation has just completed is replaced, i.e. <P1, F1> is replaced, and the replacing entry is the next consecutive entry after those currently stored, i.e. <Pn+1, Fn+1>;
- after replacement the TLB stores <Pn+1, Fn+1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, which are still consecutive.
- a TLB device supporting multiple data streams may be used in applications of artificial neural networks.
- in an artificial neural network there are generally four data streams, namely weights, inputs, outputs, and partial sums; other data streams can also be configured according to the actual application.
- the TLB modules in the example block diagram can therefore be set to four, each TLB module corresponding to one data stream.
- when the data processing device issues a logical address, the corresponding TLB is selected according to the stream ID part of the logical address and searched for a page entry matching the address; if such an entry exists, it is a TLB hit and address translation completes, with an update operation performed if that TLB needs updating. If no matching entry exists, it is a TLB miss;
- the corresponding page entry is then looked up in the page table in memory, and the update operation is performed on that TLB.
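The lookup flow above can be sketched as follows; this is a minimal illustration in which the dict-based TLBs, function name, and field widths are assumptions rather than details from the patent:

```python
PAGE_BITS = 12  # 4 kB pages, as in the example block diagram

def lookup(logical_addr: int, stream_id: int, tlbs: list, page_table: dict) -> int:
    """Translate a logical address using the per-stream TLB selected by stream_id."""
    page = logical_addr >> PAGE_BITS
    offset = logical_addr & ((1 << PAGE_BITS) - 1)
    tlb = tlbs[stream_id]        # the stream ID field selects one of the k TLB modules
    if page not in tlb:          # TLB miss: walk the in-memory page table, update the TLB
        tlb[page] = page_table[page]
    frame = tlb[page]            # TLB hit (or freshly filled entry)
    return (frame << PAGE_BITS) | offset
```

With the mapping of the later worked example, `lookup(8640, 0, [{}, {}], {2: 6})` yields physical address 25024 and leaves the entry <2, 6> cached in stream 0's TLB, so a repeated access no longer consults the page table.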
- FIG. 2 is an example block diagram of the page table structure (TLB module structure) of a memory management unit (MMU) according to an embodiment of the present disclosure.
- under paged memory management, the logical address space is divided into a series of equal-sized parts, called pages, and the pages are numbered in turn.
- the page size in the example block diagram is 4kB; other sizes may be used according to specific needs.
- each page number stands for a 4kB page of the address space; it does not mean the entry itself occupies 4kB of space. For example, page number 0 represents the logical address range of 0-4kB.
- physical memory is likewise divided into contiguous parts of the same size, called page frames, and each page frame is numbered.
- the page table stores the mapping between page numbers and page frame numbers. For example, logical address 8640 falls in the range [8kB, 12kB), so its page number is 2; according to the mapping in the example block diagram, it maps to the physical address range of page frame number 6, and the final physical address after translation is determined from the offset address.
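The arithmetic of that worked example can be verified with a short sketch (the dict page table is an illustrative stand-in for the structure in FIG. 2):

```python
PAGE_SIZE = 4096  # 4 kB pages

def translate(logical_addr: int, page_table: dict) -> int:
    """Page-table translation: split the address into page number and offset,
    map the page number to a frame number, and rebuild the physical address."""
    page = logical_addr // PAGE_SIZE    # 8640 // 4096 == 2
    offset = logical_addr % PAGE_SIZE   # 8640 %  4096 == 448
    frame = page_table[page]            # page 2 -> frame 6 in the example
    return frame * PAGE_SIZE + offset

# translate(8640, {2: 6}) == 6 * 4096 + 448 == 25024
```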
- each TLB in FIG. 1 stores a subset of the page entries in the page table; the number of entries a TLB stores is determined by the size of its storage space.
- FIG. 3 illustrates a method for representing a logical address according to an embodiment of the present disclosure.
- compared with the conventional representation, the logical address representation of the present disclosure adds some bits representing a stream ID, used to map a data stream to its corresponding TLB module.
- the example diagram shows only the parts of interest. For an artificial neural network with four data streams, a 2-bit stream ID suffices to distinguish them.
- since the page size set in FIG. 2 is 4kB, of the 32 bits of the logical address 20 bits correspond to the page number and 12 bits to the offset address. The upper 20 bits of the physical address are obtained from the page-number-to-frame-number correspondence, and the lower 12 bits of the physical address are identical to the lower 12 bits of the logical address.
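A sketch of that field layout, assuming (as an illustration, not from the source) that the stream ID bits sit above the 32 address bits:

```python
PAGE_BITS = 12   # 4 kB page -> 12 offset bits
ADDR_BITS = 32   # 32-bit logical address -> 20 page-number bits

def split_fields(tagged_addr: int):
    """Split a stream-ID-tagged logical address into (stream_id, page, offset)."""
    stream_id = tagged_addr >> ADDR_BITS
    page = (tagged_addr >> PAGE_BITS) & ((1 << (ADDR_BITS - PAGE_BITS)) - 1)
    offset = tagged_addr & ((1 << PAGE_BITS) - 1)
    return stream_id, page, offset

# split_fields((1 << 32) | 8640) == (1, 2, 448)
```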
- FIGS. 4-6 are example block diagrams of a method of updating a TLB page in accordance with an embodiment of the present disclosure.
- each example block diagram shows one update method for one TLB; any TLB of the device may use any one of these update methods, and several methods may be mixed.
- FIG. 4 is a first update method of a TLB page according to an embodiment of the present disclosure.
- the first update method is the same as the conventional TLB page update method:
- when a TLB miss occurs, the MMU accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it updates this entry into the TLB, replacing one page entry in the TLB.
- different replacement algorithms may be used here, for example a random replacement algorithm or the LRU algorithm.
- the random replacement algorithm picks the TLB entry to replace according to a random number generated by the data processing device; the LRU algorithm picks the entry by recency of access, replacing the page entry whose last access lies furthest in the past.
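A minimal sketch of the LRU variant of method 1; the OrderedDict-based structure, class name, and capacity handling are illustrative assumptions:

```python
from collections import OrderedDict

class LruTlb:
    """TLB with LRU replacement: on a miss, evict the least recently used entry."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = OrderedDict()  # page -> frame, least recently used first

    def access(self, page: int, page_table: dict) -> int:
        if page in self.entries:                  # TLB hit: refresh recency
            self.entries.move_to_end(page)
        else:                                     # TLB miss: fetch, evict LRU if full
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)
            self.entries[page] = page_table[page]
        return self.entries[page]
```

As the text notes, for streaming data that is touched once and never again, this recency-based policy gains little; that observation motivates methods 2 and 3 below.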
- FIG. 5 is a second update method of a TLB page according to an embodiment of the present disclosure.
- the second update method exploits the characteristics of streaming applications: the logical addresses of a data stream are contiguous and each datum is accessed only once.
- each time a TLB miss occurs, the MMU accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it replaces all original page entries of the TLB with that entry and the consecutive entries that follow it.
- in the figure, P, P1, P2, P3, ... are page entries with consecutive page numbers.
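Method 2 can be sketched as a full refill on each miss; the dict TLB, function name, and capacity parameter are illustrative assumptions:

```python
def refill_on_miss(tlb: dict, miss_page: int, page_table: dict, capacity: int) -> dict:
    """On a TLB miss, replace ALL entries with the missed page and the
    consecutive pages that follow it, matching a stream read in address order."""
    tlb.clear()
    for page in range(miss_page, miss_page + capacity):
        if page in page_table:
            tlb[page] = page_table[page]
    return tlb
```

For a stream read in order, one miss thus prefetches translations for the next `capacity` pages, so the following `capacity - 1` pages all hit.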
- FIG. 6 is a third update method of a TLB page according to an embodiment of the present disclosure.
- the third update method also takes advantage of the features of streaming applications.
- the premise of this update is that the page entries stored in the TLB are consecutive, i.e. the logical addresses represented by the stored entries are contiguous; let the entries currently stored in the TLB be <P1, F1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, where <P, F> denotes the mapping between page number P and page frame number F, and the page numbers P1, P2, P3, ..., Pn correspond to a contiguous range of logical addresses.
- after each TLB hit, the entry whose address translation has just completed is replaced, i.e. <P1, F1> is replaced, and the replacing entry is the next consecutive entry after those currently stored, i.e. <Pn+1, Fn+1>;
- after replacement the TLB stores <Pn+1, Fn+1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, which are still consecutive; this update works like the sliding mechanism of a sliding window.
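The sliding behavior of method 3 can be sketched with a deque ordered from the oldest entry <P1, F1> to the newest <Pn, Fn>; the deque representation and function name are illustrative assumptions:

```python
from collections import deque

def slide_on_hit(window: deque, page_table: dict) -> deque:
    """After a hit completes translation for the oldest entry <P1, F1>,
    drop it and append the next consecutive entry <Pn+1, Fn+1>."""
    window.popleft()                  # <P1, F1>, whose translation just completed
    last_page, _ = window[-1]         # Pn, the newest entry currently held
    next_page = last_page + 1
    if next_page in page_table:
        window.append((next_page, page_table[next_page]))
    return window
```

Each hit on the oldest page advances the window by one entry, so the cached range slides forward in step with the stream.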
- a TLB device supporting multiple data streams may also be used for streaming applications of machine learning, such as natural speech processing, handwriting recognition, and face recognition.
- suppose there are k data streams in this embodiment, where k depends on the needs of the particular application.
- the TLB modules in the example block diagram can accordingly be set to k, each TLB module corresponding to one data stream.
- when the data processing device issues a logical address, the corresponding TLB is selected according to the stream ID part of the logical address and searched for a page entry matching the address; if such an entry exists, it is a TLB hit and address translation completes, with an update operation performed if that TLB needs updating. If no matching entry exists, it is a TLB miss;
- the corresponding page entry is then looked up in the page table in memory, and the update operation is performed on that TLB.
- FIG. 2 is an example block diagram of the page table structure of a memory management unit (MMU) according to an embodiment of the present disclosure.
- under paged memory management, the logical address space is divided into a series of equal-sized parts, called pages, and the pages are numbered in turn.
- the page size in the example block diagram is 4kB; other sizes may be used according to specific needs.
- each page number stands for a 4kB page of the address space; it does not mean the entry itself occupies 4kB of space. For example, page number 0 represents the logical address range of 0-4kB.
- physical memory is likewise divided into contiguous parts of the same size, called page frames, and each page frame is numbered.
- the page table stores the mapping between page numbers and page frame numbers. For example, logical address 8640 falls in the range [8kB, 12kB), so its page number is 2; according to the mapping in the example block diagram, it maps to the physical address range of page frame number 6, and the final physical address after translation is determined from the offset address.
- each TLB in FIG. 1 stores a subset of the page entries in the page table; the number of entries a TLB stores is determined by the size of its storage space.
- FIG. 3 illustrates a method for representing a logical address according to an embodiment of the present disclosure.
- compared with the conventional representation, the logical address representation of the present disclosure adds some bits representing a stream ID, used to map a data stream to its corresponding TLB module.
- the example diagram shows only the parts of interest. For k data streams, a stream ID of ⌈log2(k)⌉ bits suffices to distinguish them.
- since the page size set in FIG. 2 is 4kB, of the 32 bits of the logical address 20 bits correspond to the page number and 12 bits to the offset address. The upper 20 bits of the physical address are obtained from the page-number-to-frame-number correspondence, and the lower 12 bits of the physical address are identical to the lower 12 bits of the logical address.
- FIGS. 4-6 are example block diagrams of a method of updating a TLB page in accordance with an embodiment of the present disclosure.
- each example block diagram shows one update method for one TLB; any TLB of the device may use any one of these update methods, and several methods may be mixed.
- FIG. 4 is a first update method of a TLB page according to an embodiment of the present disclosure.
- the first update method is the same as the conventional TLB page update method:
- when a TLB miss occurs, the MMU accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it updates this entry into the TLB, replacing one page entry in the TLB.
- different replacement algorithms may be used here, for example a random replacement algorithm or the LRU algorithm.
- the random replacement algorithm picks the TLB entry to replace according to a random number generated by the data processing device; the LRU algorithm picks the entry by recency of access, replacing the page entry whose last access lies furthest in the past.
- FIG. 5 is a second update method of a TLB page according to an embodiment of the present disclosure.
- the second update method exploits the characteristics of streaming applications: the logical addresses of a data stream are contiguous and each datum is accessed only once.
- each time a TLB miss occurs, the MMU accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it replaces all original page entries of the TLB with that entry and the consecutive entries that follow it.
- in the figure, P, P1, P2, P3, ... are page entries with consecutive page numbers.
- FIG. 6 is a third update method of a TLB page according to an embodiment of the present disclosure.
- the third update method also takes advantage of the features of streaming applications.
- the premise of this update is that the page entries stored in the TLB are consecutive, i.e. the logical addresses represented by the stored entries are contiguous; let the entries currently stored in the TLB be <P1, F1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, where <P, F> denotes the mapping between page number P and page frame number F, and the page numbers P1, P2, P3, ..., Pn correspond to a contiguous range of logical addresses.
- after each TLB hit, the entry whose address translation has just completed is replaced, i.e. <P1, F1> is replaced, and the replacing entry is the next consecutive entry after those currently stored, i.e. <Pn+1, Fn+1>;
- after replacement the TLB stores <Pn+1, Fn+1>, <P2, F2>, <P3, F3>, ..., <Pn, Fn>, which are still consecutive; this update works like the sliding mechanism of a sliding window.
- for streaming applications similar to artificial neural networks, which feature multiple data streams with contiguous logical addresses, using multiple TLB modules lets each TLB store page entries with contiguous logical addresses.
- for a TLB whose page entries are logically contiguous, the TLB page update methods described in the present disclosure can greatly reduce the occurrence of TLB misses, i.e. greatly reduce the number of page table lookups, greatly improving the performance of data access.
Abstract
A TLB device supporting multiple data streams and a method for updating a TLB module. The device includes: a control unit that, corresponding to the k data streams of a streaming application to be processed, sets up k TLB modules; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping, where k is a natural number. Targeting the characteristics of streaming data flows, the device and method can greatly reduce TLB misses during logical-to-physical address translation, thereby reducing the number of memory accesses and greatly improving data access performance.
Description
The present disclosure relates to the field of artificial intelligence, and more particularly to a TLB device supporting multiple data streams and a method for updating a TLB module.
The advent of the memory management unit (MMU) allows, through the mapping of logical addresses to physical addresses, the total size of a program's data and stack to exceed the size of physical memory. The page table of the MMU, i.e. the translation table between logical addresses and physical addresses, is stored in memory. Because translating a logical address into a physical address requires multiple accesses to memory, data access performance drops sharply, which led to the Translation Lookaside Buffer (TLB) module. The TLB module stores a subset of the page entries of the page table. When the data processing device issues a logical address, the MMU first accesses the TLB module; if the TLB module contains a page entry that can translate this address, a TLB hit, that entry is used directly for the translation; otherwise, a TLB miss, the MMU accesses the page table in memory to find the relevant page entry, performs the translation, and at the same time updates the entry into the TLB module. The TLB module reduces the frequency of memory accesses and greatly improves data access performance, especially when some data must be accessed frequently.
For streaming applications similar to artificial neural networks, the characteristic is that there are multiple data streams with contiguous logical addresses, and each datum is accessed only once. With a conventional TLB module, because streaming data is usually accessed only once, TLB misses occur frequently, and every miss requires finding the relevant page entry in memory and updating the TLB module; for streaming applications, therefore, a conventional TLB module does not improve data access performance well.
SUMMARY
In view of this, one object of the present disclosure is to provide a TLB device supporting multiple data streams, and another object is to provide a method for updating a TLB module, so as to remedy the shortcomings of conventional TLB modules in streaming applications.
To achieve the above objects, as one aspect of the present disclosure, a TLB device supporting multiple data streams is provided, including:
a control unit that, corresponding to the k data streams of the streaming application to be processed, sets up k TLB modules; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number.
As another aspect of the present disclosure, a method for updating a TLB module is also provided, including the following steps:
corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number;
when a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it updates this entry into the corresponding TLB module, replacing one page entry in that TLB.
As a further aspect of the present disclosure, a method for updating a TLB page is also provided, including the following steps:
corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number;
each time a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it replaces all original page entries in the TLB module with that entry and the consecutive entries that follow it.
As yet another aspect of the present disclosure, a method for updating a TLB page is also provided, including the following steps:
corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number, and the page entries stored in each of the k TLB modules are consecutive, i.e. the logical addresses represented by the stored entries are contiguous. Let the entries currently stored in the TLB be <P1,F1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, where <P,F> denotes the mapping between page number P and page frame number F, and the page numbers P1, P2, P3, ..., Pn correspond to a contiguous range of logical addresses;
after each TLB hit, the entry whose address translation has just completed is replaced, i.e. <P1,F1> is replaced, and the replacing entry is the next consecutive entry after those currently stored, i.e. <Pn+1,Fn+1>; after replacement the TLB stores <Pn+1,Fn+1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, still consecutive entries.
From the above technical solutions, the device and methods of the present disclosure have the following beneficial effects: because different TLB modules serve different data streams, each TLB module can store page entries with contiguous logical addresses, which benefits subsequent update operations; tailored to the characteristics of streaming applications, TLB modules with contiguous logical addresses and the update methods above can greatly reduce the occurrence of TLB misses, i.e. reduce how often the MMU must search the page table in memory, greatly improving the performance of data access.
FIG. 1 is an example block diagram of the overall flow of a device according to an embodiment of the present disclosure;
FIG. 2 is an example block diagram of the page table structure of a memory management unit (MMU) according to an embodiment of the present disclosure;
FIG. 3 illustrates a method for representing a logical address according to an embodiment of the present disclosure;
FIGS. 4 to 6 are example block diagrams of methods for updating a TLB module according to an embodiment of the present disclosure.
To make the objects, technical solutions, and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to specific embodiments and the accompanying drawings.
In this specification, the various embodiments described below for explaining the principles of the present disclosure are illustrative only and should not be construed in any way as limiting the scope of the invention. The following description with reference to the drawings is intended to aid a full understanding of the exemplary embodiments of the present disclosure as defined by the claims and their equivalents. The description includes various specific details to aid understanding, but these details should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. In addition, descriptions of well-known functions and structures are omitted for clarity and conciseness. Throughout the drawings, the same reference numerals are used for similar functions and operations.
The present disclosure discloses a TLB device supporting multiple data streams, including:
a control unit that, corresponding to the k data streams of the streaming application to be processed, sets up k TLB modules; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number.
In some embodiments, for an artificial neural network, the control unit sets up four TLB modules corresponding to the four data streams of weights, inputs, outputs, and partial sums.
The present disclosure also discloses several methods for updating TLB modules, in which:
corresponding to the k data streams of the streaming application to be processed, k TLB modules are set up; each TLB module holds pages and page frames in one-to-one correspondence, and translation from logical addresses to physical addresses is completed through the page-to-page-frame mapping; k is a natural number.
Method 1 further includes the following step:
when a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it updates this entry into the corresponding TLB module, replacing one page entry in that TLB.
In some embodiments, the step of replacing a page entry uses a random replacement algorithm or the LRU algorithm.
Method 2 further includes the following step:
each time a TLB miss occurs, the control unit accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it replaces all original page entries in the TLB module with that entry and the consecutive entries that follow it.
Method 3 further includes the following steps:
the page entries stored in each of the k TLB modules are set to be consecutive, i.e. the logical addresses represented by the stored entries are contiguous; let the entries currently stored in the TLB be <P1,F1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, where <P,F> denotes the mapping between page number P and page frame number F, and the page numbers P1, P2, P3, ..., Pn correspond to a contiguous range of logical addresses;
after each TLB hit, the entry whose address translation has just completed is replaced, i.e. <P1,F1> is replaced, and the replacing entry is the next consecutive entry after those currently stored, i.e. <Pn+1,Fn+1>; after replacement the TLB stores <Pn+1,Fn+1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, still consecutive entries.
The technical solutions of the present disclosure are further explained below through specific embodiments.
Embodiment 1
A TLB device supporting multiple data streams according to an embodiment of the present disclosure may be used in applications of artificial neural networks. For an artificial neural network there are generally four data streams, namely weights, inputs, outputs, and partial sums; other data streams may also be configured according to the needs of the actual application.
FIG. 1 is an example block diagram of the overall flow of a device according to an embodiment of the present disclosure. Since an artificial neural network has the four data streams of weights, inputs, outputs, and partial sums, the TLB modules in the example block diagram can be set to four, each TLB module corresponding to one data stream. When the data processing device passes in a logical address, the corresponding TLB is selected according to the stream ID part of the logical address and searched for a page entry matching the address; if such an entry exists, it is a TLB hit and address translation completes, with an update operation performed if that TLB needs updating; if no matching entry exists, it is a TLB miss, the corresponding page entry is looked up in the page table in memory, and the update operation is performed on that TLB.
FIG. 2 is an example block diagram of the page table structure (TLB module structure) of a memory management unit (MMU) according to an embodiment of the present disclosure. Under paged memory management, the logical address space is divided into a series of equal-sized parts, called pages, and the pages are numbered in turn. The page size in the example block diagram is 4kB; other sizes may be used according to specific needs. Each page number stands for a 4kB page of the address space, not 4kB of space occupied by the entry itself; for example, page number 0 represents the logical address range of 0-4kB. Physical memory is likewise divided into contiguous parts of the same size, called page frames, and each page frame is numbered. The page table stores the mapping between page numbers and page frame numbers; for example, logical address 8640 falls in the range [8kB, 12kB), so its page number is 2, which according to the mapping in the example block diagram maps to the physical address range of page frame number 6, and the final physical address after translation is determined from the offset address. Each TLB in FIG. 1 stores a subset of the page entries of the page table, and the number of entries a TLB stores is determined by the size of its storage space.
FIG. 3 illustrates a method for representing a logical address according to an embodiment of the present disclosure. Compared with the conventional representation, the logical address representation of the present disclosure adds some bits representing a stream ID, used to map a data stream to its corresponding TLB module; the example diagram shows only the parts of interest. An artificial neural network has four data streams, so a 2-bit stream ID suffices to distinguish the four streams. Since the page size set in FIG. 2 is 4kB, of the 32 bits of the logical address 20 bits correspond to the page number and 12 bits to the offset address; the upper 20 bits of the physical address are obtained from the page-number-to-frame-number correspondence, and the lower 12 bits of the physical address are identical to the lower 12 bits of the logical address.
FIGS. 4-6 are example block diagrams of methods for updating a TLB page according to an embodiment of the present disclosure. Each example block diagram shows one update method for one TLB; any TLB of the device may use any one of these methods, and several update methods may be mixed.
FIG. 4 shows a first update method for a TLB page according to an embodiment of the present disclosure. The first update method is the same as the conventional TLB page update method: when a TLB miss occurs, the MMU accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it updates this entry into the TLB, replacing one page entry in the TLB. Different replacement algorithms may be used here, for example a random replacement algorithm or the LRU algorithm. The random replacement algorithm picks the TLB entry to replace according to a random number generated by the data processing device; the LRU algorithm picks the entry by recency of access, replacing the page entry whose last access lies furthest in the past.
FIG. 5 shows a second update method for a TLB page according to an embodiment of the present disclosure. The second update method exploits the characteristics of streaming applications: the logical addresses of a data stream are contiguous and each datum is accessed only once. Each time a TLB miss occurs, the MMU accesses the page table in memory to find the page entry for the relevant address translation and performs the translation; at the same time, it replaces all original page entries of the TLB with that entry and the consecutive entries that follow it. In the figure, P, P1, P2, P3, ... are page entries with consecutive page numbers.
FIG. 6 shows a third update method for a TLB page according to an embodiment of the present disclosure. The third update method likewise exploits the characteristics of streaming applications. Its premise is that the page entries stored in the TLB are consecutive, i.e. the logical addresses represented by the stored entries are contiguous; let the entries currently stored in the TLB be <P1,F1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, where <P,F> denotes the mapping between page number P and page frame number F, and the page numbers P1, P2, P3, ..., Pn correspond to a contiguous range of logical addresses. After each TLB hit, the entry whose address translation has just completed is replaced, i.e. <P1,F1> is replaced, and the replacing entry is the next consecutive entry after those currently stored, i.e. <Pn+1,Fn+1>; after replacement the TLB stores <Pn+1,Fn+1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, still consecutive entries. This update works like the sliding mechanism of a sliding window.
Specific Embodiment 2
A TLB device supporting multiple data streams according to an embodiment of the present disclosure can be used in streaming applications related to machine learning, for example speech processing, handwriting recognition, face recognition, and similar applications. Suppose there are k data streams in this embodiment, the value of k depending on the requirements of the particular application.
Fig. 1 is an example block diagram of the overall flow of the device according to an embodiment of the present disclosure. For a streaming application with k data streams, k TLB modules can be provided in the example block diagram, each corresponding to one data stream. When the data processing device passes in a logical address, the corresponding TLB is selected according to the stream-ID field of the logical address and searched for a page entry matching the logical address. If a matching page entry exists, it is a TLB hit and the address translation is completed; if the TLB needs to be updated, an update operation is performed. If no matching page entry exists, it is a TLB miss; the matching page entry is looked up in the page table in memory, and an update operation is performed on the TLB at the same time.
Fig. 2 is an example block diagram of the page table structure of the memory management unit (MMU) according to an embodiment of the present disclosure. Paged memory management divides the logical address space into a series of equally sized parts called pages, which are numbered in sequence. In the example block diagram the page size is 4 kB, but other sizes are possible depending on specific requirements. Each page number stands for a 4 kB page of address space, not for 4 kB of storage occupied by the entry itself; for example, page number 0 represents the logical address range 0–4 kB. Physical memory is likewise divided into contiguous parts of the same size, called frames, and each frame is numbered. The page table stores a mapping from page numbers to frame numbers. For example, logical address 8640 falls in the range [8k, 12k], so its page number is 2; according to the mapping in the example block diagram, it is mapped to the physical address range of frame number 6, and the offset then determines the final physical address of the translated logical address. Each TLB in Fig. 1 stores a subset of the page entries of the page table; the number of page entries a TLB stores is determined by the size of its storage space.
Fig. 3 shows a representation of the logical address according to an embodiment of the present disclosure. Compared with the conventional logical address representation, the representation of the present disclosure adds some bits to carry the stream ID, used to map a data stream to its corresponding TLB module; the example figure shows only the parts of interest. Since the streaming application has k data streams, a stream ID of only ⌈log₂k⌉ bits is enough to distinguish the k streams. Since the page size set in Fig. 2 is 4 kB, of the 32-bit logical address 20 bits correspond to the page number and 12 bits to the offset; from the page-number-to-frame-number mapping the upper 20 bits of the physical address are obtained, and the lower 12 bits of the physical address are identical to the lower 12 bits of the logical address.
Figs. 4–6 are example block diagrams of TLB page update methods according to embodiments of the present disclosure. Each example block diagram shows only one update method for one TLB; any TLB of the device may use any one of these update methods, or a mixture of several.
Fig. 4 shows the first TLB page update method according to an embodiment of the present disclosure. The first update method is the same as the conventional TLB page update method: when a TLB miss occurs, the MMU accesses the page table in memory, finds the page relevant to the address translation, performs the translation, and at the same time loads that page into the TLB, replacing one page entry in the TLB. Different replacement algorithms may be used here, for example a random replacement algorithm or the LRU algorithm. The random replacement algorithm chooses the TLB page entry to replace according to a random number generated by the data processing device; the LRU algorithm chooses the entry to replace according to how long ago each entry was last accessed, replacing the entry whose last access is furthest in the past.
Fig. 5 shows the second TLB page update method according to an embodiment of the present disclosure. The second update method exploits the characteristics of streaming applications: the logical addresses of a data stream are consecutive, and each datum is accessed only once. The update method is: whenever a TLB miss occurs, the MMU accesses the page table in memory, finds the page relevant to the address translation, and performs the translation; at the same time, that page and the consecutive page entries following it replace all existing page entries of the TLB. In the figure, P, P1, P2, P3, … are page entries with consecutive page numbers.
Fig. 6 shows the third TLB page update method according to an embodiment of the present disclosure. The third update method likewise exploits the characteristics of streaming applications. Its precondition is that the page entries stored in the TLB are consecutive, i.e. the logical addresses represented by the stored entries are consecutive. Let the page entries currently stored in the TLB be <P1,F1>, <P2,F2>, <P3,F3>, …, <Pn,Fn>, where <P,F> denotes a page-number-to-frame-number mapping, P a page number, and F a frame number, and the logical address ranges corresponding to page numbers P1, P2, P3, …, Pn are consecutive. After each TLB hit, the page whose address translation has just completed is replaced, i.e. <P1,F1> is replaced; the replacement page is the next consecutive page entry after the consecutive page entries currently stored in the TLB, i.e. <Pn+1,Fn+1>. After replacement the TLB stores the page entries <Pn+1,Fn+1>, <P2,F2>, <P3,F3>, …, <Pn,Fn>, which are still consecutive. This update scheme is similar to the sliding mechanism of a sliding window.
For streaming applications similar to artificial neural networks, which are characterized by multiple data streams with consecutive logical addresses, using multiple TLB modules allows each TLB to store page entries with consecutive logical addresses. For a TLB whose page entries are logically consecutive, the TLB page update methods of the present disclosure can greatly reduce the occurrence of TLB misses, i.e. greatly reduce the number of lookups in the page table, and thereby greatly improve data access performance.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present disclosure in detail. It should be understood that the above are merely specific embodiments of the present disclosure and are not intended to limit it; any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the present disclosure shall fall within its scope of protection.
Claims (7)
- A TLB device supporting multiple data streams, comprising: a control unit that, for the k data streams of a streaming application to be processed, provides k TLB modules, each TLB module having two parts, pages and frames, in one-to-one correspondence, and completing the translation of logical addresses to physical addresses through the page-to-frame mapping; wherein k is a natural number.
- The TLB device of claim 1, wherein, for an artificial neural network, the control unit provides 4 TLB modules corresponding respectively to the four data streams of weights, inputs, outputs, and partial sums.
- A method for updating a TLB module, comprising the following steps: for the k data streams of a streaming application to be processed, providing k TLB modules, each TLB module having two parts, pages and frames, in one-to-one correspondence, and completing the translation of logical addresses to physical addresses through the page-to-frame mapping, wherein k is a natural number; when a TLB miss occurs, the control unit accesses the page table in memory, finds the page relevant to the address translation, performs the translation, and at the same time loads said page into the corresponding TLB module, replacing one page entry in said TLB.
- The updating method of claim 4, wherein the step of replacing a page entry uses a random replacement algorithm or the LRU algorithm.
- A method for updating TLB pages, comprising the following steps: for the k data streams of a streaming application to be processed, providing k TLB modules, each TLB module having two parts, pages and frames, in one-to-one correspondence, and completing the translation of logical addresses to physical addresses through the page-to-frame mapping, wherein k is a natural number; whenever a TLB miss occurs, the control unit accesses the page table in memory, finds the page relevant to the address translation, performs the translation, and at the same time replaces all existing page entries of the TLB module with said page and the consecutive page entries following it.
- A method for updating TLB pages, comprising the following steps: for the k data streams of a streaming application to be processed, providing k TLB modules, each TLB module having two parts, pages and frames, in one-to-one correspondence, and completing the translation of logical addresses to physical addresses through the page-to-frame mapping, wherein k is a natural number, and the page entries stored in a TLB of the k TLB modules are consecutive, i.e. the logical addresses represented by the stored entries are consecutive; let the page entries currently stored in the TLB be <P1,F1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, where <P,F> denotes a page-number-to-frame-number mapping, P a page number, and F a frame number, and the logical address ranges corresponding to page numbers P1, P2, P3, ..., Pn are consecutive; after each TLB hit, the page whose address translation has just completed is replaced, i.e. <P1,F1> is replaced; the replacement page is the next consecutive page entry after the consecutive page entries currently stored in the TLB, i.e. <Pn+1,Fn+1>; after replacement the TLB stores the page entries <Pn+1,Fn+1>, <P2,F2>, <P3,F3>, ..., <Pn,Fn>, which are still consecutive.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP17842787.8A EP3506113B1 (en) | 2016-08-26 | 2017-08-03 | Tlb device supporting multiple data flows and update method for tlb module |
KR1020187034255A KR102396866B1 (ko) | 2016-08-26 | 2017-08-03 | TLB device supporting multiple data streams and method for updating the TLB module |
US16/286,361 US10474586B2 (en) | 2016-08-26 | 2019-02-26 | TLB device supporting multiple data streams and updating method for TLB module |
US16/538,351 US11513972B2 (en) | 2016-08-26 | 2019-08-12 | TLB device supporting multiple data streams and updating method for TLB module |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610738487.2A CN107783912A (zh) | 2016-08-26 | 2016-08-26 | TLB device supporting multiple data streams and method for updating the TLB module |
CN201610738487.2 | 2016-08-26 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/286,361 Continuation-In-Part US10474586B2 (en) | 2016-08-26 | 2019-02-26 | TLB device supporting multiple data streams and updating method for TLB module |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2018036364A1 true WO2018036364A1 (zh) | 2018-03-01 |
Family
ID=61245442
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/095845 WO2018036364A1 (zh) | 2016-08-26 | 2017-08-03 | 支持多数据流的tlb装置和tlb模块的更新方法 |
Country Status (6)
Country | Link |
---|---|
US (2) | US10474586B2 (zh) |
EP (1) | EP3506113B1 (zh) |
KR (1) | KR102396866B1 (zh) |
CN (3) | CN110908931B (zh) |
TW (1) | TWI766878B (zh) |
WO (1) | WO2018036364A1 (zh) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110209603A (zh) * | 2019-05-31 | 2019-09-06 | Loongson Technology Corporation Limited | Address translation method, apparatus, device, and computer-readable storage medium |
EP3731101A1 (en) * | 2019-04-26 | 2020-10-28 | INTEL Corporation | Architectural enhancements for computing systems having artificial intelligence logic disposed locally to memory |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110908931B (zh) * | 2016-08-26 | 2021-12-28 | Cambricon Technologies Corporation Limited | Method for updating a TLB module |
CN111241012A (zh) * | 2020-02-25 | 2020-06-05 | Jiangsu Huachuang Microsystem Co., Ltd. | TLB architecture supporting multi-level page tables |
CN112965921B (zh) * | 2021-02-07 | 2024-04-02 | National Defense Technology Innovation Institute, Academy of Military Sciences | TLB management method and system in a multi-task GPU |
CN114063934B (zh) * | 2021-12-09 | 2023-11-03 | Beijing ESWIN Computing Technology Co., Ltd. | Data updating apparatus, method, and electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040073766A1 (en) * | 2002-10-10 | 2004-04-15 | International Business Machines Corporation | Method, apparatus and system for allocating and accessing memory-mapped facilities within a data processing system |
CN102662860A (zh) * | 2012-03-15 | 2012-09-12 | Tianjin Guoxin Technology Co., Ltd. | Translation lookaside buffer (TLB) for process switching and method of address matching therein |
CN103455443A (zh) * | 2013-09-04 | 2013-12-18 | Huawei Technologies Co., Ltd. | Cache management method and apparatus |
CN104298616A (zh) * | 2013-07-15 | 2015-01-21 | Huawei Technologies Co., Ltd. | Data block initialization method, cache memory, and terminal |
CN104346284A (zh) * | 2013-08-02 | 2015-02-11 | Huawei Technologies Co., Ltd. | Memory management method and memory management device |
Family Cites Families (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5517596A (en) | 1991-05-17 | 1996-05-14 | International Business Machines Corporation | Learning machine synapse processor system apparatus |
US5574877A (en) * | 1992-09-25 | 1996-11-12 | Silicon Graphics, Inc. | TLB with two physical pages per virtual tag |
JPH06119246A (ja) * | 1992-10-08 | 1994-04-28 | Fujitsu Ltd | TLB update control circuit |
JPH06195322A (ja) | 1992-10-29 | 1994-07-15 | Hitachi Ltd | Information processing device used as a general-purpose neurocomputer |
US6205531B1 (en) * | 1998-07-02 | 2001-03-20 | Silicon Graphics Incorporated | Method and apparatus for virtual address translation |
EP1182551B1 (en) * | 2000-08-21 | 2017-04-05 | Texas Instruments France | Address space priority arbitration |
US9411532B2 (en) * | 2001-09-07 | 2016-08-09 | Pact Xpp Technologies Ag | Methods and systems for transferring data between a processing device and external devices |
US9037807B2 (en) * | 2001-03-05 | 2015-05-19 | Pact Xpp Technologies Ag | Processor arrangement on a chip including data processing, memory, and interface elements |
US6681311B2 (en) * | 2001-07-18 | 2004-01-20 | Ip-First, Llc | Translation lookaside buffer that caches memory type information |
US6646899B2 (en) * | 2001-09-21 | 2003-11-11 | Broadcom Corporation | Content addressable memory with power reduction technique |
JP4390710B2 (ja) * | 2002-11-27 | 2009-12-24 | RGB Networks, Inc. | Method and apparatus for time multiplexing of multiple digital video programs |
US7111145B1 (en) * | 2003-03-25 | 2006-09-19 | Vmware, Inc. | TLB miss fault handler and method for accessing multiple page tables |
US7194582B1 (en) * | 2003-05-30 | 2007-03-20 | Mips Technologies, Inc. | Microprocessor with improved data stream prefetching |
US7177985B1 (en) * | 2003-05-30 | 2007-02-13 | Mips Technologies, Inc. | Microprocessor with improved data stream prefetching |
US7444493B2 (en) * | 2004-09-30 | 2008-10-28 | Intel Corporation | Address translation for input/output devices using hierarchical translation tables |
EP1967957B1 (en) * | 2005-12-27 | 2013-07-17 | Mitsubishi Electric Corporation | Transcoder |
US20080282055A1 (en) * | 2005-12-29 | 2008-11-13 | Rongzhen Yang | Virtual Translation Lookaside Buffer |
CN100543770C (zh) * | 2006-07-31 | 2009-09-23 | Nvidia Corporation | Dedicated mechanism for page mapping in a GPU |
US7945761B2 (en) * | 2006-11-21 | 2011-05-17 | Vmware, Inc. | Maintaining validity of cached address mappings |
US7827383B2 (en) * | 2007-03-09 | 2010-11-02 | Oracle America, Inc. | Efficient on-chip accelerator interfaces to reduce software overhead |
CN101425020A (zh) * | 2007-10-31 | 2009-05-06 | International Business Machines Corporation | Method, apparatus, and full-system emulator for accelerating MMU emulation |
US8601234B2 (en) * | 2007-11-07 | 2013-12-03 | Qualcomm Incorporated | Configurable translation lookaside buffer |
CN101661437A (zh) * | 2008-08-28 | 2010-03-03 | International Business Machines Corporation | Translation lookaside buffer and method and apparatus for address matching therein |
CN101833691A (zh) | 2010-03-30 | 2010-09-15 | Xi'an University of Technology | FPGA-based serial-structure implementation method for a least-squares support vector machine |
CN201927073U (zh) | 2010-11-25 | 2011-08-10 | Fujian Normal University | Programmable hardware BP neuron processor |
US9092358B2 (en) * | 2011-03-03 | 2015-07-28 | Qualcomm Incorporated | Memory management unit with pre-filling capability |
CN102163320B (zh) * | 2011-04-27 | 2012-10-03 | Fuzhou Rockchip Electronics Co., Ltd. | Configurable MMU circuit dedicated to image processing |
CN102360339A (zh) * | 2011-10-08 | 2012-02-22 | Zhejiang University | Method for improving TLB utilization efficiency |
WO2013095525A1 (en) * | 2011-12-22 | 2013-06-27 | Intel Corporation | Content-aware caches for reliability |
KR20130090147A (ko) | 2012-02-03 | 2013-08-13 | Ahn Byung-ik | Neural network computing device and system, and method therefor |
CN103996069B (zh) | 2013-02-20 | 2018-04-03 | Baidu Online Network Technology (Beijing) Co., Ltd. | Multi-GPU-based BPNN training method and apparatus |
CN103116556B (zh) * | 2013-03-11 | 2015-05-06 | Wuxi Jiangnan Institute of Computing Technology | Static memory partitioning virtualization method |
CN104239238B (zh) * | 2013-06-21 | 2018-01-19 | GlobalFoundries Inc. | Method and apparatus for managing a translation lookaside buffer |
CN104375950B (zh) * | 2013-08-16 | 2017-08-25 | Huawei Technologies Co., Ltd. | Physical address determination method and apparatus based on queue-pair communication |
JP6552512B2 (ja) * | 2013-10-27 | 2019-07-31 | Advanced Micro Devices, Inc. | Input/output memory map unit and northbridge |
US20150199279A1 (en) * | 2014-01-14 | 2015-07-16 | Qualcomm Incorporated | Method and system for method for tracking transactions associated with a system memory management unit of a portable computing device |
US9495302B2 (en) | 2014-08-18 | 2016-11-15 | Xilinx, Inc. | Virtualization of memory for programmable logic |
JP2016048502A (ja) * | 2014-08-28 | 2016-04-07 | Fujitsu Ltd | Information processing device and memory access processing method |
US9703722B2 (en) * | 2014-11-14 | 2017-07-11 | Cavium, Inc. | Method and system for compressing data for a translation look aside buffer (TLB) |
GB2536201B (en) * | 2015-03-02 | 2021-08-18 | Advanced Risc Mach Ltd | Handling address translation requests |
CN104899641B (zh) | 2015-05-25 | 2018-07-13 | Hangzhou Langhe Technology Co., Ltd. | Deep neural network learning method, processor, and deep neural network learning system |
US11288205B2 (en) * | 2015-06-23 | 2022-03-29 | Advanced Micro Devices, Inc. | Access log and address translation log for a processor |
CN105095966B (zh) | 2015-07-16 | 2018-08-21 | Beijing Lynxi Technology Co., Ltd. | Hybrid computing system of artificial neural networks and spiking neural networks |
CN105184366B (zh) | 2015-09-15 | 2018-01-09 | Institute of Computing Technology, Chinese Academy of Sciences | Time-division-multiplexed general-purpose neural network processor |
CN105653790B (zh) * | 2015-12-29 | 2019-03-29 | Southeast University - Wuxi Institute of Integrated Circuit Technology | Artificial-neural-network-based method for evaluating cache access performance of out-of-order processors |
CN110908931B (zh) * | 2016-08-26 | 2021-12-28 | Cambricon Technologies Corporation Limited | Method for updating a TLB module |
US10417140B2 (en) * | 2017-02-24 | 2019-09-17 | Advanced Micro Devices, Inc. | Streaming translation lookaside buffer |
2016
- 2016-08-26 CN CN201911001876.7A patent/CN110908931B/zh active Active
- 2016-08-26 CN CN201610738487.2A patent/CN107783912A/zh active Pending
- 2016-08-26 CN CN201911001839.6A patent/CN110874332B/zh active Active

2017
- 2017-08-03 KR KR1020187034255A patent/KR102396866B1/ko active IP Right Grant
- 2017-08-03 WO PCT/CN2017/095845 patent/WO2018036364A1/zh unknown
- 2017-08-03 EP EP17842787.8A patent/EP3506113B1/en active Active
- 2017-08-16 TW TW106127835A patent/TWI766878B/zh active

2019
- 2019-02-26 US US16/286,361 patent/US10474586B2/en active Active
- 2019-08-12 US US16/538,351 patent/US11513972B2/en active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP3506113A4 * |
Also Published As
Publication number | Publication date |
---|---|
CN110908931B (zh) | 2021-12-28 |
KR20190039470A (ko) | 2019-04-12 |
EP3506113A1 (en) | 2019-07-03 |
CN110908931A (zh) | 2020-03-24 |
TW201807577A (zh) | 2018-03-01 |
KR102396866B1 (ko) | 2022-05-11 |
US20190227946A1 (en) | 2019-07-25 |
EP3506113A4 (en) | 2020-04-22 |
CN107783912A (zh) | 2018-03-09 |
CN110874332A (zh) | 2020-03-10 |
US20190361816A1 (en) | 2019-11-28 |
TWI766878B (zh) | 2022-06-11 |
US10474586B2 (en) | 2019-11-12 |
US11513972B2 (en) | 2022-11-29 |
CN110874332B (zh) | 2022-05-10 |
EP3506113B1 (en) | 2022-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2018036364A1 (zh) | TLB device supporting multiple data streams and method for updating the TLB module | |
CN104253855B (zh) | Category-popularity cache replacement method based on content classification for content-centric networks | |
EP3497577B1 (en) | Updating least-recently-used data for greater persistence of higher-generality cache entries | |
CN100418331C (zh) | Route-lookup result caching method based on a network processor | |
US20060026381A1 (en) | Address translation information storing apparatus and address translation information storing method | |
EP0911737A1 (en) | Cache memory with reduced access time | |
JP2006172499A (ja) | Address translation device | |
US8335908B2 (en) | Data processing apparatus for storing address translations | |
JPH0749812A (ja) | Memory address control device using hashed address tags in a page table | |
US11836079B2 (en) | Storage management apparatus, storage management method, processor, and computer system | |
JP2009512943A (ja) | Updating multiple levels of translation lookaside buffer (TLB) fields | |
CN107729053B (zh) | Method for implementing a cache table | |
US7024536B2 (en) | Translation look-aside buffer for improving performance and reducing power consumption of a memory and memory management method using the same | |
CN116860665A (zh) | Address translation method executed by a processor and related products | |
CN110460528A (zh) | FIB storage structure for the forwarding plane of named data networking and method of using it | |
JPH11345168A (ja) | Method and system for accessing cache memory in a data processing system | |
US11474953B2 (en) | Configuration cache for the ARM SMMUv3 | |
JPS601658B2 (ja) | Address translation control scheme | |
CN106407242B (zh) | Packet processor forwarding database cache | |
US6581139B1 (en) | Set-associative cache memory having asymmetric latency among sets | |
JPS626350A (ja) | TLB control device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 20187034255 Country of ref document: KR Kind code of ref document: A |
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17842787 Country of ref document: EP Kind code of ref document: A1 |
NENP | Non-entry into the national phase |
Ref country code: DE |
ENP | Entry into the national phase |
Ref document number: 2017842787 Country of ref document: EP Effective date: 20190326 |