WO2015172718A1 - Method and apparatus for multiple accesses in memory and storage system - Google Patents
Method and apparatus for multiple accesses in memory and storage system
- Publication number
- WO2015172718A1 (application PCT/CN2015/078863)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- addresses
- address
- result
- memory
- predetermined condition
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/06—Addressing a physical block of locations, e.g. base addressing, module addressing, memory dedication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/30003—Arrangements for executing specific machine instructions
- G06F9/3004—Arrangements for executing specific machine instructions to perform operations on memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/30—Arrangements for executing machine instructions, e.g. instruction decode
- G06F9/34—Addressing or accessing the instruction operand or the result ; Formation of operand address; Addressing modes
- G06F9/345—Addressing or accessing the instruction operand or the result ; Formation of operand address; Addressing modes of multiple operands or results
Definitions
- Embodiments of the present invention relate to a method for performing multiple accesses in a memory, an apparatus for supporting multiple accesses in a memory, and a storage system, and more particularly to such a method, apparatus, and storage system capable of improving the access performance of a memory.
- Random memory access has always been an important factor affecting computer performance.
- access to a DRAM requires hundreds of clock cycles.
- Computer system architectures and programming languages have long used techniques such as caching and prefetching to minimize random accesses to DRAM or to reduce the impact of random accesses on performance.
- Embodiments of the present invention provide a method for performing multiple access in a memory, a device for supporting multiple access in a memory, and a storage system, which can improve access performance of the computer system.
- a method for performing multiple accesses in a memory, comprising: receiving N addresses in a memory, wherein N is an integer greater than 1 and the N addresses are non-contiguous; performing a predetermined operation according to the N addresses; and outputting the result of the operation.
- an apparatus for supporting multiple accesses in a memory, comprising: a receiving unit for receiving N addresses in a memory, wherein N is an integer greater than 1 and the N addresses are non-contiguous; a processing unit for performing a predetermined operation according to the N addresses; and an output unit for outputting the result of the operation.
- a storage system comprising the apparatus for supporting multiple accesses in a memory as previously described.
- multiple addresses in a memory can be operated on, and these addresses can be either contiguous or non-contiguous, so the user can input and use exactly the addresses desired.
- because the predetermined operation can be performed inside the memory according to the input addresses and only the result of the operation is output, the function of the memory is expanded and data processing is faster, which saves time.
- FIG. 1 is a schematic flow chart showing a method for performing multiple accesses in a memory according to an embodiment of the present invention
- FIG. 2 is a schematic flowchart showing a method for performing multiple accesses in a memory according to another embodiment of the present invention
- FIG. 3 is a schematic diagram showing the data structure of a graph
- FIG. 4 is a schematic flowchart showing a method for performing multiple accesses in a memory when performing predetermined operations on data stored at N addresses, according to an embodiment of the present invention
- FIG. 5 is a schematic flowchart showing a method for performing multiple accesses in a memory when performing a predetermined operation on data stored at N addresses according to another embodiment of the present invention
- FIG. 6 is a schematic flowchart showing a method for performing multiple accesses in a memory when a predetermined operation is performed on N addresses according to still another embodiment of the present invention
- FIG. 7 is a schematic block diagram showing an apparatus for supporting multiple access in a memory according to an embodiment of the present invention.
- FIG. 8 is a schematic block diagram showing an apparatus for supporting multiple access in a memory according to another embodiment of the present invention.
- the embodiments of the present invention can also be applied to other storage devices and storage systems, such as SRAM (static random access memory), PCM (phase change memory), FRAM (ferroelectric memory), and so on.
- SRAM static random access memory
- PCM Phase Change Memory
- FRAM Ferroelectric memory
- FIG. 1 is a schematic flow diagram showing a method 100 for multiple accesses in a memory in accordance with an embodiment of the present invention.
- the method 100 can be performed in a processor.
- N addresses in memory are received, where N is an integer greater than one and the N addresses are non-contiguous.
- a predetermined operation is performed based on the N addresses.
- multiple addresses in a memory can be operated on, and these addresses can be either contiguous or non-contiguous, so the user can input and use exactly the addresses desired.
- because the predetermined operation can be performed inside the memory according to the input addresses and only the result of the operation is output, the function of the memory is expanded and data processing is faster, which saves time.
- a buffer may be used to store intermediate results before the predetermined operation on all the addresses has ended, and when the predetermined operation on all the addresses ends, the addresses held in the buffer are output as the result, which further increases the access speed.
- FIG. 2 is a schematic flow diagram showing a method 200 for multiple accesses in a memory in accordance with another embodiment of the present invention.
- the method 200 can be performed in a memory.
- N addresses in memory are received, where N is an integer greater than one and the N addresses are non-contiguous.
- a predetermined operation is performed based on the N addresses.
- the result of the operation is stored in a buffer within the memory.
- outputting the result of the operation, as in method 100, here specifically means outputting the addresses in the buffer as the result.
- the access speed can be further improved.
- without a buffer, the time overhead due to handshake signals is large, about 60% of the total time, so an access takes a few hundred nanoseconds in total (for example, 200 ns); when the buffer is used, the data is kept inside the memory and the handshake signalling is greatly reduced, so the total time taken to make an access can be shortened to tens of nanoseconds or even a few nanoseconds, such as 1-2 ns.
- the buffer can be configured in a structure similar to a cache to further increase access speed. Therefore, when the buffer is utilized, the access time can be shortened.
- the buffer can be newly divided from the original buffer domain in the memory, or it can be a new buffer area added to the memory. In the latter case, it may be necessary to improve the hardware of the memory.
- the data in the buffer can be emptied after each output of the result.
- FIG. 3 is a schematic diagram showing the data structure of a graph. Although an undirected graph is shown in FIG. 3, it will be apparent to those skilled in the art that the graph can also be directed and can also include weight information.
- the graph includes 8 vertices V0, V1, V2, V3, V4, V5, V6, and V7 and 11 edges.
- a one-dimensional array is used to store the vertex data and a two-dimensional array is used to store the edge data.
- Vertex data can include a variety of information.
- when the graph is being traversed, the vertex data can indicate whether the vertex has been traversed, for example, 0 means not traversed and 1 means traversed.
- in an application that assigns levels to vertices, the vertex data can represent how many levels the vertex is from the currently designated center vertex.
- the embodiments of the present invention are not limited thereto, and those skilled in the art may understand that the vertex data may also include any other suitable information.
- any two vertices can be associated with each other, so when the data in the graph is accessed in the memory, the order in which the vertices will be accessed cannot be determined in advance.
- the accesses are strongly random and difficult to cache, resulting in slower access.
- for example, the vertices may be accessed in the order V2 → V7 in one operation, in the order V3 → V7 in the next operation, and in the order V5 → V7 in yet another operation.
- each address may be determined by a base address (base_address) and an offset, where the offset indicates the distance of the address from the base address.
- a plurality of offsets may be defined in the form of an array, such as bias[i], where i is an integer and 0 ≤ i ≤ N-1.
- the number of the vertex may be used as the offset.
- the offset can be calculated by multiplying the address index by the address element size, which in turn determines the address of the vertex.
- using an address index and an address element size is convenient for the user, because in most cases the user cannot know the exact address of a vertex but does know the number of each vertex. By using the correspondence between the number of a vertex and its address index, the actual address can be determined quickly and conveniently. Compared with a scheme in which the actual addresses must be entered, this greatly reduces the time required to enter data, and it lets the user easily check whether the input vertices are correct, which is user friendly.
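The address calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function name `resolve_addresses` and the base address `0x1000` are assumptions made for the example, while `base_address`, `bias_index`, and `element_size` follow the names used in the text.

```python
def resolve_addresses(base_address, bias_index, element_size):
    """Return the N byte addresses base_address + bias_index[i] * element_size."""
    return [base_address + index * element_size for index in bias_index]


# Vertices V0, V2, V4 and V7 stored as 4-byte elements from an assumed base 0x1000.
addresses = resolve_addresses(0x1000, [0, 2, 4, 7], 4)
print([hex(a) for a in addresses])  # ['0x1000', '0x1008', '0x1010', '0x101c']
```

The caller only supplies vertex numbers; the order of the indexes is preserved, so the addresses need not be ascending.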
- the term “non-contiguous” is used broadly: it covers not only completely discrete vertices or addresses, such as the four vertices V0, V2, V4, and V7 above, but also partially contiguous vertices or addresses, such as vertices V0 to V4 and vertex V7, a total of 5 vertices.
- although the N addresses are input in ascending order in the above example, because the input addresses may be non-contiguous, they may be input in any order and need not follow an increasing or decreasing order.
- performing the predetermined operation according to the N addresses in 120 or 220 may include performing a predetermined operation on the data stored at the N addresses.
- FIG. 4 is a schematic flow diagram showing a method 400 for multiple accesses in memory when performing predetermined operations on data stored at N addresses, in accordance with an embodiment of the present invention.
- the method 400 can be performed in a memory.
- N addresses in a memory to be accessed are received, where N is an integer greater than one and the N addresses are non-contiguous.
- each of the N addresses is accessed and a determination is made as to whether the data stored at the address satisfies a predetermined condition.
- one or more of the N addresses satisfying a predetermined condition are output as a result.
- the N addresses may be non-contiguous. Of course, those skilled in the art can understand that these N addresses can also be continuous.
- multiple addresses in the memory can be accessed, and the returned result includes the one or more addresses that satisfy the condition; compared with conventional operations that can access only one address at a time, the access speed is greatly improved, and the access performance of the computer system can thereby be improved.
- the addresses to be accessed can be either contiguous or non-contiguous, which makes it possible to access the desired address just as the user desires. Further, since it is possible to judge whether or not the data at each address satisfies the condition inside the memory, the time for input/output is saved, and the processing speed is improved.
- a buffer may be used to store intermediate results before the access to all the addresses has ended, and when the access to all the addresses ends, the addresses in the buffer are output as the result, which further increases the access speed.
- FIG. 5 is a schematic flow diagram showing a method 500 for multiple accesses in memory when performing predetermined operations on data stored at N addresses, in accordance with another embodiment of the present invention.
- the method 500 can be performed in a memory.
- N addresses in the memory to be accessed are received, where N is an integer greater than one.
- each of the N addresses is accessed and a determination is made as to whether the data stored at the address satisfies a predetermined condition.
- the address of the data satisfying the predetermined condition is stored in a buffer in the memory before the access to all of the N addresses is completed.
- outputting, at 530, one or more of the N addresses that satisfy the predetermined condition as the result specifically means outputting the addresses in the buffer as the result.
- configuring the buffer in this way further improves the access speed. Therefore, when a buffer is used while accessing multiple addresses in the memory, the access time is shortened.
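The flow of method 500 can be sketched as below. This is a simplified software model under stated assumptions: `filter_addresses` is a hypothetical name, the memory is modelled as a Python list indexed directly by address, and an ordinary list stands in for the buffer inside the memory.

```python
def filter_addresses(memory, addresses, predicate):
    """Return the addresses whose stored data satisfies the predicate."""
    buffer = []                          # stands in for the buffer inside the memory
    for address in addresses:            # any order, contiguous or not
        if predicate(memory[address]):
            buffer.append(address)       # intermediate result, not yet output
    result = list(buffer)                # single output after all N accesses
    buffer.clear()                       # buffer is emptied after the output
    return result


# Hypothetical traversal flags for V0..V7: 1 = traversed, 0 = not traversed.
memory = [1, 0, 1, 0, 0, 1, 0, 1]
not_traversed = filter_addresses(memory, [0, 2, 4, 7], lambda flag: flag == 0)
print(not_traversed)  # [4]
```

Only the qualifying addresses cross the memory interface, once, instead of one round trip per address.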
- in a method for multiple accesses in a memory when a predetermined operation is performed on data stored at N addresses, the data at these addresses may be accessed to determine whether the data stored at each address satisfies a predetermined condition.
- the predetermined conditions herein may be arbitrarily specified by the user according to actual needs.
- for example, when traversing a graph, the predetermined condition may indicate whether a vertex has been traversed, and the addresses (vertex numbers) of the vertices that have not been traversed may be returned as the result.
- as another example, the predetermined condition may indicate whether a vertex has been marked with a level, or which level it is at relative to a predetermined vertex, and the addresses (vertex numbers) of the vertices that have not been marked with a level, of the vertices at a specified level, or even of the vertices at several specified levels may be returned as the result.
- an operation including a relational operation and/or a logical operation may be performed on the data and a predetermined condition value, and the predetermined condition is determined to be satisfied when the result of the operation is true.
- the relational operations may include, but are not limited to, equal to, greater than, greater than or equal to, less than, less than or equal to, and not equal to.
- the logical operations may include, but are not limited to, AND, OR, and XOR.
- the embodiments of the present invention are not limited thereto, and those skilled in the art may understand that the operations herein may also include any suitable operations that are existing and future developed.
- when the predetermined condition is satisfied, the original value of the data may also be replaced with a new value; the new value may be a fixed value shared by the N addresses or a function of the original value of the data at each address.
- the new value may also be a set of values corresponding to one or more of the N addresses; such a set of values may be set by the user or specified or called internally by the system, so that different values can be written to each address, or to every few addresses, that satisfies the predetermined condition.
- base_address represents a base address
- bias_index[i] represents a set of address indexes
- element_size represents an address element size
- op represents an operation performed
- condition_value represents a predetermined condition value
- new_value represents a new value.
- the operation is similar to the conventional Compare and Swap (CAS) operation.
- CAS Compare and Swap
- embodiments of the present invention can perform such operations on multiple addresses, and these addresses can be non-contiguous.
- each qualifying bias_index[i] is temporarily stored in the buffer, and the addresses in the buffer are not output as the result until the access to all the addresses has ended.
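The multi-address compare-and-replace described above can be sketched as follows, together with one possible use in graph traversal. This is an illustrative model only: the function names, the string spellings of `op`, the dictionary used as memory, the base address `0x1000`, and the 5-vertex graph are all assumptions; the parameter names `base_address`, `bias_index`, `element_size`, `op`, `condition_value`, and `new_value` come from the text.

```python
import operator

# Relational operations named in the text, keyed by an assumed string spelling.
RELATIONAL_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}


def multi_compare_and_swap(memory, base_address, bias_index, element_size,
                           op, condition_value, new_value):
    """Test each addressed element; replace matches and return their indexes."""
    compare = RELATIONAL_OPS[op]
    buffer = []                                   # qualifying bias_index[i] values
    for index in bias_index:
        address = base_address + index * element_size
        if compare(memory[address], condition_value):
            memory[address] = new_value           # replace the original value
            buffer.append(index)
    return buffer                                 # output only after all accesses


def breadth_first_order(adjacency, start, base_address=0x1000, element_size=4):
    """Traverse a graph, issuing one multi-address request per expanded vertex."""
    # One traversal flag per vertex: 0 = not traversed, 1 = traversed.
    memory = {base_address + v * element_size: 0 for v in range(len(adjacency))}
    memory[base_address + start * element_size] = 1
    order, frontier = [start], [start]
    while frontier:
        next_frontier = []
        for vertex in frontier:
            fresh = multi_compare_and_swap(memory, base_address, adjacency[vertex],
                                           element_size, "==", 0, 1)
            order.extend(fresh)
            next_frontier.extend(fresh)
        frontier = next_frontier
    return order


# Hypothetical 5-vertex graph given as adjacency lists (not the graph of FIG. 3).
adjacency = [[1, 2], [0, 3], [0, 3, 4], [1, 2], [2]]
print(breadth_first_order(adjacency, 0))  # [0, 1, 2, 3, 4]
```

Because the test and the replacement happen in the same request, a vertex reachable from two neighbours is returned only once.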
- operations such as “AND”, “OR”, “XOR”, and “NOR” are merely exemplary; embodiments of the present invention are not limited thereto and may also include any other appropriate operations, combinations of operations, and operations developed in the future.
- the predetermined condition may be not only a fixed predetermined condition value “condition_value” but also a relational expression, for example an expression involving the predetermined condition value and the original value at the address, and so on.
- determining whether the data stored at an address satisfies the predetermined condition is very flexible and can involve various operations, so various needs can be satisfied.
- although the predetermined condition is the same for the N addresses in the above description, the embodiments of the present invention are not limited thereto, and in some embodiments the predetermined condition may differ among the N addresses.
- the above-mentioned “op” and “condition_value” may include elements respectively corresponding to each of the N addresses; for example, they may be provided in the form of arrays op[i] and condition_value[i], where 0 ≤ i ≤ N-1.
- FIG. 6 is a schematic flow diagram showing a method 600 for multiple accesses in memory when performing predetermined operations on data stored at N addresses, in accordance with yet another embodiment of the present invention.
- the method 600 can be performed in a memory.
- N addresses in memory are received, where N is an integer greater than one and the N addresses are non-contiguous.
- At 620 at least one of an arithmetic operation, a relational operation, and a logical operation is performed on data stored at the N addresses.
- the result of the operation is output as a result.
- arithmetic operations such as addition, subtraction, multiplication, and division may be performed on the data stored at the N addresses, and the resulting sum, difference, product, and so on may be output, for example,
- the pseudo code shows the summation:
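A summation over N non-contiguous addresses might look like the following Python sketch. It is not the patent's own pseudo code: the function name `multi_access_sum`, the dictionary used as memory, and the sample contents are assumptions for illustration.

```python
def multi_access_sum(memory, base_address, bias_index, element_size):
    """Sum the data stored at base_address + bias_index[i] * element_size."""
    total = 0                                        # running intermediate result
    for index in bias_index:
        total += memory[base_address + index * element_size]
    return total                                     # only the sum is output


# Hypothetical contents: element i holds the value 10 * i.
memory = {0x1000 + 4 * i: 10 * i for i in range(8)}
print(multi_access_sum(memory, 0x1000, [0, 2, 4, 7], 4))  # 130
```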
- a relational operation can be performed on the data stored at the N addresses to obtain the maximum value, the minimum value, the middle value, and the like of the data.
- when the above operations are performed on the data stored at the N addresses, the buffer can also be used to increase the speed: the intermediate result is temporarily stored in the buffer until the operations on all N addresses are completed, and then the result in the buffer is output as the operation result. This is similar to some of the previous embodiments, so a detailed description is omitted for brevity.
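As one way to picture a relational operation with buffered intermediate results, the sketch below keeps the running minimum and maximum in a small buffer and outputs them only after all N addresses have been visited. The function name, the dictionary-based memory, and the sample values are hypothetical.

```python
def multi_access_min_max(memory, addresses):
    """Return (address, value) pairs for the smallest and largest stored values."""
    buffer = {"min": None, "max": None}            # running intermediate result
    for address in addresses:
        value = memory[address]
        if buffer["min"] is None or value < buffer["min"][1]:
            buffer["min"] = (address, value)
        if buffer["max"] is None or value > buffer["max"][1]:
            buffer["max"] = (address, value)
    return buffer["min"], buffer["max"]            # output once, at the end


# Hypothetical data at four non-contiguous addresses.
memory = {0: 5, 8: -3, 16: 12, 28: 7}
print(multi_access_min_max(memory, [0, 8, 16, 28]))  # ((8, -3), (16, 12))
```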
- FIG. 7 is a schematic block diagram showing an apparatus 700 for supporting multiple accesses in a memory, in accordance with an embodiment of the present invention.
- the apparatus 700 may also be referred to as a Multi-Random Access Memory with Processing Function (MRAMPF).
- MRAMPF Multi-Random Access Memory with Processing Function
- the apparatus 700 for supporting multiple accesses in a memory may include a receiving unit 710, a processing unit 720, and an output unit 730.
- the receiving unit 710 is configured to receive N addresses in the memory, where N is an integer greater than 1 and the N addresses are non-contiguous.
- the processing unit 720 is configured to perform a predetermined operation according to the N addresses.
- the output unit 730 is for outputting the result of the operation.
- multiple addresses in a memory can be operated on, and these addresses can be either contiguous or non-contiguous, so the user can input and use exactly the addresses desired.
- because the predetermined operation can be performed inside the memory according to the input addresses and only the result of the operation is output, the function of the memory is expanded and data processing is faster, which saves time.
- a buffer may be used to store intermediate results before the predetermined operation on all the addresses has ended, and when the predetermined operation on all the addresses ends, the addresses held in the buffer are output as the result, which further increases the access speed.
- FIG. 8 is a schematic block diagram showing an apparatus 800 for supporting multiple accesses in a memory, in accordance with another embodiment of the present invention.
- the apparatus 800 differs from the apparatus 700 shown in FIG. 7 in that it further includes a buffer 825 for storing intermediate results before the processing unit 820 completes the operations on all of the N addresses.
- the receiving unit 810, the processing unit 820, and the output unit 830 included in the apparatus 800 illustrated in FIG. 8 correspond to the receiving unit 710, the processing unit 720, and the output unit 730 illustrated in FIG. 7, respectively; they have similar structures and perform similar functions, so the details are not described again here.
- the output unit 830 outputs the result of the operation stored in the buffer 825. Therefore, according to the embodiment of the present invention, since the intermediate result is temporarily stored in the buffer before the operations on all the addresses are completed, the access speed can be further improved.
- the buffer 825 herein may be newly partitioned from the original buffer domain in the memory, or may be a new buffer area added to the memory. In the latter case, it may be necessary to improve the hardware of the memory.
- the data in the buffer can be emptied each time the data in the buffer is output.
- processing unit 720 or 820 can determine each address by a base address and an offset, where the offset indicates the distance of the address from the base address.
- multiple offsets may be defined in the form of an array, such as bias[i], where i is an integer and 0 ≤ i ≤ N-1.
- receiving N offsets may further include receiving an address element size (for example, 4 bytes) and N address indexes.
- the determination may be made by the receiving unit 710 or 810 and the determined N addresses are transmitted to the processing unit 720 or 820.
- all of the N addresses may be transmitted to processing unit 720 or 820 after receiving unit 710 or 810 receives and determines all N addresses.
- alternatively, the receiving unit 710 or 810 may transmit each address to the processing unit 720 or 820 as soon as that address is determined.
- the processing unit 720 or 820 performing a predetermined operation according to the N addresses may include performing a predetermined operation on data stored at the N addresses.
- the processing unit 720 or 820 can access each of the N addresses and determine whether data stored at the address satisfies a predetermined condition, and the output unit 730 or 830 may output one or more of the N addresses that satisfy a predetermined condition as a result.
- the address of the data satisfying the predetermined condition is stored in a buffer in the memory before completing access to all of the N addresses. After completing access to all of the N addresses, the output unit 730 or 830 can output the address in the buffer as a result.
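The division of labour among the receiving unit, processing unit, buffer, and output unit can be modelled in software roughly as follows. This is a toy sketch, not a hardware description: the class name `MultiAccessDevice`, its method names, and the sample memory contents are assumptions.

```python
class MultiAccessDevice:
    """Toy model of apparatus 800: receiving unit, processing unit, buffer, output unit."""

    def __init__(self, memory):
        self.memory = memory
        self.addresses = []
        self.buffer = []

    def receive(self, base_address, bias_index, element_size):
        # Receiving unit: determine the N addresses from base address and indexes.
        self.addresses = [base_address + i * element_size for i in bias_index]

    def process(self, predicate):
        # Processing unit: access each address and buffer those that qualify.
        for address in self.addresses:
            if predicate(self.memory[address]):
                self.buffer.append(address)

    def output(self):
        # Output unit: hand back the buffered addresses and empty the buffer.
        result, self.buffer = self.buffer, []
        return result


# Hypothetical 4-byte elements starting at an assumed base address 0x1000.
memory = {0x1000 + 4 * i: value for i, value in enumerate([3, 9, 4, 1, 8, 2, 7, 6])}
device = MultiAccessDevice(memory)
device.receive(0x1000, [0, 2, 4, 7], 4)
device.process(lambda value: value > 3)
print([hex(a) for a in device.output()])  # ['0x1008', '0x1010', '0x101c']
```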
- the processing unit 720 or 820 determining whether the data stored at the address satisfies the predetermined condition may include performing an operation including a relational operation and/or a logical operation on the data and the predetermined condition value, and determining that the predetermined condition is satisfied when the result of the operation is true.
- the relational operations may include, but are not limited to, equal to, greater than, greater than or equal to, less than, less than or equal to, and not equal to.
- the logical operations may include, but are not limited to, AND, OR, and XOR.
- processing unit 720 or 820 can replace the original value of the data with a new value, which can be a function of a fixed value or an original value.
- the predetermined condition may be the same or different for N addresses.
- N may depend on the actual situation, such as user requirements, hardware design, computing power, etc., for example N may be 32, 64, and the like. N can be appropriately selected so as not to affect the processing performance of the memory.
- the processing unit 720 or 820 may perform at least one of an arithmetic operation, a relational operation, and a logical operation on the data stored at the N addresses, and the output unit 730 or 830 may output the result of the operation as the result.
- base_address represents a base address
- bias_index[i] represents a set of address indexes, which may be continuous or non-contiguous
- element_size represents an address element size (for example, 4 Bytes)
- Function() represents a predetermined operation to be performed
- parameter represents a predetermined order.
- the parameters required for the operation may be one or more, and output indicates the address to which the result is to be output.
- the N addresses that are actually to be operated on can be determined as: i-th address = base_address + bias_index[i]*element_size.
- bias_index[i]*element_size can be replaced by bias[i].
- op, condition_value, and new_value can each be a set of elements, or even an expression.
- the Function may be, for example, Function(op, condition_value, new_value), where op represents the operation performed, condition_value represents a predetermined condition value, and new_value represents a new value.
- the Function may also be, for example, Function(op), or even Function(op1, op2, op3, ...) or Function(op1[i], op2[i], op3[i], ...).
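One way to read the general form with `Function`, `parameter`, and `output` is sketched below. The calling convention is an assumption: the text does not specify how the function receives its data, so here it is passed the memory, the resolved addresses, and the unpacked parameters, and its result is written to the `output` address. The helper `count_matching` is a hypothetical example operation.

```python
def multi_access(memory, base_address, bias_index, element_size,
                 function, parameter, output):
    """Apply function to the addressed data and store its result at output."""
    addresses = [base_address + index * element_size for index in bias_index]
    result = function(memory, addresses, *parameter)
    memory[output] = result          # assumed meaning of the output address
    return result


def count_matching(memory, addresses, condition_value):
    """Example predetermined operation: count elements equal to condition_value."""
    return sum(1 for address in addresses if memory[address] == condition_value)


# Hypothetical flags for V0..V7 stored as 4-byte elements from 0x1000.
memory = {0x1000 + 4 * i: flag for i, flag in enumerate([1, 0, 1, 0, 0, 1, 0, 0])}
print(multi_access(memory, 0x1000, [0, 2, 4, 7], 4, count_matching, (0,), 0x2000))  # 2
```

Passing the operation as a callable keeps the address resolution identical for every kind of predetermined operation.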
- embodiments of the present invention also include a storage system including the apparatus 700 or 800 for supporting multiple accesses in the memory as described above with reference to FIG. 7 or 8.
- only the portions related to the embodiments of the present invention are shown in FIGS. 7 and 8, but those skilled in the art will appreciate that the apparatuses shown in FIGS. 7 and 8 can include other necessary units.
- the disclosed systems, devices, and methods may be implemented in other manners.
- the device embodiments described above are merely illustrative.
- the division of the unit is only a logical function division.
- in an actual implementation there may be other ways of dividing the units; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be electrical, mechanical, or in another form.
- the units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed across multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
Abstract
Description
V0 | V1 | V2 | V3 | V4 | V5 | V6 | V7 |
Claims (29)
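- A method for performing multiple accesses in a memory, comprising: receiving N addresses in a memory, wherein N is an integer greater than 1 and the N addresses are non-contiguous; performing a predetermined operation according to the N addresses; and outputting the result of the operation.
- The method according to claim 1, wherein, before outputting the result of the operation, the method further comprises: storing the result of the operation in a buffer within the memory.
- The method according to claim 2, wherein outputting the result of the operation comprises: outputting the result of the operation stored in the buffer.
- The method according to any one of claims 1-3, wherein each address is determined by a base address and an offset, the offset indicating the distance of the address from the base address.
- The method according to claim 4, wherein receiving the N addresses in the memory further comprises: receiving a base address and N offsets; and determining each of the N addresses according to i-th address = base address + i-th offset, 0 < i ≤ N-1.
- The method according to claim 5, wherein receiving the N offsets further comprises receiving an address element size and N address indexes, and determining each of the N addresses according to i-th address = base address + i-th offset comprises determining each of the N addresses according to i-th address = base address + i-th address index × address element size.
- The method according to any one of claims 1-6, wherein performing the predetermined operation according to the N addresses comprises: performing the predetermined operation on data stored at the N addresses.
- The method according to claim 7, wherein, when the predetermined operation is performed on the data stored at the N addresses: each of the N addresses is accessed and it is determined whether the data stored at the address satisfies a predetermined condition; and one or more of the N addresses that satisfy the predetermined condition are output as the result.
- The method according to any one of claims 2-8, wherein storing the result of the operation in the buffer within the memory comprises: before access to all of the N addresses is completed, storing the addresses of the data satisfying the predetermined condition in the buffer within the memory.
- The method according to claim 9, wherein outputting one or more of the N addresses that satisfy the predetermined condition as the result comprises: outputting the addresses in the buffer as the result.
- The method according to any one of claims 8-10, wherein determining whether the data stored at the address satisfies the predetermined condition comprises: performing, on the data and a predetermined condition value, an operation including a relational operation and/or a logical operation; and determining that the predetermined condition is satisfied when the result of the operation is true, wherein the relational operation includes equal to, greater than, greater than or equal to, less than, less than or equal to, and not equal to, and the logical operation includes AND, OR, and XOR.
- The method according to any one of claims 8-11, further comprising: when it is determined that the predetermined condition is satisfied, replacing the original value of the data with a new value, wherein the new value is a fixed value or a function of the original value.
- The method according to any one of claims 8-12, wherein the predetermined condition is the same or different for the N addresses.
- The method according to any one of claims 7-13, wherein, when the predetermined operation is performed on the data stored at the N addresses: at least one of an arithmetic operation, a relational operation, and a logical operation is performed on the data stored at the N addresses; and the result of the operation is output as the result.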
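- An apparatus for supporting multiple accesses in a memory, comprising: a receiving unit for receiving N addresses in a memory, wherein N is an integer greater than 1 and the N addresses are non-contiguous; a processing unit for performing a predetermined operation according to the N addresses; and an output unit for outputting the result of the operation.
- The apparatus according to claim 15, further comprising: a buffer for storing the result of the operation.
- The apparatus according to claim 16, wherein the output unit outputs the result of the operation stored in the buffer.
- The apparatus according to any one of claims 15-17, wherein each address is determined by a base address and an offset, the offset indicating the distance of the address from the base address.
- The apparatus according to claim 18, wherein the receiving unit receives a base address and N offsets, and determines each of the N addresses according to i-th address = base address + i-th offset, 0 < i ≤ N-1.
- The apparatus according to claim 19, wherein the receiving unit receiving the N offsets further comprises receiving an address element size and N address indexes, and determining each of the N addresses according to i-th address = base address + i-th offset comprises determining each of the N addresses according to i-th address = base address + i-th address index × address element size.
- The apparatus according to any one of claims 15-20, wherein the processing unit performing the predetermined operation according to the N addresses comprises: performing the predetermined operation on data stored at the N addresses.
- The apparatus according to claim 21, wherein, when the predetermined operation is performed on the data stored at the N addresses: the processing unit accesses each of the N addresses and determines whether the data stored at the address satisfies a predetermined condition; and the output unit outputs one or more of the N addresses that satisfy the predetermined condition as the result.
- The apparatus according to any one of claims 17-22, wherein, before access to all of the N addresses is completed, the addresses of the data satisfying the predetermined condition are stored in a buffer within the memory.
- The apparatus according to claim 23, wherein the output unit outputs the addresses in the buffer as the result.
- The apparatus according to any one of claims 22-24, wherein the processing unit determining whether the data stored at the address satisfies the predetermined condition comprises: performing, on the data and a predetermined condition value, an operation including a relational operation and/or a logical operation; and determining that the predetermined condition is satisfied when the result of the operation is true, wherein the relational operation includes equal to, greater than, greater than or equal to, less than, less than or equal to, and not equal to, and the logical operation includes AND, OR, and XOR.
- The apparatus according to any one of claims 22-25, wherein, when it is determined that the predetermined condition is satisfied, the processing unit replaces the original value of the data with a new value, wherein the new value is a fixed value or a function of the original value.
- The apparatus according to any one of claims 22-26, wherein the predetermined condition is the same or different for the N addresses.
- The apparatus according to any one of claims 21-27, wherein, when the predetermined operation is performed on the data stored at the N addresses: the processing unit performs at least one of an arithmetic operation, a relational operation, and a logical operation on the data stored at the N addresses; and the output unit outputs the result of the operation as the result.
- A storage system, comprising the apparatus according to any one of claims 15-28.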
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22213885.1A EP4180972A1 (en) | 2014-05-14 | 2015-05-13 | Method and apparatus for multiple accesses in memory and storage system |
EP15792922.5A EP3144817A4 (en) | 2014-05-14 | 2015-05-13 | Method and apparatus for multiple accesses in memory and storage system |
US15/310,984 US10956319B2 (en) | 2014-05-14 | 2015-05-13 | Method and apparatus for multiple accesses in memory and storage system, wherein the memory return addresses of vertexes that have not been traversed |
JP2017512089A JP6389323B2 (ja) | 2014-05-14 | 2015-05-13 | Method, apparatus, and memory system for performing multiple accesses in memory |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410201149.6 | 2014-05-14 | ||
CN201410201149.6A CN103942162B (zh) | 2014-05-14 | 2014-05-14 | Method, apparatus, and storage system for performing multiple accesses in memory |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2015172718A1 (zh) | 2015-11-19 |
Family
ID=51189834
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2015/078863 WO2015172718A1 (zh) | Method, apparatus, and storage system for performing multiple accesses in memory | 2014-05-14 | 2015-05-13 |
Country Status (5)
Country | Link |
---|---|
US (1) | US10956319B2 (zh) |
EP (2) | EP3144817A4 (zh) |
JP (1) | JP6389323B2 (zh) |
CN (1) | CN103942162B (zh) |
WO (1) | WO2015172718A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103942162B (zh) * | 2014-05-14 | 2020-06-09 | Tsinghua University | Method, apparatus, and storage system for performing multiple accesses in memory |
GB2533568B (en) * | 2014-12-19 | 2021-11-17 | Advanced Risc Mach Ltd | Atomic instruction |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
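CN101512499A (zh) * | 2006-08-31 | 2009-08-19 | Qualcomm Incorporated | Relative address generation |
CN102171649A (zh) * | 2008-12-22 | 2011-08-31 | Intel Corporation | Method and system for queuing transfers of multiple non-contiguous address ranges with a single command |
CN103238133A (zh) * | 2010-12-08 | 2013-08-07 | International Business Machines Corporation | Vector gather buffer for multiple address vector loads |
CN103942162A (zh) * | 2014-05-14 | 2014-07-23 | Tsinghua University | Method, apparatus, and storage system for performing multiple accesses in memory |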
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000067573A (ja) | 1998-08-19 | 2000-03-03 | Mitsubishi Electric Corp | Memory with arithmetic function |
US20040103086A1 (en) * | 2002-11-26 | 2004-05-27 | Bapiraju Vinnakota | Data structure traversal instructions for packet processing |
US7865701B1 (en) * | 2004-09-14 | 2011-01-04 | Azul Systems, Inc. | Concurrent atomic execution |
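JP4300205B2 (ja) * | 2005-08-02 | 2009-07-22 | Toshiba Corporation | Information processing system and information processing method |
CN101506793B (zh) * | 2006-08-23 | 2012-09-05 | 陈锦夫 | Running an operating system in dynamic virtual memory |
CN100446129C (zh) * | 2006-09-07 | 2008-12-24 | Huawei Technologies Co., Ltd. | Method and system for memory fault testing |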
US8447962B2 (en) * | 2009-12-22 | 2013-05-21 | Intel Corporation | Gathering and scattering multiple data elements |
JP2010287279A (ja) * | 2009-06-11 | 2010-12-24 | Toshiba Corp | Nonvolatile semiconductor memory device |
CA2790009C (en) * | 2010-02-18 | 2017-01-17 | Katsumi Inoue | Memory having information refinement detection function, information detection method using memory, device including memory, information detection method, method for using memory, and memory address comparison circuit |
JP4588114B1 (ja) | 2010-02-18 | 2010-11-24 | Katsumi Inoue | Memory having information refinement detection function, method for using the memory, and device including the memory |
US8904153B2 (en) * | 2010-09-07 | 2014-12-02 | International Business Machines Corporation | Vector loads with multiple vector elements from a same cache line in a scattered load operation |
US20120079459A1 (en) * | 2010-09-29 | 2012-03-29 | International Business Machines Corporation | Tracing multiple threads via breakpoints |
US8612676B2 (en) * | 2010-12-22 | 2013-12-17 | Intel Corporation | Two-level system main memory |
US9342453B2 (en) * | 2011-09-30 | 2016-05-17 | Intel Corporation | Memory channel that supports near memory and far memory access |
US8850162B2 (en) * | 2012-05-22 | 2014-09-30 | Apple Inc. | Macroscalar vector prefetch with streaming access detection |
US11074169B2 (en) * | 2013-07-03 | 2021-07-27 | Micron Technology, Inc. | Programmed memory controlled data movement and timing within a main memory device |
US9497206B2 (en) * | 2014-04-16 | 2016-11-15 | Cyber-Ark Software Ltd. | Anomaly detection in groups of network addresses |
- 2014
  - 2014-05-14 CN CN201410201149.6A patent/CN103942162B/zh active Active
- 2015
  - 2015-05-13 EP EP15792922.5A patent/EP3144817A4/en not_active Ceased
  - 2015-05-13 WO PCT/CN2015/078863 patent/WO2015172718A1/zh active Application Filing
  - 2015-05-13 JP JP2017512089A patent/JP6389323B2/ja active Active
  - 2015-05-13 EP EP22213885.1A patent/EP4180972A1/en not_active Withdrawn
  - 2015-05-13 US US15/310,984 patent/US10956319B2/en active Active
Non-Patent Citations (1)
Title |
---|
See also references of EP3144817A4 * |
Also Published As
Publication number | Publication date |
---|---|
CN103942162B (zh) | 2020-06-09 |
EP3144817A1 (en) | 2017-03-22 |
US10956319B2 (en) | 2021-03-23 |
JP6389323B2 (ja) | 2018-09-12 |
JP2017519317A (ja) | 2017-07-13 |
EP4180972A1 (en) | 2023-05-17 |
EP3144817A4 (en) | 2017-07-26 |
US20170083236A1 (en) | 2017-03-23 |
CN103942162A (zh) | 2014-07-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10810179B2 (en) | Distributed graph database | |
Nazareth | Conjugate gradient method | |
JP6356675B2 (ja) | Aggregation/grouping operation: hardware implementation of hash table method | |
US11762828B2 (en) | Cuckoo filters and cuckoo hash tables with biasing, compression, and decoupled logical sparsity | |
EP3401807B1 (en) | Synopsis based advanced partition elimination | |
KR20130060187A (ko) | Cache and/or socket sensitive multi-processor core breadth-first traversal | |
US20200167327A1 (en) | System and method for self-resizing associative probabilistic hash-based data structures | |
US9753984B2 (en) | Data access using decompression maps | |
US11500873B2 (en) | Methods and systems for searching directory access groups | |
CN112307062B (zh) | Database aggregation query method, apparatus, and system | |
US20230102690A1 (en) | Near-memory engine for reducing bandwidth utilization in sparse data applications | |
WO2015172718A1 (zh) | Method, apparatus, and storage system for performing multiple accesses in memory | |
US11030714B2 (en) | Wide key hash table for a graphics processing unit | |
Knorr et al. | Proteus: A self-designing range filter | |
Chen et al. | Efficient graph similarity search in external memory | |
US10095630B2 (en) | Sequential access to page metadata stored in a multi-level page table | |
CN112486988A (zh) | Data processing method, apparatus, device, and storage medium | |
CN113297266A (zh) | Data processing method, apparatus, device, and computer storage medium | |
US9213639B2 (en) | Division of numerical values based on summations and memory mapping in computing systems | |
JP2014130492A (ja) | Index generation method and computer system | |
KR102471553B1 (ko) | Method performed by a computing device, apparatus, device, and computer-readable storage medium | |
Otoo et al. | Chunked extendible dense arrays for scientific data storage | |
US10339066B2 (en) | Open-addressing probing barrier | |
Nimako et al. | Chunked extendible dense arrays for scientific data storage | |
US9223708B2 (en) | System, method, and computer program product for utilizing a data pointer table pre-fetcher |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15792922; Country of ref document: EP; Kind code of ref document: A1 |
 | DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
 | ENP | Entry into the national phase | Ref document number: 2017512089; Country of ref document: JP; Kind code of ref document: A |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | REEP | Request for entry into the european phase | Ref document number: 2015792922; Country of ref document: EP |
 | WWE | Wipo information: entry into national phase | Ref document number: 2015792922; Country of ref document: EP. Ref document number: 15310984; Country of ref document: US |