CN105378686A - Method and system for implementing a bit array in a cache line

Info

Publication number
CN105378686A
Authority
CN
China
Prior art keywords
request
bit array
value
execution
return
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201480038914.5A
Other languages
Chinese (zh)
Other versions
CN105378686B (en)
Inventor
B·施泰因马赫尔-布罗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of CN105378686A
Application granted
Publication of CN105378686B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/3004 Arrangements for executing specific machine instructions to perform operations on memory
    • G06F9/30043 LOAD or STORE instructions; Clear instruction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0893 Caches characterised by their organisation or structure
    • G06F12/0895 Caches characterised by their organisation or structure of parts of caches, e.g. directory or tag array
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3824 Operand accessing
    • G06F9/3834 Maintaining memory consistency
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Memory System Of A Hierarchy Structure (AREA)
  • Stored Programmes (AREA)
  • Logic Circuits (AREA)

Abstract

The present invention relates to a method for implementing a bit array (318) in a cache line (211) of a memory system (128) that includes a memory storage (208) and a controller (206), the method comprising: configuring the bit array (318) in the cache line (211), the bit array comprising an array of bits, wherein the configuring further comprises defining a value of each bit in the bit array; receiving, by the controller (206), a request (210) for an operation on the bit array, wherein the request is indicative of a location of the cache line (211) in the memory storage (208) and of information specifying the request; identifying, by the controller (206), one or more actions on the bit array (318) for the operation using the information, wherein the one or more actions are encoded in the controller (206); and, in response to receiving the request, performing the request by executing the one or more encoded actions.

Description

Method and system for implementing a bit array in a cache line
Technical field
The present invention relates to computing systems and, more particularly, to a method for implementing a bit array in a cache line.
Background
For companies of all sizes, multithreaded computer systems have become an important technology. They improve the computational efficiency and flexibility of a computing hardware platform. Multithreaded operations on such systems can include bitwise atomic memory operations (AMOs), which perform logical operations on individual bits of a bit array. Such AMOs include the store-type operations storeAND, storeOR and storeXOR and the load-type operations fetchAND, fetchOR and fetchXOR.
Summary of the invention
It is an object of embodiments of the invention to provide an improved method, computer system and computer program product. This object is achieved by the subject matter of the independent claims. Advantageous embodiments are described in the dependent claims.
As used herein, an atomic memory operation (AMO) refers to a read-modify-write operation on shared data. An AMO is atomic in the sense that the read-modify-write of one thread is performed without interference from another thread. In other words, for threads or processors accessing the shared data (for example, concurrently), each read, write or AMO access is performed atomically, without interference from any other access.
In one aspect, the invention relates to a method for implementing a bit array in a cache line of a memory system that comprises a memory storage and a controller.
The method comprises: configuring the bit array in the cache line, the bit array comprising an array of bits, wherein the configuring further comprises defining a value of each bit in the bit array; receiving, by the controller, a request for an operation on the bit array, wherein the request indicates the location of the cache line in the memory storage and information specifying the request; identifying, by the controller, one or more actions on the bit array for the operation using the information, wherein the one or more actions are encoded in the controller; and, in response to receiving the request, performing the request by executing the one or more encoded actions.
The bit array is a data structure that stores bits; that is, each array element corresponds to a single bit. Each element of the bit array stores one of the two values 0 or 1 and is identified by a unique index. Typically, the index values 0 to (E-1) are used for a bit array with E elements.
Implementing the bit array in this way allows at least part of the cache line to be accessed in order to perform bitwise logical operations on individual bits and groups of bits of the bit array. For example, each bit of the bit array can represent one resource instance of the computer system that comprises the memory system. An operation on the bit array that sets the value of one or more corresponding bits to 1 can, for example, allow a user such as a thread to atomically become the "owner" of one or more of the resource instances.
For example, each bit of the bit array can represent one lock in a set of locks shared by multiple threads. Operations on the bit array can then allow a thread to atomically acquire multiple locks with high performance and low programming effort.
For example, the 1 or 0 value of each bit of the bit array can represent the presence or absence of a particular element in a priority queue. Such a priority queue holds at most E unique elements, each with a unique priority numbered 0 to E-1 that corresponds to an index of the bit array.
These features can be advantageous because, compared with conventional methods, they can allow a higher rate of operations on the bit array, especially for concurrent threads. This is because an operation on the bit array can be a simple operation that is faster than a conventional operation on another data structure.
These features can allow multiple threads to use the bit array as a concurrent data structure. For example, the requested operation can be an AMO. The received request can be one of multiple concurrent requests for operations on the bit array that the controller receives from multiple threads. The received concurrent requests can be executed in sequence. The controller can be adapted to execute and manage the requests of such multiple concurrent threads. The controller can comprise a receiver that accepts memory access requests and a transmitter that returns responses, each of which can include a first-in-first-out (FIFO) buffer to handle the memory accesses of multiple concurrent threads.
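The serialization described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `Controller`, its methods and the choice of a fetchOR-style operation are hypothetical, and a real controller would be hardware logic rather than Python. The point shown is that requests drained from a FIFO in arrival order cannot interfere with one another.

```python
from collections import deque


class Controller:
    """Hypothetical software model of a controller with request/response FIFOs."""

    def __init__(self):
        self.word = 0                  # stands in for part of the bit array
        self.request_fifo = deque()    # receiver side
        self.response_fifo = deque()   # transmitter side

    def submit(self, thread_id, mask):
        """Queue a fetchOR-style request: OR `mask` into the word, return the old value."""
        self.request_fifo.append((thread_id, mask))

    def drain(self):
        """Execute queued requests one at a time, in arrival order."""
        while self.request_fifo:
            thread_id, mask = self.request_fifo.popleft()
            old = self.word
            self.word = old | mask     # read-modify-write with no interleaving
            self.response_fifo.append((thread_id, old))


controller = Controller()
controller.submit("thread-A", 0b1000)
controller.submit("thread-B", 0b1000)
controller.drain()
```

In this model only thread-A sees the bit clear in its returned old value, so only thread-A can conclude that it set the bit.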
These features can provide a single interface through which multiple users or threads can use the bit array, for example in a multithreaded application in which multiple threads simultaneously issue requests (such as the request described above) to access the bit array.
Another advantage can be that the memory system can implement new atomic memory operations, so that the bit array can be used to better effect than a conventional bit array. For example, an AMO can be a variant of a processor store or load instruction.
The requested operation may need only the address of the cache line, not the address of an element of the bit array.
According to one embodiment, the request comprises a fetch-and-set first 0-bit request, and executing the request comprises performing one or more of the following actions: reading the bits of the bit array in order, starting from index 0; if a first bit with value 0 is found, returning the index of that bit and setting its value to 1; otherwise, returning a predefined failure value.
The term 0-bit refers to a bit with value 0. The term 1-bit refers to a bit with value 1.
For a bit array with E elements, a fetch-and-set first 0-bit request returns an index value from 0 to E-1. If there is no 0-bit in the bit array, the predefined failure value returned does not correspond to a valid index. For example, the failure value can be E (the number of bits in the bit array).
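The fetch-and-set first 0-bit request can be sketched in a few lines. The sketch assumes the bit array is modelled as a Python list of 0/1 values and uses E as the failure value, which the text offers as one possibility; the function name is hypothetical, and the atomicity that the controller provides in hardware is not modelled here.

```python
def fetch_and_set_first_zero(bits):
    """Set the first 0-bit (scanning from index 0) to 1 and return its index.

    Returns len(bits), i.e. E, as the failure value when no 0-bit exists.
    """
    for index, bit in enumerate(bits):
        if bit == 0:
            bits[index] = 1
            return index
    return len(bits)  # failure value: not a valid index


resources = [1, 1, 0, 1, 0, 0, 1, 1]  # E = 8; a 1 marks a resource in use
acquired = fetch_and_set_first_zero(resources)
```

A caller distinguishes success from failure by comparing the returned value with E.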
The cache line can also be accessed by other bitwise AMOs. For example, a bitwise AMO such as storeAND() addresses and modifies an 8-byte word in a 128-byte cache line. For example, a thread issues a fetch-and-set first 0-bit request to obtain the resource corresponding to the index of the first bit with value 0. After using the resource, the thread issues a storeAND() operation on the appropriate word of the cache line to set that bit back to 0, so that the resource becomes available again. In this way, the bit array can be used to allocate pages, index nodes (inodes), disk sectors and the like.
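The release step can be sketched as follows. The model assumes a 128-byte cache line held as sixteen 64-bit words and assumes that bit index i lives in word i // 64 at bit position i % 64, counting from the least significant bit; the source does not specify this layout. The names `store_and` and `release` are hypothetical stand-ins for the storeAND() AMO and the thread's release logic.

```python
WORD_BITS = 64                      # one 8-byte word
WORDS_PER_LINE = 16                 # 128-byte cache line / 8-byte words
WORD_MASK = (1 << WORD_BITS) - 1


def store_and(line, word_index, operand):
    """Model of storeAND: AND one 8-byte word of the line with the operand."""
    line[word_index] &= operand & WORD_MASK


def release(line, bit_index):
    """Clear one bit of the bit array so that its resource is available again."""
    word_index, offset = divmod(bit_index, WORD_BITS)
    store_and(line, word_index, ~(1 << offset))  # mask has every bit set except one


cache_line = [0] * WORDS_PER_LINE
cache_line[1] = (1 << 6) | 1        # bits 64 and 70 are marked as in use
release(cache_line, 70)             # only bit 64 remains set
```

Because the mask keeps every other bit of the word at 1, the AND leaves the neighbouring resources untouched.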
According to one embodiment, the request comprises a fetch-and-set last 0-bit request, wherein the bit array comprises E elements, and executing the request comprises performing one or more of the following actions: reading the bits of the bit array in order, starting from index E-1; if a last bit with value 0 is found in the bit array, returning the index of that last bit and setting its value to 1; otherwise, returning a predefined failure value.
According to one embodiment, the request comprises a request to count 1-bits, and executing the request comprises performing one or more of the following actions: reading each bit of the bit array; counting the bits with value 1; and returning the result of the count. Such a request is also referred to as a population count request.
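As a minimal sketch, a population count can be written for both a list-of-bits model and an integer-word model of the bit array. Both function names and both representations are illustrative assumptions rather than anything the source prescribes.

```python
def count_one_bits(bits):
    """Population count over a bit array modelled as a list of 0/1 values."""
    return sum(1 for bit in bits if bit == 1)


def count_one_bits_in_word(word):
    """Population count over a bit array packed into a non-negative integer."""
    return bin(word).count("1")
```

For a priority queue stored in the bit array, the returned count is the number of elements currently in the queue.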
According to one embodiment, the request comprises a fetch-and-set isolated 0-bit request, and executing the request comprises performing one or more of the following actions: reading the bit array and determining zero or more contiguous sequences of bits with value 0; if zero sequences are found, returning a predefined failure value; otherwise, ranking the one or more contiguous sequences by their number of bits; selecting a bit in the contiguous sequence with the fewest bits; returning the index of the selected bit; and setting the value of the selected bit to 1.
This can be advantageous because the longest possible runs of 0-bits are preserved for other concurrent operations. It is advantageous, for example, when the bit array represents contiguous instances of some resource (such as memory pages or disk blocks): preserving a run of 0-bits preserves contiguous available resources for future acquisition. Such a future acquisition can use a fetch-and-set N consecutive 0-bits request on the bit array.
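The isolated 0-bit selection can be sketched as below, with the bit array modelled as a list of 0/1 values. Two choices are assumptions of this sketch and not stated in the source: ties between equally short runs go to the run with the lowest start index, and the bit selected within the chosen run is its first bit. The failure value is again taken to be E.

```python
def zero_runs(bits):
    """Return (start, length) for every maximal run of 0-bits."""
    runs = []
    start = None
    for index, bit in enumerate(bits):
        if bit == 0 and start is None:
            start = index                       # a new run begins
        elif bit != 0 and start is not None:
            runs.append((start, index - start)) # the current run ends
            start = None
    if start is not None:
        runs.append((start, len(bits) - start)) # run reaching the array end
    return runs


def fetch_and_set_isolated_zero(bits):
    """Set one bit of the shortest 0-run to 1 and return its index, or E on failure."""
    runs = zero_runs(bits)
    if not runs:
        return len(bits)
    start, _length = min(runs, key=lambda run: run[1])
    bits[start] = 1
    return start
```

Consuming the shortest run first leaves the longer runs intact for later requests that need several contiguous resources.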
According to one embodiment, the request comprises a fetch first 1-bit request, and executing the request comprises performing one or more of the following actions: reading the bits of the bit array in order, starting from index 0; if a first bit with value 1 is found, returning the index of that bit; otherwise, returning a predefined failure value. For example, a fetch first 1-bit request can be used to identify the highest-priority element in a priority queue based on the bit array.
According to one embodiment, the request comprises a fetch-and-clear first 1-bit request, and executing the request comprises performing one or more of the following actions: reading the bits of the bit array in order, starting from index 0; if a first bit with value 1 is found, returning the index of that bit and setting its value to 0; otherwise, returning a predefined failure value. For example, a fetch-and-clear first 1-bit request can be used to identify and remove the highest-priority element in a priority queue based on the bit array.
According to one embodiment, the request comprises a fetch-and-clear last 1-bit request, wherein the bit array comprises E elements, and executing the request comprises performing one or more of the following actions: reading the bits of the bit array in order, starting from index E-1; if a last bit with value 1 is found in the bit array, returning the index of that last bit and setting its value to 0; otherwise, returning a predefined failure value. For example, a fetch-and-clear last 1-bit request can be used to identify and remove the lowest-priority element in a priority queue based on the bit array.
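The priority-queue use of these requests can be sketched as follows, assuming the bit array is a list of 0/1 values in which index 0 is the highest priority and E is the failure value. The function names are hypothetical.

```python
def fetch_and_clear_first_one(bits):
    """Remove and return the highest-priority element (lowest set index), or E."""
    for index in range(len(bits)):
        if bits[index] == 1:
            bits[index] = 0
            return index
    return len(bits)


def fetch_and_clear_last_one(bits):
    """Remove and return the lowest-priority element (highest set index), or E."""
    for index in range(len(bits) - 1, -1, -1):
        if bits[index] == 1:
            bits[index] = 0
            return index
    return len(bits)


queue = [0, 1, 0, 1, 1, 0]  # elements with priorities 1, 3 and 4 are present
highest = fetch_and_clear_first_one(queue)
lowest = fetch_and_clear_last_one(queue)
```

Inserting an element of priority p into such a queue amounts to setting bit p to 1, for example with a storeOR-style operation.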
According to one embodiment, the request comprises a fetch Nth 1-bit request, the request supplies a value for N, and executing the request comprises performing one or more of the following actions: reading the bits of the bit array in order, starting from index 0; if the Nth occurrence of a bit with value 1 is found, returning the index of that Nth 1-bit; otherwise, returning a predefined failure value. For example, if the request supplies the value 1 for N, the request is equivalent to a fetch first 1-bit request. For example, with 4 threads, a fetch Nth 1-bit request allows each thread to identify one of the first 4 elements in the priority queue.
According to one embodiment, the request comprises a fetch-and-clear Nth 1-bit request, the request supplies a value for N, and executing the request comprises performing one or more of the following actions: reading the bits of the bit array in order, starting from index 0; if the Nth occurrence of a bit with value 1 is found, returning the index of that Nth 1-bit and setting its value to 0; otherwise, returning a predefined failure value. For example, two threads that process the elements of a priority queue at different speeds may best meet the application's performance goals if the fast thread and the slow thread supply the values 1 and 2 for N, respectively.
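A fetch-and-clear Nth 1-bit request might look like the sketch below. It assumes N is 1-based, which matches the statement that N = 1 is equivalent to the first 1-bit request, and it again uses E as the failure value; the function name is hypothetical.

```python
def fetch_and_clear_nth_one(bits, n):
    """Clear the Nth 1-bit (1-based, scanning from index 0) and return its index.

    Returns len(bits), i.e. E, when the array holds fewer than n 1-bits.
    """
    seen = 0
    for index, bit in enumerate(bits):
        if bit == 1:
            seen += 1
            if seen == n:
                bits[index] = 0
                return index
    return len(bits)


queue = [1, 0, 1, 1]
slow_thread_item = fetch_and_clear_nth_one(queue, 2)  # skips the top element
fast_thread_item = fetch_and_clear_nth_one(queue, 1)  # takes the top element
```

Here the slow thread leaves the highest-priority element for the fast thread, which is the division of work the example in the text describes.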
According to one embodiment, the request comprises a fetch-and-clear isolated 1-bit request, and executing the request comprises performing one or more of the following actions: reading the bit array and determining zero or more contiguous sequences of bits with value 1; if zero sequences are found, returning a predefined failure value; otherwise, ranking the one or more contiguous sequences by their number of bits; selecting a bit in the contiguous sequence with the fewest bits; returning the index of the selected bit; and setting the value of the selected bit to 0.
According to one embodiment, the request comprises a fetch-and-set N consecutive 0-bits request, and executing the request comprises performing one or more of the following actions: reading the bit array and determining zero or more contiguous sequences of N bits with value 0; if zero sequences are found, returning a predefined failure value; otherwise, selecting, among the one or more contiguous sequences, a sequence that satisfies a predefined condition; returning the index of the first bit of the selected sequence; and setting the values of the selected N bits to 1.
This allows, for example, a thread to atomically acquire ownership of a sequence of N resources. If there are no N consecutive 0-bits in the bit array, the predefined failure value can be returned. Such a resource sequence can be used to allocate pages, inodes, disk sectors and the like.
According to one embodiment, the predefined condition is that the sequence comprises the first N consecutive bits of the bit array. According to one embodiment, the bit array can be treated as a circular buffer, which allows a contiguous sequence of N bits to be composed of a contiguous sequence at the start of the bit array and a contiguous sequence at the end of the bit array.
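The N consecutive 0-bits request, including the circular-buffer variant, can be sketched as below. Reading the predefined condition as "the qualifying sequence with the lowest start index" is an assumption of this sketch, as are the function name, the `circular` flag and the use of E as the failure value.

```python
def fetch_and_set_n_consecutive_zeros(bits, n, circular=False):
    """Set the first run of n consecutive 0-bits to 1 and return its start index.

    With circular=True the run may wrap from the end of the array to its start.
    Returns len(bits), i.e. E, when no such run exists.
    """
    e = len(bits)
    if n <= 0 or n > e:
        return e
    # Without wrap-around a run must start early enough to fit before the end.
    start_count = e if circular else e - n + 1
    for start in range(start_count):
        window = [(start + k) % e for k in range(n)]
        if all(bits[i] == 0 for i in window):
            for i in window:
                bits[i] = 1
            return start
    return e


pages = [1, 0, 0, 1, 0, 0, 0, 1]
first_page = fetch_and_set_n_consecutive_zeros(pages, 3)
```

In the circular case the returned start index can be near the end of the array, with the remaining bits of the run located at indices 0 onward.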
According to one embodiment, the request comprises a fetch-and-set N specified 0-bits request, and executing the request comprises performing one or more of the following actions: reading the bit array and looking up the specified bits at the specified indices; if at least one of the specified bits has value 1, returning a predefined failure value; if every specified bit has value 0, setting each specified bit to 1 and returning a predefined success value. This allows a thread to atomically acquire multiple locks.
For a fetch-and-set N specified 0-bits request, one or more values in the request can efficiently specify the indices of the requested 0-bits. For example, for a bit array of up to 256 elements, the request can supply an 8-byte value in which each of the 8 bytes specifies a bit index. To specify fewer than 8 bits, some bytes specify the same index. In other words, to specify S 0-bits, the 8 bytes specify S unique indices. For example, for a bit array of up to 1024 elements, the request can supply a 64-bit value whose least significant 60 bits are treated as 6 fields of 10 bits each, with each of the 6 fields specifying a bit index.
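The 8-byte encoding for a bit array of up to 256 elements can be sketched as follows. The byte order (little-endian), the padding rule (repeat the last index) and the use of True and False as the success and failure values are all assumptions of this sketch; the helper names are hypothetical.

```python
def encode_indices(indices):
    """Pack 1 to 8 bit indices (each below 256) into one 8-byte request value."""
    padded = list(indices) + [indices[-1]] * (8 - len(indices))
    return int.from_bytes(bytes(padded), "little")


def decode_indices(value):
    """Unpack the eight 1-byte fields; repeated indices collapse into one."""
    return {(value >> (8 * k)) & 0xFF for k in range(8)}


def fetch_and_set_specified_zeros(bits, value):
    """Set all specified bits to 1 only if every one of them is currently 0."""
    indices = decode_indices(value)
    if any(bits[i] != 0 for i in indices):
        return False            # at least one lock is held: change nothing
    for i in indices:
        bits[i] = 1
    return True


locks = [0] * 256
got_both = fetch_and_set_specified_zeros(locks, encode_indices([3, 7]))
```

Because the request either sets every specified bit or none of them, a thread never ends up holding only part of the lock set it asked for.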
The fetch-and-set N specified 0-bits request of the invention addresses the problem of granular locking, in which each process or thread must hold multiple locks from a shared lock set, where the number of locks in the set is at most the number of bits E in the bit array and each lock in the set can be numbered 0 to E-1. Without the invention, granular locking can create subtle lock dependencies. This subtlety increases the chance that a programmer unknowingly introduces a deadlock. Thus, without the invention, locks can be composed only with relatively complex (and costly) software support and perfect adherence by the application programmer to strict conventions (for example, managing multiple concurrent locks in order to atomically delete an entry X from table A and then insert X into table B).
If the number of objects in a set (each object being lockable) is greater than E, each object can be hashed to a number between 0 and E-1, which allows the bit array to be used at the cost of the performance lost to coarser lock granularity.
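The hashing step can be sketched as below. The source does not name a hash function; CRC-32 from Python's standard `zlib` module is used here only because it is deterministic across runs, and the function names and string object keys are hypothetical.

```python
import zlib


def lock_index(object_key, num_locks):
    """Map an object key to a lock number between 0 and num_locks - 1."""
    return zlib.crc32(object_key.encode("utf-8")) % num_locks


def lock_indices(object_keys, num_locks):
    """Lock numbers needed for a group of objects; colliding keys share a lock."""
    return sorted({lock_index(key, num_locks) for key in object_keys})


needed = lock_indices(["tableA:row17", "tableB:row17"], 256)
```

Two objects that hash to the same number share one lock, which is the coarser granularity the text refers to.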
For example, the bit array can be implemented by two or more cache lines of the memory system, where the two or more cache lines are contiguous. A request for an operation on the bit array can indicate the two or more contiguous cache lines and the location within the memory system of the first of those cache lines.
In another aspect, the invention relates to a computer program product comprising computer-executable instructions to perform the method steps of the method according to any one of the preceding embodiments.
In another aspect, the invention relates to a system for implementing a bit array in a cache line, the system comprising a memory storage and a controller. The system is configured for: configuring the bit array in the cache line, the bit array comprising an array of bits, wherein the configuring further comprises defining a value of each bit in the bit array; receiving, by the controller, a request for an operation on the bit array, wherein the request indicates the location of the cache line in the memory storage and information specifying the request; identifying one or more actions on the bit array for the operation using the information, wherein the one or more actions are encoded in the controller; and, in response to receiving the request, performing the request by executing the one or more encoded actions.
The memory system can be one level in a cache hierarchy, so that the controller executing a request can cause the memory system to access one or more lower levels of the cache hierarchy in order to establish the metadata and the elements in the memory cache. The memory system can be divided into two or more parts, and the controller in one of the parts can operate on a dynamic array data structure using the memory of that part. A cache level can be replicated as two or more units, and the controller can access the underlying cache within a cache unit or any other part of the memory hierarchy.
" computer-readable recording medium " used herein comprises any tangible storage medium that can store the instruction that can be performed by the processor of computing equipment.Described computer-readable recording medium can be called as computer-readable non-provisional storage medium.Described computer-readable recording medium also can be called as tangible computer-readable medium.In certain embodiments, can also store can by the data of the processor access of computing equipment for computer-readable recording medium.The example of computer-readable recording medium comprises-but be not limited to-floppy disk, magnetic hard disk drives, solid state hard disc, flash memory, USB thumb actuator, random access memory (RAM), ROM (read-only memory) (ROM), CD, magnetooptical disc, and the register file of processor.The example of CD comprises compact disk (CD) and digital versatile disc (DVD), such as CD-ROM, CD-RW, CD-R, DVD-ROM, DVD-RW or DVD-R dish.Term " computer-readable recording medium " also refers to polytype recording medium, and these recording mediums can be accessed via network or communication link by computer equipment.Such as, data are by modulator-demodular unit, be retrieved by the Internet or by LAN (Local Area Network).The computer-executable code that computer-readable medium comprises can to comprise with any suitable medium transmission-but be not limited to-wireless, wired, optical cable, RF etc., or the combination of above-mentioned any appropriate.
Computer-readable signal media can comprise such as in a base band or as carrier wave a part propagate data-signal, wherein carry computer-executable code.The data-signal of this propagation can adopt various ways, comprise-but be not limited to the combination of-electromagnetic signal, light signal or above-mentioned any appropriate.Computer-readable signal media can be any computer-readable medium beyond computer-readable recording medium, and this computer-readable medium can send, propagates or transmit the program for being used by instruction execution system, device or device or be combined with it.
" computer memory " or " storer " is an example of computer-readable recording medium.Computer memory is can by any storer of processor direct access." Computer Memory Unit " or " memory storage " is the another example of computer-readable recording medium.Computer Memory Unit is any non-volatile computer readable storage medium storing program for executing.In certain embodiments, Computer Memory Unit can also be computer memory, and computer memory also can be Computer Memory Unit.
" processor " used herein comprises can the electronic package of executive routine, machine-executable instruction or computer-executable code.Be interpreted as comprising more than one processor or process core to the quoting of computing equipment comprising " processor ".Described processor can be such as polycaryon processor.Processor also can refer to the processor sets in single computer systems, or refers to be distributed in the processor sets in the middle of multiple computer system.Term " computing equipment " also should be interpreted as referring to computing equipment set or network of computing devices, and each computing equipment in described computing equipment set or network of computing devices comprises one or more processor.Described computer-executable code can be performed by the multiple processors in same computing equipment, also can be performed by the multiple processors even distributed across multiple computing equipment.
Computer-executable code can comprise the machine-executable instruction or the program that make processor perform an aspect of of the present present invention.Computer-executable code for performing the operation of various aspects of the present invention can be write with the combination in any of one or more programming languages and be compiled into machine-executable instruction, described programming language comprises object oriented program language-such as Java, Smalltalk, C++ etc., also comprises conventional process type programming language-such as " C " language or similar programming language.In some cases, described computer-executable code can take the form of higher level lanquage or precompiler form, and the interpreter that can produce machine-executable instruction in conjunction with being in operation uses.
Described computer-executable code can fully perform on the user computer, partly perform on the user computer, as one, independently software package performs, partly part performs on the remote computer or performs on remote computer or server completely on the user computer.In the situation relating to remote computer, remote computer can by the network of any kind-comprise LAN (Local Area Network) (LAN) or wide area network (WAN)-be connected to subscriber computer, or, outer computer (such as utilizing ISP to pass through Internet connection) can be connected to.
Aspects of the present invention are described with reference to flowcharts and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowcharts and/or block diagrams, and combinations of such blocks, can be implemented by computer program instructions in the form of computer-executable code when applicable. It is further understood that, when not mutually exclusive, blocks in different flowcharts, illustrations and/or block diagrams may be combined. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer, other programmable data processing apparatus or other devices to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instructions which implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices, producing a computer-implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as an apparatus, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.) or an embodiment combining software and hardware aspects, all of which may generally be referred to herein as a "circuit", "module" or "system". Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer-readable media having computer-executable code embodied thereon.
It is understood that one or more of the aforementioned embodiments may be combined, as long as the combined embodiments are not mutually exclusive.
Brief description of the drawings
In the following, preferred embodiments of the invention are described in greater detail, by way of example only, with reference to the drawings, in which:
Fig. 1 illustrates a system architecture operable to execute a method for implementing a bit array in a cache line in memory;
Fig. 2 illustrates a block diagram of the storage system;
Fig. 3 is a diagram illustrating a sequence of operations on a bit array; and
Fig. 4 is a flowchart of a method for implementing a bit array in a cache line in memory.
Detailed description
In the following, like-numbered elements in the figures either designate similar elements or designate elements that perform an equivalent function. Elements that have been discussed previously will not necessarily be discussed in later figures if their function is equivalent.
Fig. 1 shows a computer system (or server) 112 in computing system 100 in the form of a general-purpose computing device. The components of computer system 112 may include, but are not limited to, one or more processors or processing units 116, a storage system 128, and a bus 118 that couples the various system components, including storage system 128, to processing unit 116.
Computer system 112 typically includes a variety of computer-system-readable media. Such media may be any available media that are accessible by computer system 112, and include both volatile and non-volatile media, removable and non-removable media.
Storage system 128 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) and/or cache memory. The storage system may include one or more active buffered memory devices. An active buffered memory device may include a plurality of memory elements (e.g., chips). It may include layers of memory that form a three-dimensional ("3D") memory device, in which individual columns of chips form vaults in communication with the processing unit 116. It may also comprise partitions that can be accessed concurrently by a plurality of processing elements, where the partitions may be any suitable memory segment, including but not limited to vaults.
Processing unit 116 may issue requests to the storage system in order to implement applications using dynamic array data structures and associated metadata.
Computer system 112 may also communicate with one or more external devices 114 such as a keyboard, a pointing device or a display 124; with one or more devices that enable a user to interact with computer system 112; and/or with any devices (e.g., network card, modem) that enable computer system 112 to communicate with one or more other computing devices. Such communication can occur via I/O interface 122. In addition, computer system 112 can communicate with one or more networks such as a local area network (LAN), a wide area network (WAN) and/or a public network (e.g., the Internet) via network adapter 120. As depicted, network adapter 120 communicates with the other components of computer system/server 112 via bus 118.
Fig. 2 shows a block diagram of storage system 128 in detail. Storage system 128 includes a controller 206 and memory storage 208. Memory storage 208 may be any suitable physical memory, such as a cache or random access memory (RAM).
Controller 206 includes a receiver 214 for requests 210 and a transmitter 216 for responses 212. The receiver 214 and the transmitter 216 are configured to communicate with bus 118, and each includes a first-in-first-out buffer. In response to a request 210, the controller 206 performs read and write accesses to memory storage 208 and may return a response 212.
Memory storage 208 includes one or more cache lines 211. A cache line 211 can be accessed as a bit array whose elements hold the bit value 0 or 1. The bit array may be implemented in the cache line 211, for example, by configuring at least a portion of the cache line so that this portion is accessed using bitwise operations.
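As an illustration only, the bitwise access described above may be sketched in Python as follows. The 128-byte line size, the left-to-right (most-significant-bit-first) numbering within each byte, and the helper names get_bit and set_bit are assumptions made for this sketch and are not taken from the embodiment.

```python
CACHE_LINE_BYTES = 128  # assumed line size: 128 bytes hold 1024 one-bit elements


def get_bit(line: bytearray, index: int) -> int:
    """Return the bit at `index`; index 0 is the leftmost bit of byte 0 (assumed order)."""
    byte, offset = divmod(index, 8)
    return (line[byte] >> (7 - offset)) & 1


def set_bit(line: bytearray, index: int, value: int) -> None:
    """Set the bit at `index` to 0 or 1 using a mask on the containing byte."""
    byte, offset = divmod(index, 8)
    mask = 1 << (7 - offset)
    if value:
        line[byte] |= mask
    else:
        line[byte] &= ~mask & 0xFF


line = bytearray(CACHE_LINE_BYTES)  # all elements initialised to 0
set_bit(line, 3, 1)
print(get_bit(line, 3), hex(line[0]))
```

In this sketch only the byte containing the addressed element is touched, so the remainder of the line may hold other data if only a portion of the cache line is configured as a bit array.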
The structure of the storage system may provide a single interface through which multiple users or threads can use the bit array, for example in a multithreaded application in which multiple threads concurrently issue requests (such as request 210) to access the cache line.
A request 210 to perform an operation on bit array 211 is executed by the controller 206 by performing the actions that correspond to the requested operation. These actions are encoded in the controller.
A request 210 from a requestor or user is received from bus 118 by the receiver 214 of controller 206. The requestor may be a thread executing an application, for example a thread performing an AMO operation. The request 210 indicates the location of the cache line in memory.
Request 210 may be any suitable request to perform an operation on the bit array 318, for example an operation request to set all elements of the bit array to 0 or to set all elements to 1. This allows the bit array to be initialized. The request may include an operation request to return the number of bits set to 1, the first 1-bit, the middle 1-bit or the first transition bit (i.e., the position where leading 0s change to 1, or leading 1s change to 0). The request may further include an operation request to return and clear the entire first blip. For example, for the array 00001101, position 4 and length 2 are returned, and the resulting array is 00000001. The middle 1-bit is the bit in the array that has the same number of 1-bits before it and after it.
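A minimal Python sketch of these query operations is given below, under stated assumptions: the bit array is modelled as a list of 0/1 integers with index 0 on the left, the function names are hypothetical, None stands in for an unspecified "not found" result, and the middle 1-bit is treated as undefined when the number of 1-bits is zero or even.

```python
def count_1_bits(bits):
    """Return the number of elements set to 1."""
    return sum(bits)


def first_transition_bit(bits):
    """Return the index of the first bit that differs from bit 0, or None."""
    for i in range(1, len(bits)):
        if bits[i] != bits[0]:
            return i
    return None


def middle_1_bit(bits):
    """Return the 1-bit with equally many 1-bits before and after it, or None.

    Assumption: None is returned when the number of 1-bits is zero or even.
    """
    ones = [i for i, b in enumerate(bits) if b == 1]
    if len(ones) % 2 == 0:
        return None
    return ones[len(ones) // 2]


def clear_first_blip(bits):
    """Clear the first run of consecutive 1-bits; return (position, length) or None."""
    if 1 not in bits:
        return None
    start = bits.index(1)
    end = start
    while end < len(bits) and bits[end] == 1:
        bits[end] = 0
        end += 1
    return start, end - start


bits = [int(c) for c in "00001101"]
print(clear_first_blip(bits))       # position and length of the cleared blip
print("".join(map(str, bits)))      # array after the blip is cleared
```

Applied to the array 00001101 from the text, the sketch reports position 4 and length 2 and leaves 00000001 behind.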
A bit or a group of bits in bit array 215 may be accessed by controller 206 in communication 218, where the access is based on the operation in request 210. In the example where request 210 is a fetch-first-0-bit request, the controller 206 sends the first bit with value 0 that was read in communication 218 to the user in response 212.
Fig. 3 is a diagram showing a sequence of operations performed on a bit array (e.g., 211). Bit array 318.1 contains a fixed number of bits. To simplify the description, bit array 318.1 is shown as the first 24 bits of the cache line. Bit positions are labeled from left to right, as shown in row 333.
These operations may be received by the controller (e.g., 206) of the storage system as concurrent requests and may be processed sequentially (e.g., one at a time) as follows. The order of the operations may be arbitrary.
Operation 301 of the sequence corresponds to fetchAndSetFirst0Bit(cacheLineAddress), which returns the index of the first bit with value 0 in bit array 318.1 and then sets that bit to 1. To this end, controller 206 may read the bit array sequentially, starting from the bit with index 00, until it finds the first bit with value 0 (i.e., index 00). The controller may then return index 00 and set the value of bit 00 to 1, producing bit array 318.2.
Operation 303 of the sequence also corresponds to fetchAndSetFirst0Bit(cacheLineAddress), which returns the index of the first bit with value 0 in bit array 318.2 and then sets that bit to 1. To this end, controller 206 may read the bit array sequentially, starting from the bit with index 00, until it finds the first bit with value 0 (i.e., index 03). The controller may then return index 03 and set the value of bit 03 to 1, producing bit array 318.3.
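Operations 301 and 303 may be sketched in Python as follows. This is a sketch under assumptions: the failure value -1 is hypothetical (the text only calls it predefined), and the 24-bit starting pattern is not reproduced from Fig. 3 but was chosen so that it is consistent with the results described for operations 301 to 311.

```python
FAIL = -1  # assumed failure value; the text only calls it "predefined"


def fetch_and_set_first_0_bit(bits):
    """Return the index of the first 0-bit and set it to 1, or FAIL if none exists."""
    for index, value in enumerate(bits):
        if value == 0:
            bits[index] = 1
            return index
    return FAIL


# Assumed contents for bit array 318.1 (24 bits, index 00 on the left).
bits = [int(c) for c in "011001000010000110111111"]
print(fetch_and_set_first_0_bit(bits))  # operation 301
print(fetch_and_set_first_0_bit(bits))  # operation 303
```

With the assumed pattern, the two calls return indices 0 and 3 in turn, mirroring the transition from bit array 318.1 to 318.3.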
Operation 305 of the sequence corresponds to fetchAndSetFirstNcontiguous0Bits(cacheLineAddress, 2), which returns the index of the first bit of the first group of 2 contiguous 0-bits in bit array 318.3 and then sets these 2 bits to 1. To this end, controller 206 may read bit array 318.3 sequentially, starting from the bit with index 00, until it finds the first 2 contiguous bits that both have value 0 (i.e., indices 06 and 07). The controller may then return index 06 and set the values of bits 06 and 07 to 1, producing bit array 318.4.
Operation 307 of the sequence corresponds to fetchAndSetFirstNcontiguous0Bits(cacheLineAddress, 4), which returns the index of the first bit of the first group of 4 contiguous 0-bits in bit array 318.4 and then sets these 4 bits to 1. To this end, controller 206 may read bit array 318.4 sequentially, starting from the bit with index 00, until it finds the first 4 contiguous bits that all have value 0 (i.e., indices 11-14). The controller may then return index 11 and set the values of the 4 bits 11-14 to 1, producing bit array 318.5.
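A possible Python sketch of operations 305 and 307 follows. The failure value -1 and the 24-bit pattern used for bit array 318.3 are assumptions chosen to be consistent with the described results; they are not taken from Fig. 3.

```python
FAIL = -1  # assumed failure value


def fetch_and_set_first_n_contiguous_0_bits(bits, n):
    """Find the first run of n contiguous 0-bits, set it to 1 and return its first index."""
    if n < 1:
        return FAIL
    run = 0  # length of the run of 0-bits ending at the current position
    for index, value in enumerate(bits):
        run = run + 1 if value == 0 else 0
        if run == n:
            start = index - n + 1
            bits[start:index + 1] = [1] * n
            return start
    return FAIL


# Assumed contents for bit array 318.3 (24 bits, index 00 on the left).
bits = [int(c) for c in "111101000010000110111111"]
print(fetch_and_set_first_n_contiguous_0_bits(bits, 2))  # operation 305
print(fetch_and_set_first_n_contiguous_0_bits(bits, 4))  # operation 307
```

With the assumed pattern, the isolated 0-bit at index 04 is skipped by both calls, which return 6 and 11 respectively.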
Operation 309 of the sequence corresponds to fetchAndSetNgiven0Bits(cacheLineAddress, 3, 0x040911). The second parameter, 3, indicates that 3 bits are to be changed from 0 to 1. The third parameter, 0x040911, indicates the 3 desired bit indices: 0x04, 0x09 and 0x11 (= 17). Controller 206 may then read the value of each of these 3 bits of bit array 318.5 and determine whether each bit has value 0. Since all three bits have value 0, the controller may return 1 to indicate success and set each of the 3 bits to 1, producing bit array 318.6.
Operation 311 of the sequence corresponds to fetchAndSetNgiven0Bits(cacheLineAddress, 2, 0x0108). The second parameter, 2, indicates that 2 bits are to be changed from 0 to 1. The third parameter, 0x0108, indicates the 2 desired bit indices 0x01 and 0x08. The controller may then read the value of each of these 2 bits of bit array 318.6 and determine whether each bit has value 0. In this case, one of the two bits has value 1. The controller may return the predefined failure value 0 to indicate the failed request, and leaves the content of bit array 318.6 unchanged.
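The all-or-nothing behaviour of operations 309 and 311 may be sketched in Python as follows. The sketch assumes that the third parameter packs one index per byte, most significant byte first, as the example values 0x040911 and 0x0108 suggest; the helper name unpack_indices and the 24-bit pattern used for bit array 318.5 are likewise assumptions.

```python
def unpack_indices(n, packed):
    """Split `packed` into n one-byte bit indices, most significant byte first."""
    return [(packed >> (8 * (n - 1 - k))) & 0xFF for k in range(n)]


def fetch_and_set_n_given_0_bits(bits, n, packed):
    """Set the n given bits to 1 and return 1 only if all of them are 0; else return 0."""
    indices = unpack_indices(n, packed)
    if any(bits[i] != 0 for i in indices):
        return 0            # failure: the bit array is left unchanged
    for i in indices:
        bits[i] = 1
    return 1                # success


# Assumed contents for bit array 318.5 (24 bits, index 00 on the left).
bits = [int(c) for c in "111101110011111110111111"]
print(fetch_and_set_n_given_0_bits(bits, 3, 0x040911))  # operation 309: bits 4, 9, 17
print(fetch_and_set_n_given_0_bits(bits, 2, 0x0108))    # operation 311: bits 1, 8
```

Checking every requested bit before modifying any of them is what keeps a failed request from altering the array. A one-byte index can address at most 256 elements, so a wider encoding would be needed for larger bit arrays.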
Fig. 4 is a flowchart of an exemplary method and system for operating a storage system (such as the storage system shown in Fig. 1) to implement a bit array in a cache line (e.g., in a single cache line) in the storage system.
In step 401, a cache line is configured in memory such that at least a portion of the cache line can be accessed as a bit array. For example, the cache line may be accessed as a bit array with 1024 elements.
In step 403, the controller receives from a user or requestor a request to perform an operation on the bit array. The request 210 provides the address location of the cache line to be operated on. The requestor issuing the request may be any suitable user, such as a thread running on a processor, a processing element included in a buffered memory vault, or a thread communicating over a network via network interface logic.
In step 405, the controller uses the information to identify one or more actions to be performed on the bit array for the operation. The one or more actions are encoded in the controller. A plurality of actions may be encoded in the controller, in which case identifying the one or more actions may comprise selecting them from the plurality of actions.
In step 407, the controller performs the one or more actions to execute the request. For example, the request may be a fetch-and-set-first-0-bit request, which returns the index of the first bit with value 0 in the bit array and then sets that bit to 1.
As shown, after step 407 the controller waits for the next request 210 and then returns to block 403.
The initial configuration step 401 may be performed once during initial configuration, while steps 403 to 407 may be repeated for each operation request that controller 206 processes.
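The flow of steps 401 to 407 may be illustrated by the following Python sketch. It is a toy model under assumptions: the class name Controller, the operation name count1Bits, the dictionary used as memory storage and the failure value -1 are all hypothetical, and a software queue stands in for the first-in-first-out receive buffer.

```python
from collections import deque


class Controller:
    """Toy controller: queues requests and serves them one at a time."""

    def __init__(self, storage):
        self.storage = storage      # step 401: address -> cache line held as a list of bits
        self.pending = deque()      # first-in-first-out buffer of received requests
        self.actions = {            # actions encoded in the controller, keyed by operation
            "fetchAndSetFirst0Bit": self._fetch_and_set_first_0_bit,
            "count1Bits": self._count_1_bits,
        }

    def receive(self, operation, address):
        """Step 403: accept a request naming the operation and the cache line address."""
        self.pending.append((operation, address))

    def process_next(self):
        """Steps 405 and 407: identify the encoded action, perform it, return the response."""
        operation, address = self.pending.popleft()
        action = self.actions[operation]
        return action(self.storage[address])

    @staticmethod
    def _fetch_and_set_first_0_bit(bits):
        for index, value in enumerate(bits):
            if value == 0:
                bits[index] = 1
                return index
        return -1                   # assumed failure value

    @staticmethod
    def _count_1_bits(bits):
        return sum(bits)


controller = Controller({0x1000: [0, 1, 1, 0]})
for op in ("fetchAndSetFirst0Bit", "fetchAndSetFirst0Bit", "count1Bits"):
    controller.receive(op, 0x1000)
print([controller.process_next() for _ in range(3)])
```

Because each request is taken from the queue and completed before the next one starts, requests that arrive concurrently from several threads never observe a partially updated bit array in this model.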
List of reference numerals
100 computing system
112 server
114 external device
116 processor
120 network adapter
122 I/O interface
124 display
128 storage system
206 controller
208 memory storage
210 request
211 cache line
212 response
213 metadata field
214 receiver
215 element field
216 transmitter
218-220 communications
301-311 operations
311 cache line
333 indices
401-407 steps

Claims (15)

1. A method for implementing a bit array (318) in a cache line (211) of a storage system (128) comprising memory storage (208) and a controller (206), the method comprising:
- configuring the bit array (318) in the cache line (211), the bit array (318) comprising an array of a plurality of bits, wherein the configuring further comprises defining the value of each bit in the bit array (318);
- receiving, by the controller (206), a request (210) to perform an operation on the bit array (318), wherein the request (210) indicates the location of the cache line (211) in the memory storage (208) and information specifying the request (210);
- using, by the controller (206), the information to identify one or more actions to be performed on the bit array (318) for the operation, wherein the one or more actions are encoded in the controller (206); and
- in response to receiving the request (210), executing the request (210) by performing the one or more encoded actions.
2. The method according to claim 1, wherein the request comprises a fetch-and-set-first-0-bit request, and executing the request comprises performing the following one or more actions:
- reading each bit of the bit array (318) in order, starting from index 0;
- if a first bit with value 0 is found, returning the index of the first found bit and setting the value of the first found bit to 1; otherwise returning a predefined failure value.
3. The method according to claim 1 or 2, wherein the request comprises a fetch-and-set-last-0-bit request, wherein the bit array comprises E elements, and executing the request comprises performing the following one or more actions:
- reading each bit of the bit array (318) in order, starting from index E-1;
- if a last bit with value 0 is found in the bit array (318), returning the index of the last bit and setting the value of the last bit to 1; otherwise returning a predefined failure value.
4. The method according to claim 1 or 2, wherein the request comprises a fetch-and-set-isolated-0-bit request, and executing the request comprises performing the following one or more actions:
- reading the bit array (318) and determining 0 or more contiguous sequences of bits having value 0;
- if 0 sequences are found, returning a predefined failure value; otherwise:
ranking the one or more contiguous sequences based on the number of bits of each contiguous sequence;
selecting a bit in the contiguous sequence having the lowest number of bits among the one or more contiguous sequences;
returning the index of the selected bit;
setting the value of the selected bit to 1.
5. The method according to claim 1 or 2, wherein the request comprises a fetch-and-clear-first-1-bit request, and executing the request comprises performing the following one or more actions:
- reading each bit of the bit array (318) in order, starting from index 0;
- if a first bit with value 1 is found, returning the index of the first found bit and setting the value of the first found bit to 0; otherwise returning a predefined failure value.
6. The method according to claim 1 or 2, wherein the request comprises a fetch-and-clear-last-1-bit request, wherein the bit array comprises E elements, and executing the request comprises performing the following one or more actions:
- reading each bit of the bit array (318) in order, starting from index E-1;
- if a last bit with value 1 is found in the bit array (318), returning the index of the last bit and setting the value of the last bit to 0; otherwise returning a predefined failure value.
7. The method according to claim 1 or 2, wherein the request comprises a fetch-and-clear-isolated-1-bit request, and executing the request comprises performing the following one or more actions:
- reading the bit array (318) and determining 0 or more contiguous sequences of bits having value 1;
- if 0 sequences are found, returning a predefined failure value; otherwise:
ranking the one or more contiguous sequences based on the number of bits of each contiguous sequence;
selecting a bit in the contiguous sequence having the lowest number of bits;
returning the index of the selected bit;
setting the value of the selected bit to 0.
8. The method according to claim 1 or 2, wherein the request comprises a fetch-and-set-N-contiguous-0-bits request, and executing the request comprises performing the following one or more actions:
- reading the bit array (318) and determining 0 or more contiguous sequences of N bits having value 0;
- if 0 sequences are found, returning a predefined failure value; otherwise:
selecting a sequence that satisfies one or a combination of predefined conditions;
returning the index of the first bit of the selected sequence of N bits;
setting the values of the selected N bits to 1.
9. The method according to claim 8, wherein the predefined condition comprises the sequence comprising the first N contiguous bits of the bit array (318).
10. The method according to claim 1 or 2, wherein the request comprises a fetch-and-set-N-given-0-bits request, and executing the request comprises performing the following one or more actions:
- reading the bit array (318) and looking up the given bits located at the given indices;
- if at least one of the given bits has value 1, returning a predefined failure value;
- if each of the given bits has value 0, setting each given bit to value 1 and returning a predefined success value.
11. The method according to claim 1 or 2, wherein the request comprises a request to count 1-bits, and executing the request comprises performing the following one or more actions:
- reading each bit of the bit array (318);
- counting the number of bits having value 1; and
- returning the result of the counting.
12. The method according to claim 1 or 2, wherein the request comprises a fetch-first-1-bit request, and executing the request comprises performing the following one or more actions:
- reading each bit of the bit array (318) in order, starting from index 0;
- if a first bit with value 1 is found, returning the index of the first found bit; otherwise returning a predefined failure value.
13. The method according to claim 1 or 2, wherein the request comprises a fetch-N-th-1-bit request, the request providing a value for N, and executing the request comprises performing the following one or more actions:
- reading each bit of the bit array (318) in order, starting from index 0;
- if the N-th occurrence of a bit with value 1 is found, returning the index of the found N-th 1-bit; otherwise returning a predefined failure value.
14. A computer program product comprising computer-executable instructions to perform the method steps of the method according to any one of the preceding claims.
15. A system for implementing a bit array (318) in a cache line (211), the system comprising memory storage (208) and a controller (206), the system being configured to:
- configure the bit array (318) in the cache line (211), the bit array (318) comprising an array of a plurality of bits, wherein the configuring further comprises defining the value of each bit in the bit array;
- receive, by the controller (206), a request (210) to perform an operation on the bit array (318), wherein the request (210) indicates the location of the cache line (211) in the memory storage (208) and information specifying the request;
- use the information to identify one or more actions to be performed on the bit array (318) for the operation, wherein the one or more actions are encoded in the controller (206); and
- in response to receiving the request (210), execute the request (210) by performing the one or more encoded actions.
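As a non-authoritative illustration, the actions recited in claims 4 and 13 may be sketched in Python as follows. The failure value -1, the choice of the first bit within the shortest sequence, the tie-breaking toward the lowest index and the function names are assumptions of this sketch; the claims leave these points open.

```python
FAIL = -1  # assumed failure value


def zero_runs(bits):
    """Return (start, length) for each maximal contiguous sequence of 0-bits."""
    runs, start = [], None
    for index, value in enumerate(bits + [1]):  # trailing sentinel closes a final run
        if value == 0 and start is None:
            start = index
        elif value != 0 and start is not None:
            runs.append((start, index - start))
            start = None
    return runs


def fetch_and_set_isolated_0_bit(bits):
    """Claim 4 sketch: set a bit of the shortest 0-sequence to 1 and return its index."""
    runs = zero_runs(bits)
    if not runs:
        return FAIL
    start, _length = min(runs, key=lambda run: run[1])  # first shortest run on ties
    bits[start] = 1
    return start


def fetch_nth_1_bit(bits, n):
    """Claim 13 sketch: return the index of the n-th 1-bit (n counted from 1), or FAIL."""
    seen = 0
    for index, value in enumerate(bits):
        if value == 1:
            seen += 1
            if seen == n:
                return index
    return FAIL


bits = [0, 0, 0, 1, 0, 1, 0, 0]
print(fetch_and_set_isolated_0_bit(bits))  # picks the single 0-bit between the two 1-bits
print(fetch_nth_1_bit(bits, 2))
```

Preferring the shortest sequence of 0-bits tends to leave longer sequences intact for later requests that need several contiguous bits.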
CN201480038914.5A 2013-07-11 2014-07-01 For realizing the method and system of bit array in cache line Expired - Fee Related CN105378686B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
GB1312446.6A GB2516092A (en) 2013-07-11 2013-07-11 Method and system for implementing a bit array in a cache line
GB1312446.6 2013-07-11
PCT/IB2014/062757 WO2015004571A1 (en) 2013-07-11 2014-07-01 Method and system for implementing a bit array in a cache line

Publications (2)

Publication Number Publication Date
CN105378686A true CN105378686A (en) 2016-03-02
CN105378686B CN105378686B (en) 2018-05-29

Family

ID=49081142

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480038914.5A Expired - Fee Related CN105378686B (en) 2013-07-11 2014-07-01 For realizing the method and system of bit array in cache line

Country Status (5)

Country Link
JP (1) JP6333371B2 (en)
CN (1) CN105378686B (en)
DE (1) DE112014003212T5 (en)
GB (2) GB2516092A (en)
WO (1) WO2015004571A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111108485A (en) * 2017-08-08 2020-05-05 大陆汽车有限责任公司 Method of operating a cache

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE112016000512T5 (en) 2015-03-26 2017-11-23 Spiration, Inc. D.B.A. Olympus Respiratory America BIOPSY SAMPLE RETAINING MECHANISM

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030014588A1 (en) * 2001-07-10 2003-01-16 Micron Technology, Inc. Caching of dynamic arrays
JP2003030051A (en) * 2001-07-19 2003-01-31 Sony Corp Data processor and data access method
CN101149704A (en) * 2007-10-31 2008-03-26 中国人民解放军国防科学技术大学 Segmental high speed cache design method in microprocessor and segmental high speed cache

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH03172947A (en) * 1989-11-13 1991-07-26 Matra Design Semiconductor Inc Microcomputer system
AU2001236793A1 (en) * 2000-02-25 2001-09-03 Sun Microsystems, Inc. Apparatus and method for maintaining high snoop traffic throughput and preventing cache data eviction during an atomic operation
US6836823B2 (en) * 2001-11-05 2004-12-28 Src Computers, Inc. Bandwidth enhancement for uncached devices
CN101689143B (en) * 2007-06-20 2012-07-04 富士通株式会社 Cache control device and control method
US8296524B2 (en) * 2009-06-26 2012-10-23 Oracle America, Inc. Supporting efficient spin-locks and other types of synchronization in a cache-coherent multiprocessor system
US8543769B2 (en) * 2009-07-27 2013-09-24 International Business Machines Corporation Fine grained cache allocation
US8566524B2 (en) * 2009-08-31 2013-10-22 International Business Machines Corporation Transactional memory system with efficient cache support
US20110219215A1 (en) * 2010-01-15 2011-09-08 International Business Machines Corporation Atomicity: a multi-pronged approach
US20120185672A1 (en) * 2011-01-18 2012-07-19 International Business Machines Corporation Local-only synchronizing operations

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111108485A (en) * 2017-08-08 2020-05-05 大陆汽车有限责任公司 Method of operating a cache
CN111108485B (en) * 2017-08-08 2023-11-24 大陆汽车科技有限公司 Method of operating a cache

Also Published As

Publication number Publication date
CN105378686B (en) 2018-05-29
JP6333371B2 (en) 2018-05-30
DE112014003212T5 (en) 2016-04-28
GB201312446D0 (en) 2013-08-28
GB2516092A (en) 2015-01-14
WO2015004571A1 (en) 2015-01-15
GB201601479D0 (en) 2016-03-09
JP2016526739A (en) 2016-09-05
GB2530962B (en) 2020-04-22
GB2530962A (en) 2016-04-06

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180529

Termination date: 20200701