CN111949648A - Memory cache data system and data indexing method - Google Patents

Info

Publication number
CN111949648A
Authority
CN
China
Prior art keywords
data
index
dimension
bitmap
memory cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910397340.5A
Other languages
Chinese (zh)
Other versions
CN111949648B (en)
Inventor
胡蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN201910397340.5A
Publication of CN111949648A
Application granted
Publication of CN111949648B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a memory cache data system and a data indexing method, and relates to the field of computer technology. In one embodiment, the system comprises raw data of a plurality of targets stored in a data container preset in a memory cache, wherein the raw data comprises data of at least one index dimension, the raw data of each target is an element of the data container, and each target has a unique subscript in the data container. The system further comprises bitmap data, one for each sub-dimension of an index dimension, for indexing the raw data; each bitmap data holds the values of the plurality of targets in the corresponding sub-dimension, arranged in a preset order based on the subscripts. The embodiment can store multi-dimensional data of targets and build an index structure using the Bitmap algorithm, achieving efficient multi-dimensional indexing of memory cache data.

Description

Memory cache data system and data indexing method
Technical Field
The invention relates to the technical field of computers, in particular to a memory cache data system and a data indexing method.
Background
In existing caching technology, data is generally stored in a hash table structure, and data is indexed by looking up the Value corresponding to a Key in the hash table. With this data structure, only one-to-one Key-to-Value lookup is possible. Indexing by multiple data dimensions is not supported, so the requirements of high-dimensional data structures and complex services cannot be met.
Disclosure of Invention
In view of this, embodiments of the present invention provide a memory cache data system and a data indexing method, which can store multi-dimensional data of a target and construct an index structure using a Bitmap algorithm, so as to implement multi-dimensional efficient indexing of memory cache data.
To achieve the above object, according to one aspect of the present invention, a memory cache data system is provided.
The memory cache data system of the embodiment of the invention may comprise raw data of a plurality of targets stored in a data container preset in a memory cache, wherein the raw data comprises data of at least one index dimension, the raw data of each target is an element of the data container, and each target has a unique subscript in the data container. The system may further comprise bitmap data, one for each sub-dimension of an index dimension, for indexing the raw data; each bitmap data holds the values of the plurality of targets in the corresponding sub-dimension, arranged in a preset order based on the subscripts.
Optionally, the system may further include a hash table for storing the hash value of each sub-dimension of the index dimension and a storage location flag of the bitmap data corresponding to the sub-dimension.
Optionally, the raw data may further include: data of at least one non-indexed dimension.
Optionally, the preset order is ascending order of the subscripts, and the sequence number of any one of the plurality of targets in the bitmap data is the same as that target's data container subscript.
Optionally, the data container is a dynamic array.
To achieve the above object, according to another aspect of the present invention, a data indexing method based on the above memory cache data system is provided.
The data indexing method of the embodiment of the invention comprises the following steps: receiving an index request and obtaining at least one index condition carried in the index request; determining, in the memory cache, at least one bitmap data corresponding to the index condition, and determining the sequence number of at least one target meeting the index condition by using the bitmap data or the result of a bit operation on the bitmap data, wherein the bit operation is determined by the logical state of the index condition in the index request; obtaining the subscript of each such target in the data container according to its sequence number; and, in response to the index request, returning the raw data pointed to by the subscript in the data container.
Optionally, the method further comprises: when the data requested by the index request is not stored in the memory cache, determining the requested data to be returned from a disk or a third-party storage system according to the obtained subscript of the target in the data container.
Optionally, the bit operation comprises at least one of: OR, AND, NOT, and XOR.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
An electronic device of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data indexing method provided by the invention.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer-readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the data indexing method provided by the present invention.
According to the technical scheme of the invention, one embodiment of the invention has the following advantages or beneficial effects:
First, a data container such as a dynamic array is used to store the multi-dimensional data of the targets, and for each sub-dimension of each index dimension, bitmap data representing the value of every target in that sub-dimension is generated. When data is indexed, the bitmap data corresponding to the index conditions is obtained and bit operations are performed to determine the required targets, so that the required data can be fetched from the data container. With this arrangement, the invention achieves efficient data indexing in the memory cache that supports multi-condition combinations and complex index formats, in a simple and low-cost manner. Index statements can be written in a manner similar to the SQL specification, so the required data is obtained quickly from the memory cache and the latency of disk or network input/output is avoided.
Second, in practical applications the number of index dimensions and sub-dimensions is large, so the volume of Bitmap data is large and a given bitmap is hard to locate. To address this, the invention builds a hash table that stores the hash value of each sub-dimension together with the storage location flag of the Bitmap data corresponding to that sub-dimension, so that a Bitmap can be located quickly through its storage location flag when queried. In addition, when the volume of service data is small, all of it can be stored in the memory cache to improve indexing efficiency; when the volume is large, only the index dimension data can be stored in the memory cache and the remaining data stored on a disk or in a third-party storage system, so that memory is used efficiently.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a memory cache data system according to an embodiment of the invention;
FIG. 2 is a schematic diagram illustrating a bitmap data generation procedure of a memory cache data system according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the main steps of a data indexing method according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating the implementation of a data indexing method according to an embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 6 is a schematic structural diagram of an electronic device for implementing the data indexing method in the embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features of the embodiments may be combined with each other without conflict.
FIG. 1 is a schematic diagram of a memory cache data system according to an embodiment of the invention.
As shown in fig. 1, the memory cache data system according to the embodiment of the present invention may include raw data of a plurality of targets stored in a data container, and an index model including bitmap data. Here, the memory cache refers to a cache provided in the memory of an electronic device; it is not the level-1 or level-2 cache of a Central Processing Unit (CPU). Taking a server memory cache as an example, the cache can store part of the hot data. When a terminal requests data, the data is first looked up in the memory cache, and only when the memory cache cannot satisfy the request is the data looked up in the storage layer (such as a disk). This effectively reduces the probability that a user request reaches the storage layer directly, and improves the response speed and processing capacity of the system.
In the embodiment of the present invention, the data container is a virtual structure that is preset in the memory cache to hold data or objects. It has a plurality of integer subscripts, each identifying one data storage location, and the data in each storage location is one element of the data container. For example, the dynamic array ArrayList in the Java programming language (an object-oriented programming language) is a container whose subscripts start from zero (0, 1, 2, 3, ...), and each storage location can hold data of integer, long integer, short integer, character, byte, Boolean, single-precision floating point, and double-precision floating point types (held in boxed form, since ArrayList stores object references), as well as reference types (e.g., classes, interfaces, and objects in Java).
In this context, a target may be a physical entity or a virtual object such as data or a concept. The raw data of a target is the data as stored in the memory cache without further processing, in contrast to the bitmap data described below. In some embodiments, the raw data of each target includes data of at least one index dimension (i.e., a dimension that can be used for indexing). Each index dimension may correspond to one index model, and the index model indexes the raw data in the data container through the bitmap data it contains.
Further, bitmap data herein refers to binary data generated according to the known Bitmap algorithm. Those skilled in the art will understand that the Bitmap algorithm stores data in units of one bit, which saves storage space, and that in a plurality of bitmap data generated from the same set of targets, bits with the same sequence number correspond to the same target. The sequence number is the index number of each bit in the bitmap data, and it increases from the high-order bit to the low-order bit. For example, for bitmap data having five bits, the sequence numbers of the five bits are 0, 1, 2, 3, 4 in order; in the three bitmap data 00000, 11111, 01010 generated from the five targets A, B, C, D, E, the bits with sequence number 0 all correspond to target A, the bits with sequence number 1 to target B, the bits with sequence number 2 to target C, the bits with sequence number 3 to target D, and the bits with sequence number 4 to target E.
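The sequence-number convention can be illustrated with a minimal Java sketch. It is not taken from the patent: the class name SequenceNumberSketch and the method column are hypothetical, and bitmap data is represented as plain strings purely for readability. The sketch reads the bit at one sequence number across the three example bitmaps, showing that the same position always refers to the same target.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: bitmaps are kept as strings so the leftmost (high-order)
// character visibly has sequence number 0, as in the text above.
final class SequenceNumberSketch {
    /** Collects the bit stored at sequence number seq in each bitmap. */
    static String column(List<String> bitmaps, int seq) {
        StringBuilder sb = new StringBuilder();
        for (String bitmap : bitmaps) {
            sb.append(bitmap.charAt(seq));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> bitmaps = Arrays.asList("00000", "11111", "01010");
        String[] targets = {"A", "B", "C", "D", "E"};
        for (int seq = 0; seq < targets.length; seq++) {
            // One line per target: its value in each of the three bitmaps.
            System.out.println(targets[seq] + " -> " + column(bitmaps, seq));
        }
    }
}
```

For target B (sequence number 1) the three bitmaps contribute the bits 0, 1, 1, one per bitmap.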
In an embodiment of the invention, the raw data of each of the plurality of targets is an element of the data container, and each target has a unique subscript in the data container. That is, the subscripts of the data container distinguish different targets and can therefore serve as an identifier for each target. During data indexing, once the subscript of a target is determined, the required data can be obtained from the data container. In fig. 1, the tabular content to the right of "raw data stored in the data container" illustrates the data container, and the subscripts 0, 1, and 2 below the tabular content distinguish the raw data of different targets.
In some embodiments, one bitmap data corresponds to one sub-dimension of one index dimension, and the bitmap data holds the values, in that sub-dimension, of a plurality of targets arranged in a preset order based on the data container subscripts. Specifically, for any index dimension, the sub-dimensions are all the distinct values that the plurality of targets take in that index dimension. For example, for the index dimension "season", if the only values the targets take in the season dimension are spring and autumn, then the sub-dimensions of the season dimension are spring and autumn. The value of a target in a sub-dimension reflects whether the target has that sub-dimension: generally, the value is one when the target has the sub-dimension and zero when it does not.
The preset order based on the data container subscripts refers to an order defined in advance from the arrangement of the subscripts, such as ascending or descending subscript order. Saying that the bitmap data includes the values of a plurality of targets, arranged in the preset order, in the corresponding sub-dimension means that the 0 and 1 values stored in the bits of the bitmap data, taken from the high-order bit to the low-order bit, are the values in that sub-dimension of the targets taken in the preset order. A mapping can therefore be established between the sequence number of a target in the bitmap data and the data container subscript of the same target. During data indexing, obtaining the sequence number of a target in the bitmap data is enough to determine its data container subscript, and the required data can then be fetched from the data container. Preferably, the bits of the bitmap data from the high-order bit to the low-order bit correspond one-to-one to the targets arranged in ascending order of data container subscript, and the total number of bits of the bitmap data equals the total number of targets, so that the sequence number of the bit corresponding to a target equals that target's data container subscript, which further simplifies the index logic.
For example, in fig. 1, the raw data of 8 targets a, b, c, ... is stored in the data container, and the bitmap data of the sub-dimensions is 00101110, 11100000, 01011010, etc. For any target, its data container subscript is the same as its sequence number in the bitmap data. For example, for target b, its data container subscript is 1, and its sequence number in the bitmap data is also 1.
In the memory cache data system comprising the index model, the efficient index of cache data can be realized by operating the bitmap data of the corresponding sub-dimension, and simultaneously, the multi-index condition combination and the complex index format can be supported.
In practical applications, an index dimension may contain many sub-dimensions, so the number of bitmaps is large and a given bitmap is hard to locate. To address this problem, in the embodiment of the present invention, a hash table may be established in the index model of each index dimension to store the mapping between each sub-dimension and the storage location flag of the corresponding bitmap data. The hash table may be the container class Hashtable or HashMap in Java. Specifically, in the hash table, the Key is the hash value of a sub-dimension and the Value is the storage location flag of the corresponding bitmap data, through which the required bitmap data can be located quickly. For example, in the index model of index dimension 1 in fig. 1, the hash table shown at the top stores the mapping between the sub-dimension hash values d, e, f and the storage location flags 00, 11, 22; once a storage location flag is obtained, the bitmap data can be located quickly through the table shown below it.
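The two-level lookup described above (sub-dimension hash value to storage location flag, then flag to bitmap data) might be sketched in Java as follows. This is a sketch under stated assumptions, not the patent's implementation: the class DimensionIndexSketch and its methods are hypothetical, java.util.BitSet stands in for bitmap data, the storage location flag is modelled as a position in a list, and String.hashCode() serves as the hash value with collisions ignored for brevity.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index model for ONE index dimension.
final class DimensionIndexSketch {
    // Key: hash value of a sub-dimension; Value: storage location flag of its bitmap.
    private final Map<Integer, Integer> locationByHash = new HashMap<>();
    // Bitmap storage; the "storage location flag" is simply a list position here (assumption).
    private final List<BitSet> bitmaps = new ArrayList<>();

    /** Registers the bitmap of a sub-dimension and returns its storage location flag. */
    int put(String subDimension, BitSet bitmap) {
        bitmaps.add(bitmap);
        int flag = bitmaps.size() - 1;
        // Note: two sub-dimensions with equal hashCode() would collide; ignored in this sketch.
        locationByHash.put(subDimension.hashCode(), flag);
        return flag;
    }

    /** Returns the bitmap of a sub-dimension, or null if none is registered. */
    BitSet find(String subDimension) {
        Integer flag = locationByHash.get(subDimension.hashCode());
        return flag == null ? null : bitmaps.get(flag);
    }
}
```

A production design would need to handle hash collisions, for example by keying the table on the sub-dimension itself and letting HashMap resolve them.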
In an actual application scenario, if the volume of hot data is small, all of the data (including index dimension data and non-index dimension data) can be stored in the memory cache to achieve efficient indexing. If the volume of hot data is large and cannot fit entirely in the memory cache, only the index dimension data of the hot data can be stored in the memory cache, and the non-index dimension data is stored on a disk or in a third-party storage system such as Redis (an open source storage system) or ElasticSearch (a search and storage system). When non-index dimension data is requested, it can be obtained from the disk or the third-party storage system using the data container subscript of the target determined from the bitmap data, so that the memory cache is used efficiently.
Fig. 2 is a schematic diagram illustrating a bitmap data generating step of the memory cache data system according to the embodiment of the present invention. As shown in fig. 2, each bitmap data in the index model may be generated according to the following steps:
1. Select the current target in ascending order of data container subscript, and consider one index dimension of the current target.
2. Obtain the sub-dimension of the current target in the current index dimension, and look up bitmap data by that sub-dimension.
3. Determine whether bitmap data corresponding to the sub-dimension exists in the index model: if so, proceed directly to the next step; otherwise, construct the bitmap data and then proceed to the next step.
4. Set the bit corresponding to the target in the bitmap data to 1. It is to be understood that, before this step is performed, the corresponding bit in the bitmap data stores zero.
5. Determine whether all index dimensions of the current target have been traversed: if so, proceed to the next step; otherwise, return to step 1 to consider the next index dimension.
6. Determine whether all targets have been traversed: if so, end the process; otherwise, return to step 1 to select the next target.
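The six steps above can be sketched as two nested loops. The Java below is a minimal illustration under assumptions that are not in the patent: the class BitmapBuildSketch is hypothetical, each target's index-dimension data is modelled as a Map from dimension name to sub-dimension, java.util.BitSet stands in for bitmap data, and bitmaps are keyed by a "dimension=subDimension" string instead of a separate index model per dimension.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BitmapBuildSketch {
    /**
     * Builds one bitmap per "dimension=subDimension" pair.
     * container.get(i) holds the index-dimension values of the target with subscript i.
     */
    static Map<String, BitSet> build(List<Map<String, String>> container) {
        Map<String, BitSet> index = new HashMap<>();
        // Outer loop: targets in ascending subscript order (steps 1 and 6).
        for (int subscript = 0; subscript < container.size(); subscript++) {
            // Inner loop: every index dimension of the current target (steps 1 and 5).
            for (Map.Entry<String, String> dim : container.get(subscript).entrySet()) {
                String key = dim.getKey() + "=" + dim.getValue();              // step 2
                BitSet bitmap = index.computeIfAbsent(key, k -> new BitSet()); // step 3
                bitmap.set(subscript);                                         // step 4
            }
        }
        return index;
    }
}
```

Because a new BitSet starts with all bits clear, the precondition in step 4 (the bit is zero before it is set) holds automatically in this sketch.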
FIG. 3 is a diagram illustrating the main steps of a data indexing method according to an embodiment of the present invention. As shown in fig. 3, the data indexing method of the memory cache data system according to the embodiment of the present invention may perform the following steps:
Step S301: Receive an index request sent by the terminal, and obtain at least one index condition carried in the index request. It is to be understood that different index conditions may be distinguished by index dimension or by sub-dimension of an index dimension.
In practical applications, an index condition in the index request may have a certain logical state. For example, if the index request contains only one index condition described in negative form (e.g., "origin is not Beijing"), the logical state of that condition is "NOT". If the index request contains multiple index conditions, the logical state of each condition is consistent with the logical relationship among them. For example, if the index request asks for data whose "origin is Beijing" and whose "type is clothes", the logical relationship between the two index conditions is "AND", and the logical state of both conditions is "AND".
In particular, because the memory cache data system supports multi-condition combinations and complex index formats, the index statement the terminal uses to issue an index request can be written in a manner similar to the Structured Query Language (SQL) specification. For example, a partial index statement may be written as: where sku and type and dcid are index dimensions.
Step S302: Determine, in the memory cache, at least one bitmap data corresponding to the index condition, and determine the sequence number of at least one target meeting the index condition by using the bitmap data or the result of a bit operation on the bitmap data; then obtain the subscript of each such target in the data container according to its sequence number. The bit operation is determined by the logical state of the index condition in the index request.
In this step, if the index request carries only one index condition and that condition has no logical state, the storage location flag of the bitmap data for the sub-dimension corresponding to the index condition is obtained from the hash table in the memory cache, and the bitmap data is then obtained using the storage location flag. In the bitmap data, each bit storing 1 corresponds to a target meeting the index condition. The sequence number of such a bit gives the corresponding data container subscript, and the data pointed to by that subscript is the data requested by the index request.
For example, suppose the index condition carried in the index request is "origin is Beijing". The storage location flag of the bitmap data for the "Beijing" sub-dimension is obtained from the index model of the "origin" dimension in the memory cache, which determines the bitmap data corresponding to the "Beijing" sub-dimension. If that bitmap data is 01011010, the sequence numbers of the targets meeting the index condition are 1, 3, 4, and 6. If the sequence number of each target is the same as its data container subscript, the data pointed to by data container subscripts 1, 3, 4, and 6 is the data requested by the index request.
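The reading of 01011010 can be checked with a few lines of Java. This is only an illustrative sketch: the class MatchingSequenceNumbers is hypothetical, and the bitmap is passed as a string so that the leftmost (high-order) bit has sequence number 0, matching the convention used in this description.

```java
import java.util.ArrayList;
import java.util.List;

final class MatchingSequenceNumbers {
    /** Returns the sequence numbers of all 1 bits, counting the leftmost bit as 0. */
    static List<Integer> of(String bitmap) {
        List<Integer> result = new ArrayList<>();
        for (int seq = 0; seq < bitmap.length(); seq++) {
            if (bitmap.charAt(seq) == '1') {
                result.add(seq);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(of("01011010")); // prints [1, 3, 4, 6]
    }
}
```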
If the index request carries a plurality of index conditions, then after the bitmap data corresponding to each index condition is determined, bit operations need to be performed on the plurality of bitmap data. Generally, the bit operation is determined by the logical state of the index conditions. Specifically, if the logical state is "OR", the bit operation is OR; if the logical state is "AND", the bit operation is AND; if the logical state is "NOT", the bit operation is either a NOT operation (possibly combined with an AND against the full-target bitmap data, which records for each bit whether it corresponds to a target) or an XOR of the bitmap data corresponding to the index condition with the full-target bitmap data.
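The mapping from logical state to bit operation might be sketched as follows, using java.util.BitSet as a stand-in for bitmap data. The class BitOpsSketch and its method names are hypothetical, and the sketch assumes bit i of a BitSet represents sequence number i. Both variants of "NOT" described above are shown; they agree whenever the condition bitmap only has bits set for existing targets.

```java
import java.util.BitSet;

final class BitOpsSketch {
    /** Parses a bit string; the leftmost character becomes sequence number 0. */
    static BitSet parse(String bits) {
        BitSet b = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') b.set(i);
        }
        return b;
    }

    /** Renders the first n sequence numbers as a bit string. */
    static String format(BitSet b, int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) sb.append(b.get(i) ? '1' : '0');
        return sb.toString();
    }

    // Each operation works on a copy so the stored bitmaps are not modified.
    static BitSet and(BitSet a, BitSet b) { BitSet r = (BitSet) a.clone(); r.and(b); return r; }
    static BitSet or(BitSet a, BitSet b)  { BitSet r = (BitSet) a.clone(); r.or(b);  return r; }

    /** "NOT" as a flip over n bits, then AND with the full-target bitmap. */
    static BitSet notViaFlip(BitSet a, BitSet allTargets, int n) {
        BitSet r = (BitSet) a.clone();
        r.flip(0, n);
        r.and(allTargets);
        return r;
    }

    /** "NOT" as XOR with the full-target bitmap. */
    static BitSet notViaXor(BitSet a, BitSet allTargets) {
        BitSet r = (BitSet) a.clone();
        r.xor(allTargets);
        return r;
    }

    public static void main(String[] args) {
        BitSet beijing = parse("01011010");
        BitSet all = parse("11111111");
        System.out.println(format(notViaXor(beijing, all), 8)); // prints 10100101
    }
}
```

Copying before each operation is a deliberate choice here, because BitSet's and, or, and xor methods mutate the receiver and the index bitmaps must stay intact for later queries.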
After the bit operation is executed, one bitmap data is obtained as its result. In this bitmap data, each bit storing 1 corresponds to a target meeting the index conditions. The sequence number of such a bit gives the corresponding data container subscript, and the data pointed to by that subscript is the data requested by the index request.
For example, suppose the index request asks for data whose "origin is Beijing" and whose "type is clothes". The bitmap data corresponding to the "Beijing" sub-dimension in the index model of the "origin" dimension and the bitmap data corresponding to the "clothes" sub-dimension in the index model of the "type" dimension are first obtained using the hash tables. An AND operation is then performed on the two bitmap data to obtain the final bitmap data, from which the required sequence numbers and data container subscripts can be determined.
Step S303: in response to the index request, the original data pointed to by the subscript in the data container is returned.
In this step, the subscripts obtained in step S302 are used to locate the required data in the data container, and that data is returned to the terminal, which completes the data indexing process.
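Returning the raw data for a result bitmap could look like the following Java sketch. It assumes, as in the preferred arrangement described earlier, that a bit's sequence number equals the data container subscript; the class ReturnBySubscriptSketch is hypothetical, a java.util.List plays the role of the data container, and java.util.BitSet plays the role of the result bitmap.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class ReturnBySubscriptSketch {
    /** Returns the container elements whose subscripts are set in the result bitmap. */
    static <T> List<T> fetch(List<T> container, BitSet result) {
        List<T> out = new ArrayList<>();
        // nextSetBit walks only the 1 bits, so cost follows the number of matches.
        for (int i = result.nextSetBit(0); i >= 0; i = result.nextSetBit(i + 1)) {
            if (i < container.size()) {   // ignore bits beyond the container (defensive)
                out.add(container.get(i));
            }
        }
        return out;
    }
}
```

For instance, with eight targets a to h at subscripts 0 to 7, an "origin is Beijing" bitmap with bits 1, 3, 4, 6 set, and a hypothetical "type is clothes" bitmap with bits 2, 4, 5, 6 set, the AND of the two has bits 4 and 6 set, and fetch returns the raw data of targets e and g.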
Preferably, in the embodiment of the present invention, part or all of the non-index dimension data may not be stored in the memory cache but on a disk or in a third-party storage system. There, the data of each target is typically associated with the data container subscript of that target, and the subscript can serve as an index field for the data. If non-index dimension data is requested, the data container subscripts obtained in step S302 are therefore used to locate the required non-index dimension data on the disk or in the third-party storage system and return it to the terminal. In this case, the returned data may include index dimension data returned from the memory cache and non-index dimension data returned from the disk or the third-party storage system.
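A minimal Java sketch of this two-tier return is given below. Everything in it is an assumption made for illustration: the class TieredFetchSketch is hypothetical, the external store is reduced to an IntFunction keyed by data container subscript (a real disk, Redis, or ElasticSearch client is not modelled), and rows are plain strings.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.function.IntFunction;

final class TieredFetchSketch {
    /**
     * cachedIndexData.get(i): index-dimension data held in the memory cache for subscript i.
     * externalLookup: hypothetical stand-in for a disk or third-party store keyed by subscript.
     * matches: result bitmap from step S302; bit i set means subscript i is requested.
     */
    static List<String> fetch(List<String> cachedIndexData,
                              IntFunction<String> externalLookup,
                              BitSet matches) {
        List<String> rows = new ArrayList<>();
        for (int i = matches.nextSetBit(0); i >= 0; i = matches.nextSetBit(i + 1)) {
            // Combine the cached part with the part fetched by subscript from outside.
            rows.add(cachedIndexData.get(i) + " | " + externalLookup.apply(i));
        }
        return rows;
    }
}
```

Only the matching subscripts trigger an external lookup, so the slower tier is consulted once per result row and not once per stored target.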
Fig. 4 is a schematic diagram of an execution flow of the data indexing method according to the embodiment of the present invention. The specific steps it shows have been described above and are not repeated here. In addition, the memory cache data system and the data indexing method described above can be applied to various electronic devices and application scenarios, and are not limited to servers or to server memory indexing.
In the technical scheme of the embodiment of the invention, a data container such as a dynamic array is used to store the multi-dimensional data of the targets, and for each sub-dimension of each index dimension, bitmap data representing the value of every target in that sub-dimension is generated. When data is indexed, the bitmap data corresponding to the index conditions is obtained and bit operations are performed to determine the required targets, so that the required data can be fetched from the data container. With this arrangement, the invention achieves efficient data indexing in the memory cache that supports multi-condition combinations and complex index formats, in a simple and low-cost manner. Index statements can be written in a manner similar to the SQL specification, so the required data is obtained quickly from the memory cache and the latency of disk or network input/output is avoided. Meanwhile, because the number of index dimensions and sub-dimensions is large, the volume of bitmap data is large and a given bitmap is hard to locate. To address this, the invention builds a hash table that stores the hash value of each sub-dimension together with the storage location flag of the bitmap data corresponding to that sub-dimension, so that a bitmap can be located quickly through its storage location flag when queried. In addition, when the volume of service data is small, all of it can be stored in the memory cache to improve indexing efficiency; when the volume is large, only the index dimension data can be stored in the memory cache and the remaining data stored on a disk or in a third-party storage system, so that memory is used efficiently.
It should be noted that, for the convenience of description, the foregoing method embodiments are described as a series of acts, but those skilled in the art will appreciate that the present invention is not limited by the order of acts described, and that some steps may in fact be performed in other orders or concurrently. Moreover, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred and that no acts or modules are necessarily required to implement the invention.
FIG. 5 illustrates an exemplary system architecture 500 to which the data indexing method of embodiments of the present invention may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505 (this architecture is merely an example, and the components included in a particular architecture may be adapted according to application specific circumstances). The network 504 serves to provide a medium for communication links between the terminal devices 501, 502, 503 and the server 505. Network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. Various client applications, such as a search-class application (for example only), may be installed on the terminal devices 501, 502, 503.
The terminal devices 501, 502, 503 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 505 may be a server providing various services, such as a metadata server (for example only) providing support for search-type applications operated by users with the terminal devices 501, 502, 503. The metadata server may process the received index request and feed back the processing results (e.g., data responsive to the index request, by way of example only) to the terminal devices 501, 502, 503.
It should be noted that the data indexing method provided by the embodiment of the present invention is generally executed by the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The invention also provides an electronic device. The electronic device of the embodiment of the invention comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data indexing method provided by the invention.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use with the electronic device implementing an embodiment of the present invention. The electronic device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601 that can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the computer system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as necessary, so that a computer program read from it can be installed into the storage section 608 as necessary.
In particular, according to embodiments of the present invention, the processes described in the main step diagrams above may be implemented as computer software programs. For example, embodiments of the invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the main step diagram. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When executed by the central processing unit 601, the computer program performs the above-described functions defined in the system of the present invention.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to perform steps comprising: receiving an index request, and acquiring at least one index condition carried in the index request; determining, in a memory cache, at least one bitmap data corresponding to the index condition, and determining a sequence number of at least one target meeting the index condition by using the bitmap data or a result of a bit operation on the bitmap data, wherein the bit operation is determined by a logic state of the index condition in the index request; acquiring the subscript of the target in the data container according to the sequence number; and returning, in response to the index request, the original data pointed to by the subscript in the data container.
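The indexing steps listed above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the tuple-based expression format, the names `evaluate`, `subscripts_of`, and `handle_index_request`, and the sample bitmaps are all hypothetical, and the sequence number of a target in a bitmap is assumed to equal its subscript in the data container.

```python
import operator

# Hypothetical data container and prebuilt bitmaps; bit i <-> subscript i.
container = ["rec0", "rec1", "rec2", "rec3"]
bitmaps = {
    ("city", "beijing"): 0b0101,
    ("city", "shanghai"): 0b1010,
    ("level", 2): 0b0110,
}

BIT_OPS = {"and": operator.and_, "or": operator.or_, "xor": operator.xor}


def evaluate(expr, bitmaps, size):
    """Evaluate an index expression to a result bitmap.

    expr is a (dimension, value) key, ("not", expr), or (op, expr, expr)
    with op in BIT_OPS. Dimensions named like an operator are not supported
    in this sketch.
    """
    if expr[0] == "not":
        full = (1 << size) - 1  # mask limiting NOT to existing targets
        return full & ~evaluate(expr[1], bitmaps, size)
    if expr[0] in BIT_OPS:
        left = evaluate(expr[1], bitmaps, size)
        right = evaluate(expr[2], bitmaps, size)
        return BIT_OPS[expr[0]](left, right)
    return bitmaps.get(expr, 0)  # unknown sub-dimension matches nothing


def subscripts_of(bitmap):
    """Return the positions of the set bits in ascending order."""
    result, position = [], 0
    while bitmap:
        if bitmap & 1:
            result.append(position)
        bitmap >>= 1
        position += 1
    return result


def handle_index_request(expr):
    """Resolve the expression and return the matching original data."""
    hits = evaluate(expr, bitmaps, len(container))
    return [container[i] for i in subscripts_of(hits)]


print(handle_index_request(("and", ("city", "beijing"), ("level", 2))))
```

The logical connective between index conditions selects the bit operation, so a condition such as "city is beijing AND level is 2" costs one bitwise AND regardless of how many targets are stored.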
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A memory cache data system, comprising: a data container preset in a memory cache and storing original data of a plurality of targets; wherein
the original data comprises data of at least one index dimension; the original data of each of the plurality of targets is an element of the data container, and each target has a unique subscript in the data container;
the system further comprises: bitmap data corresponding to each sub-dimension of the index dimension, for indexing the original data; and each bitmap data comprises the values of the plurality of targets in the sub-dimension corresponding to that bitmap data, arranged in a preset order based on the subscripts.
2. The system of claim 1, further comprising:
a hash table for storing the hash value of each sub-dimension of the index dimension and the storage position mark of the bitmap data corresponding to that sub-dimension.
3. The system of claim 1, wherein the original data further comprises: data of at least one non-indexed dimension.
4. The system of claim 1, wherein the order is an ascending order of the subscripts, and the sequence number of any one of the plurality of targets in the bitmap data is the same as the subscript of that target in the data container.
5. The system of any of claims 1-4, wherein the data container is a dynamic array.
6. A data indexing method based on the memory cache data system of any one of claims 1 to 5, comprising:
receiving an index request, and acquiring at least one index condition carried in the index request;
determining, in the memory cache, at least one bitmap data corresponding to the index condition, and determining a sequence number of at least one target meeting the index condition by using the bitmap data or a result of a bit operation on the bitmap data; acquiring the subscript of the target in the data container according to the sequence number; wherein the bit operation is determined by a logic state of the index condition in the index request; and
returning, in response to the index request, the original data pointed to by the subscript in the data container.
7. The method of claim 6, further comprising:
when the data requested by the index request is not stored in the memory cache, determining the requested data to be returned from a disk or a third-party storage system according to the acquired subscript of the target in the data container.
8. The method of claim 6 or 7, wherein the bit operation comprises at least one of: OR, AND, NOT, and XOR.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs which,
when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 6-8.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 6-8.
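The fallback recited in claim 7 can be illustrated with a short sketch. Everything here is an assumption made for illustration: the names `memory_cache`, `secondary_store`, and `fetch`, the sample fields, and the use of a dict keyed by subscript as a stand-in for a disk or third-party storage system.

```python
# In-memory part: only the index-dimension fields of each target
# (hypothetical data); the list index is the subscript.
memory_cache = [
    {"city": "beijing"},
    {"city": "shanghai"},
    {"city": "beijing"},
]

# Stand-in for a disk or third-party storage system, keyed by the same
# subscript so that an index hit can be resolved without a second search.
secondary_store = {
    0: {"city": "beijing", "detail": "full record 0"},
    1: {"city": "shanghai", "detail": "full record 1"},
    2: {"city": "beijing", "detail": "full record 2"},
}


def fetch(subscript, requested_fields):
    """Return the requested fields, reading secondary storage only if needed."""
    cached = memory_cache[subscript]
    if all(field in cached for field in requested_fields):
        return {field: cached[field] for field in requested_fields}
    full_record = secondary_store[subscript]
    return {field: full_record[field] for field in requested_fields}


print(fetch(0, ["city"]))            # served from the memory cache
print(fetch(2, ["city", "detail"]))  # falls back to secondary storage
```

Keeping the subscript as the shared key is what lets the bitmap lookup stay entirely in memory even when most of each record lives elsewhere.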
CN201910397340.5A 2019-05-14 2019-05-14 Memory data caching system and data indexing method Active CN111949648B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910397340.5A CN111949648B (en) 2019-05-14 2019-05-14 Memory data caching system and data indexing method

Publications (2)

Publication Number Publication Date
CN111949648A true CN111949648A (en) 2020-11-17
CN111949648B CN111949648B (en) 2024-03-01

Family

ID=73335385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910397340.5A Active CN111949648B (en) 2019-05-14 2019-05-14 Memory data caching system and data indexing method

Country Status (1)

Country Link
CN (1) CN111949648B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102402617A (en) * 2011-12-23 2012-04-04 天津神舟通用数据技术有限公司 Easily compressed database index storage system using fragments and sparse bitmap, and corresponding construction, scheduling and query processing methods
CN105960637A (en) * 2013-11-28 2016-09-21 英特尔公司 Techniques for block-based indexing
US20160026579A1 (en) * 2014-07-22 2016-01-28 Lsi Corporation Storage Controller and Method for Managing Metadata Operations in a Cache
CN104866608A (en) * 2015-06-05 2015-08-26 中国人民大学 Query optimization method based on join index in data warehouse
US9489410B1 (en) * 2016-04-29 2016-11-08 Umbel Corporation Bitmap index including internal metadata storage
CN106874437A (en) * 2017-02-04 2017-06-20 中国人民大学 The internal storage data warehouse ranks storage conversion implementation method of data base-oriented all-in-one

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李高超;李卢毓海;刘梦雅;刘燕兵;: "基于二级索引结构的图压缩算法", 通信学报, no. 06 *
程鹏;: "位图索引技术及其研究综述", 科技信息, no. 26 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113343191A (en) * 2021-08-04 2021-09-03 广东南方电信规划咨询设计院有限公司 Network information security protection method and system
CN113343191B (en) * 2021-08-04 2022-05-27 广东南方电信规划咨询设计院有限公司 Network information security protection method and system
WO2023226277A1 (en) * 2022-05-27 2023-11-30 深圳技术大学 Internet of things gateway data transmission method and device, server, client and storage medium

Also Published As

Publication number Publication date
CN111949648B (en) 2024-03-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant