CN111949648B - Memory data caching system and data indexing method - Google Patents


Publication number
CN111949648B
Authority
CN
China
Prior art keywords
data
index
dimension
bitmap
target
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910397340.5A
Other languages
Chinese (zh)
Other versions
CN111949648A (en
Inventor
胡蓉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Wodong Tianjun Information Technology Co Ltd
Application filed by Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN201910397340.5A
Publication of CN111949648A
Application granted
Publication of CN111949648B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2228: Indexing structures
    • G06F16/2255: Hash tables
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures
    • G06F16/2282: Tablespace storage structures; Management thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a memory data caching system and a data indexing method, and relates to the technical field of computers. One embodiment of the system comprises: raw data of a plurality of targets stored in a data container preset in a memory cache, wherein the raw data comprises data of at least one index dimension, the raw data of each target is an element of the data container, and each target has a unique subscript in the data container. The system further comprises bitmap data, corresponding to each sub-dimension of the index dimension, for indexing the raw data; each bitmap data comprises the values of the plurality of targets in the corresponding sub-dimension, arranged in a preset order based on the subscripts. The embodiment can store multi-dimensional data of the targets and construct an index structure using the Bitmap algorithm, thereby achieving efficient multi-dimensional indexing of the data in the memory cache.

Description

Memory data caching system and data indexing method
Technical Field
The invention relates to the technical field of computers, in particular to a memory data caching system and a data indexing method.
Background
In conventional caching technology, data is generally stored in a hash table structure, and indexing consists of looking up the Value that corresponds to a given Key in the hash table. Such a data structure supports only one-to-one indexing between Key and Value and cannot index by multiple data dimensions, so it cannot support high-dimensional data structures or complex service requirements.
Disclosure of Invention
In view of this, embodiments of the invention provide a memory cache data system and a data indexing method that can store multi-dimensional data of a target and construct an index structure using the Bitmap algorithm, thereby achieving efficient multi-dimensional indexing of the data in the memory cache.
To achieve the above object, according to one aspect of the present invention, a memory cache data system is provided.
The memory cache data system according to the embodiment of the invention may comprise: raw data of a plurality of targets stored in a data container preset in a memory cache, wherein the raw data comprises data of at least one index dimension, the raw data of each target is an element of the data container, and each target has a unique subscript in the data container. The system may further comprise bitmap data, corresponding to each sub-dimension of the index dimension, for indexing the raw data; each bitmap data comprises the values of the plurality of targets in the corresponding sub-dimension, arranged in a preset order based on the subscripts.
Optionally, the system may further include a hash table for storing a hash value of each sub-dimension of the index dimension and a storage location flag of bitmap data corresponding to the sub-dimension.
Optionally, the raw data may further include: data of at least one non-index dimension.
Optionally, the preset order is the ascending order of the subscripts, and the sequence number of any one of the plurality of targets in the bitmap data is the same as that target's data container subscript.
Optionally, the data container is a dynamic array.
In order to achieve the above object, according to another aspect of the present invention, a data indexing method based on the memory cache data system is provided.
The data indexing method of the embodiment of the invention comprises the following steps: receiving an index request and acquiring at least one index condition carried in the index request; determining, in the memory cache, at least one bitmap data corresponding to the index condition, and determining the sequence number of at least one target that meets the index condition by using the bitmap data or the result of a bit operation on the bitmap data, wherein the bit operation is determined by the logical state of the index condition in the index request; obtaining the subscript of the target in the data container according to the sequence number; and, in response to the index request, returning the raw data pointed to by the subscript in the data container.
Optionally, the method further comprises: when the data requested by the index request is not stored in the memory cache, locating the requested data in a disk or a third-party storage system according to the obtained subscript of the target in the data container, and returning it from there.
Optionally, the bit operation includes at least one of: an OR operation, an AND operation, a NOT operation, and an exclusive OR (XOR) operation.
To achieve the above object, according to still another aspect of the present invention, there is provided an electronic apparatus.
An electronic apparatus of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data indexing method provided by the invention.
To achieve the above object, according to still another aspect of the present invention, there is provided a computer-readable storage medium.
A computer readable storage medium of the present invention has stored thereon a computer program which, when executed by a processor, implements the data indexing method provided by the present invention.
According to the technical scheme of the invention, one embodiment of the invention has the following advantages or beneficial effects:
First, a data container such as a dynamic array is used to store the multi-dimensional data of the targets, and bitmap data representing the value of each target in each sub-dimension of each index dimension is generated. When data is indexed, bit operations are performed on the bitmap data corresponding to the index conditions to determine the required targets, and the required data is then read from the data container. With this arrangement, the invention achieves, simply and at low cost, efficient data indexing in the memory cache that supports combinations of multiple conditions and complex index formats. Index statements can be written in a manner similar to the SQL (Structured Query Language) specification, so the required data is obtained quickly from the memory cache and the latency of disk or network input/output is avoided.
In practical applications, the number of index dimensions and of their sub-dimensions is large, so the amount of Bitmap data is large and individual bitmaps are hard to locate. To solve this problem, the invention constructs a hash table that stores the hash value of each sub-dimension together with the storage location flag of the Bitmap data corresponding to that sub-dimension, so that a Bitmap can be located quickly through its storage location flag when queried. In addition, when the volume of service data is small, all of it can be stored in the memory cache to improve indexing efficiency; when the volume is large, the index dimension data can be stored in the memory cache and the remaining data in a disk or a third-party storage system, which makes efficient use of memory.
Further effects of the optional features described above are explained below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a memory cache data system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a bitmap data generating step of a memory cache data system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of main steps of a data indexing method according to an embodiment of the present invention;
FIG. 4 is a flow chart illustrating the execution of a data indexing method according to an embodiment of the present invention;
FIG. 5 is an exemplary system architecture diagram in which embodiments in accordance with the present invention may be applied;
FIG. 6 is a schematic structural diagram of an electronic device for implementing a data indexing method according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings, in which various details of the embodiments of the present invention are included to facilitate understanding, and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It should be noted that the embodiments of the present invention and the technical features in the embodiments may be combined with each other provided they do not conflict.
FIG. 1 is a schematic diagram of a memory cache data system according to an embodiment of the invention.
As shown in FIG. 1, a memory cache data system according to an embodiment of the present invention may include: raw data of a plurality of targets stored in a data container, and an index model including bitmap data. Specifically, the memory cache refers to a cache disposed in the memory of the electronic device, and not to the first-level cache, second-level cache, or the like of the central processing unit (CPU). Taking a server memory cache as an example, it can store part of the hot spot data: when a terminal requests data, the data is first looked up in the memory cache, and the storage layer (e.g., a disk) is searched only if the memory cache cannot satisfy the request. This effectively reduces the probability that a user request reaches the storage layer directly, and improves the response speed and processing capacity of the system.
In the embodiment of the present invention, the data container refers to a virtual device preset in the memory cache for holding data or objects. The virtual device has a plurality of integer subscripts that identify its data storage locations, and the data in each data storage location is an element of the data container. For example, the dynamic array ArrayList in Java (an object-oriented programming language) is such a container: its subscripts are integers starting from zero (0, 1, 2, 3, and so on), and each data storage location may store data of integer, long integer, short integer, character, byte, boolean, single-precision floating point, double-precision floating point, or reference types (such as classes, interfaces, and objects in Java).
In this context, a target may be an entity, or a virtual object such as data or a concept. The data of a target stored in the memory cache is called raw data to distinguish it from the bitmap data described later. In some embodiments, the raw data of each target includes data of at least one index dimension (i.e., a dimension that can be used for indexing). Each index dimension can correspond to an index model, and the bitmap data contained in the index model is used to index the raw data in the data container.
Further, bitmap data herein refers to binary data generated according to the known Bitmap algorithm. Those skilled in the art will appreciate that the Bitmap algorithm stores data in units of bits, which saves storage space, and that across multiple bitmap data generated from the same set of targets, bits with the same sequence number correspond to the same target. The sequence number is the position of a bit within the bitmap data, counted upward from the high-order bit to the low-order bit. For example, for bitmap data having five bits, the sequence numbers of the five bits are 0, 1, 2, 3, 4 in order; in each of the three bitmap data 00000, 11111, and 01010 generated from the five targets A, B, C, D, E, the bit with sequence number 0 corresponds to target A, the bit with sequence number 1 to target B, the bit with sequence number 2 to target C, the bit with sequence number 3 to target D, and the bit with sequence number 4 to target E.
In an embodiment of the present invention, the raw data of each of a plurality of targets is an element of the data container, and each target has a unique subscript in the data container. That is, the subscripts of the data container separate the different targets and can therefore serve as the identifier of each target. When data is indexed, the required data can be read from the data container once the subscript of the target has been determined. In FIG. 1, the table to the right of "original data stored in data container" illustrates the data container, and the subscripts 0, 1, 2 below the table separate the raw data of different targets.
In some embodiments, each bitmap data corresponds to one sub-dimension of an index dimension and contains the values of the plurality of targets in that sub-dimension, arranged in a preset order based on the data container subscripts. Specifically, for any index dimension, the sub-dimensions are the distinct values that the plurality of targets take in that index dimension. For example, for the index dimension "season", if the values taken by the targets in the season dimension are spring and autumn, then the sub-dimensions of the season dimension are spring and autumn. The value of a target in a sub-dimension reflects whether the target has that sub-dimension: generally, the value is one when the target has the sub-dimension and zero when it does not.
The preset order based on the data container subscripts is an order fixed in advance according to an arrangement rule of the subscripts, for example ascending or descending order. Saying that bitmap data contains the values of multiple targets in the corresponding sub-dimension, arranged in the preset order, means that the 0 and 1 values stored in the bits of the bitmap data, read from the high-order bit to the low-order bit, are the values in that sub-dimension of the targets taken in that order. This establishes a mapping between a target's sequence number in the bitmap data and its data container subscript: during indexing, obtaining the sequence number of a target in the bitmap data is enough to determine its data container subscript, and the required data can then be read from the data container. In a preferred embodiment, the bits of the bitmap data from high to low correspond one-to-one to the targets arranged in ascending order of data container subscript, and the total number of bits equals the total number of targets, so that the sequence number of the bit for a given target is the same as that target's data container subscript, which further simplifies the index logic.
For example, in FIG. 1, the raw data of 8 targets a, b, c, ... is stored in a data container, and the subscripts of the 8 targets are 0, 1, 2, 3, ..., 7 in order. In the index model of index dimension 1, the bits of each bitmap data from high to low store the values of a, b, c, ... in the corresponding sub-dimension, which yields 8-bit bitmap data for the different sub-dimensions: 00101110, 11100000, 01011010, etc. For any target, its data container subscript is the same as its sequence number in the bitmap data. For example, target b has data container subscript 1, and its sequence number in the bitmap data is also 1.
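The correspondence between sequence numbers and data container subscripts in the FIG. 1 example can be sketched in Java as follows. This is only a minimal illustration under stated assumptions, not the implementation of the invention: the class `BitmapLayoutSketch`, its methods, and the target names d through h are hypothetical, and `java.util.BitSet` stands in for the bitmap data, with bit i of the `BitSet` holding the bit whose sequence number is i.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical sketch: sequence number in the bitmap == subscript in the data container.
final class BitmapLayoutSketch {

    // Reads a string such as "00101110" from left (high-order) to right (low-order);
    // the character at position i becomes the bit with sequence number i.
    static BitSet parse(String bits) {
        BitSet bitmap = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                bitmap.set(i);
            }
        }
        return bitmap;
    }

    // Returns the container elements whose subscripts equal the sequence numbers of the set bits.
    static <T> List<T> targetsIn(List<T> container, BitSet bitmap) {
        List<T> hits = new ArrayList<>();
        for (int i = bitmap.nextSetBit(0); i >= 0; i = bitmap.nextSetBit(i + 1)) {
            hits.add(container.get(i));
        }
        return hits;
    }
}
```

With a container holding a through h at subscripts 0 through 7, the bitmap 00101110 selects the elements at subscripts 2, 4, 5 and 6. Because the two numberings coincide, no separate translation table between sequence numbers and subscripts is needed.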
In a memory cache data system comprising such an index model, cached data can be indexed efficiently by operating on the bitmap data of the relevant sub-dimensions, and combinations of multiple index conditions and complex index formats can be supported.
In a specific application, an index dimension often contains many sub-dimensions, so the number of bitmaps is large and locating a particular one is difficult. To address this problem, in an embodiment of the present invention, a hash table may be built in the index model of each index dimension to store the mapping between each sub-dimension and the storage location flag of the corresponding bitmap data. The hash table can be the container class Hashtable or HashMap in Java. Specifically, in the hash table, the Key is the hash value of a sub-dimension and the Value is the storage location flag of the corresponding bitmap data, through which the required bitmap data can be located quickly. For example, in the index model of index dimension 1 in FIG. 1, the upper hash table stores the mapping between the sub-dimension hash values d, e, f and the storage location flags 00, 11, 22; once a storage location flag has been obtained, the bitmap data can be located quickly through the lower table.
In an actual application scenario, if the volume of hot spot data is small, all of it (both index dimension data and non-index dimension data) can be stored in the memory cache to achieve efficient indexing. If the volume of hot spot data is large and cannot be held entirely in the memory cache, the index dimension data can be stored in the memory cache and the non-index dimension data in a disk or a third-party storage system such as Redis (an open source storage system) or Elasticsearch (a search and storage system). When non-index dimension data is requested, it can be fetched from the disk or the third-party storage system using the data container subscript of the target determined from the bitmap data, which makes efficient use of the memory cache.
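The two-level lookup described above (sub-dimension to storage location flag, then flag to bitmap data) might be sketched as below. This is an illustrative assumption rather than the structure of the invention: the class `BitmapLocatorSketch` is hypothetical, the storage location flag is modeled simply as a position in an `ArrayList`, and the hashing of the sub-dimension is left to `HashMap`, which hashes its keys internally.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index model of one index dimension: hash table on top, bitmap store below.
final class BitmapLocatorSketch {
    // Sub-dimension -> storage location flag (assumed here to be a list position).
    private final Map<String, Integer> locationBySubDimension = new HashMap<>();
    // Storage location flag -> bitmap data.
    private final List<BitSet> bitmapStore = new ArrayList<>();

    // Registers or replaces the bitmap of a sub-dimension.
    void put(String subDimension, BitSet bitmap) {
        Integer location = locationBySubDimension.get(subDimension);
        if (location == null) {
            bitmapStore.add(bitmap);
            locationBySubDimension.put(subDimension, bitmapStore.size() - 1);
        } else {
            bitmapStore.set(location, bitmap);
        }
    }

    // Looks up the flag first, then the bitmap; returns null for an unknown sub-dimension.
    BitSet find(String subDimension) {
        Integer location = locationBySubDimension.get(subDimension);
        return location == null ? null : bitmapStore.get(location);
    }
}
```

Keeping the flag as an indirection means the bitmap store can be laid out independently of the hash table, at the cost of one extra lookup per query.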
Fig. 2 is a schematic diagram illustrating a bitmap data generating step of the memory cache data system according to an embodiment of the present invention. As shown in fig. 2, each bitmap data in the index model may be generated according to the following steps:
1. Select the current target in ascending order of data container subscript, and consider one index dimension of the current target.
2. Acquire the sub-dimension of the current target in the current index dimension, and query the bitmap data by that sub-dimension.
3. Judge whether bitmap data corresponding to the sub-dimension exists in the index model: if yes, go directly to the next step; otherwise, construct the bitmap data and then go to the next step.
4. Set the bit corresponding to the current target in the bitmap data to 1. It will be appreciated that the corresponding bit in the bitmap data is zero before this step.
5. Judge whether all index dimensions of the current target have been traversed: if yes, go to the next step; otherwise, return to step 1 to consider the next index dimension.
6. Judge whether all targets have been traversed: if yes, end the flow; otherwise, return to step 1 to select the next target.
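The six steps above can be sketched as a pair of nested loops. The following is a minimal sketch under stated assumptions: the class `BitmapBuildSketch`, the helper `target`, the dimension names `origin` and `type`, and the modeling of each target's raw data as a `Map` of dimension name to value are all hypothetical choices made for illustration, and targets lacking a value in a dimension are simply skipped.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of bitmap generation: index dimension -> (sub-dimension -> bitmap).
final class BitmapBuildSketch {

    // Convenience helper that builds the raw data of one target (illustrative only).
    static Map<String, String> target(String origin, String type) {
        Map<String, String> raw = new HashMap<>();
        raw.put("origin", origin);
        raw.put("type", type);
        return raw;
    }

    static Map<String, Map<String, BitSet>> build(List<Map<String, String>> container,
                                                  List<String> indexDimensions) {
        Map<String, Map<String, BitSet>> models = new HashMap<>();
        // Steps 1 and 6: visit targets in ascending order of data container subscript.
        for (int subscript = 0; subscript < container.size(); subscript++) {
            Map<String, String> raw = container.get(subscript);
            // Steps 1 and 5: visit every index dimension of the current target.
            for (String dimension : indexDimensions) {
                // Step 2: the sub-dimension is the target's value in this dimension.
                String subDimension = raw.get(dimension);
                if (subDimension == null) {
                    continue; // assumption: a missing value sets no bit
                }
                // Step 3: reuse the bitmap if it exists, otherwise construct an all-zero one.
                BitSet bitmap = models
                        .computeIfAbsent(dimension, d -> new HashMap<>())
                        .computeIfAbsent(subDimension, s -> new BitSet(container.size()));
                // Step 4: set the bit whose sequence number equals the subscript.
                bitmap.set(subscript);
            }
        }
        return models;
    }
}
```

For three targets with (origin, type) equal to (Beijing, clothes), (Shanghai, clothes) and (Beijing, shoes), the bitmap for origin Beijing has bits 0 and 2 set and the bitmap for type clothes has bits 0 and 1 set.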
FIG. 3 is a schematic diagram showing main steps of a data indexing method according to an embodiment of the present invention. As shown in fig. 3, the data indexing method of the memory cache data system according to the embodiment of the invention may execute the following steps:
Step S301: receive an index request sent by the terminal, and acquire at least one index condition carried in the index request. It is understood that different index conditions may be distinguished by index dimension or by sub-dimension of an index dimension.
In practice, an index condition in the index request may have a certain logical state. For example, if the index request contains only one index condition and it is stated in negative form (e.g., "the place of origin is not Beijing"), then the logical state of the index condition is "not". If the index request contains multiple index conditions, the logical state of the index conditions follows their logical relationship. For example, if the index request asks for data whose place of origin is "Beijing" and whose type is "clothes", the logical relationship between "place of origin is Beijing" and "type is clothes" is "and", so the logical state of both index conditions is "and".
In particular, since the memory cache data system supports combinations of multiple conditions and complex index formats, the index statement with which the terminal issues the index request can be written in a manner similar to the specification of SQL (Structured Query Language). For example, part of an index statement may be written as: where sku = 'xxx' and type = 'xxxx' or dcid = 'xxxxx', where sku, type and dcid are index dimensions.
Step S302: determining at least one bitmap data corresponding to the index condition in the memory cache, and determining a sequence number of at least one target meeting the index condition by using the bitmap data or a bit operation result aiming at the bitmap data; obtaining the subscript of the target in the data container according to the sequence number; wherein the bit operation is determined by the logical state of the indexing condition in the indexing request.
In this step, if the index request carries only one index condition and that condition has no logical state, the storage location flag of the bitmap data for the sub-dimension corresponding to the index condition is obtained from the hash table in the memory cache, and the bitmap data is then obtained using the storage location flag. In the bitmap data, the bits that store 1 correspond to the targets that meet the index condition. The sequence numbers of these bits are used to obtain the corresponding data container subscripts, and the data pointed to by those subscripts is the data requested by the index request.
For example, suppose the index condition carried in the index request is that the place of origin is "Beijing". The storage location flag of the bitmap data for the "Beijing" sub-dimension is obtained from the index model of the "place of origin" dimension in the memory cache, which identifies the bitmap data corresponding to the "Beijing" sub-dimension. If the bitmap data is 01011010, the sequence numbers of the targets meeting the index condition are 1, 3, 4 and 6; if the sequence number of each target is the same as its data container subscript, the data pointed to by subscripts 1, 3, 4 and 6 of the data container is the data requested by the index request.
If the index request carries a plurality of index conditions, the bitmap data corresponding to each index condition is determined first, and bit operations are then performed on the bitmap data. Typically, the bit operation is determined by the logical state of the index condition. Specifically, if the logical state of the index condition is "or", the bit operation is an OR operation; if the logical state is "and", the bit operation is an AND operation; if the logical state is "not", the bit operation is either a NOT operation (possibly combined with an AND operation against the full-target bitmap data, which records whether each bit corresponds to a target) or an exclusive OR operation between the bitmap data corresponding to the index condition and the full-target bitmap data.
The bit operation yields bitmap data as its result. In this result, the bits that store 1 correspond to the targets that meet the index conditions; the sequence numbers of these bits are used to obtain the corresponding data container subscripts, and the data pointed to by those subscripts is the data requested by the index request.
For example, suppose the index request asks for data whose place of origin is "Beijing" and whose type is "clothes". The hash tables are first used to obtain the bitmap data for the "Beijing" sub-dimension in the index model of the "place of origin" dimension and the bitmap data for the "clothes" sub-dimension in the index model of the "type" dimension. The two bitmap data are then combined with an AND operation to obtain the final bitmap data, from which the required sequence numbers and data container subscripts are determined.
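The mapping from logical states to bit operations can be illustrated with `java.util.BitSet`. The sketch below is hedged: the class `BitmapOpsSketch` and its method names are hypothetical, the bitmap 11100000 is assumed for the "clothes" sub-dimension purely for illustration (01011010 for "Beijing" is taken from the earlier example), and the "not" state is realized as a NOT combined with an AND against an assumed full-target bitmap.

```java
import java.util.BitSet;

// Hypothetical sketch: logical state of the index conditions -> bit operation on bitmaps.
final class BitmapOpsSketch {

    // Character i of the string is the bit with sequence number i.
    static BitSet parse(String bits) {
        BitSet bitmap = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                bitmap.set(i);
            }
        }
        return bitmap;
    }

    // Logical state "and": a target must satisfy both conditions.
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone(); // clone so the cached bitmaps stay unchanged
        result.and(b);
        return result;
    }

    // Logical state "or": a target may satisfy either condition.
    static BitSet or(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    // Logical state "not": (NOT a) AND allTargets, so only bits of real targets can be set.
    static BitSet not(BitSet a, BitSet allTargets) {
        BitSet result = (BitSet) allTargets.clone();
        result.andNot(a);
        return result;
    }
}
```

With Beijing = 01011010 and the assumed clothes = 11100000, the AND result has only sequence number 1 set, the OR result has 0, 1, 2, 3, 4 and 6 set, and NOT Beijing over eight targets has 0, 2, 5 and 7 set. When the condition bitmap only ever has bits set for existing targets, an XOR with the full-target bitmap gives the same result as the `not` method shown here.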
Step S303: in response to the index request, the original data pointed to by the index in the data container is returned.
In this step, the subscripts acquired in step S302 are used to locate the required data in the data container, and that data is returned to the terminal, which completes the data indexing flow.
Preferably, in the embodiment of the present invention, some or all of the non-index dimension data may not be stored in the memory cache. In that case the non-index dimension data is stored in a disk or a third-party storage system, where the data of each target is usually associated with the data container subscript of that target, so that the subscript can serve as the index field of the data. If non-index dimension data is requested, the data container subscripts acquired in step S302 are used to locate the required non-index dimension data in the disk or third-party storage system and return it to the terminal. The returned data may then include index dimension data returned from the memory cache and non-index dimension data returned from the disk or third-party storage system.
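The use of the data container subscript as the index field of externally stored data might look like the following sketch. Everything here is an assumption made for illustration: the class `TieredFetchSketch` is hypothetical, and a plain in-memory `Map` keyed by subscript stands in for the disk or third-party storage system rather than calling any real storage client.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: index dimension data in the cache, the rest fetched by subscript.
final class TieredFetchSketch {

    // matches:       result bitmap from step S302 (sequence number == subscript)
    // indexData:     index dimension data held in the memory cache, one entry per target
    // externalStore: stand-in for a disk or third-party store, keyed by data container subscript
    static List<String> fetch(BitSet matches, List<String> indexData,
                              Map<Integer, String> externalStore) {
        List<String> rows = new ArrayList<>();
        for (int s = matches.nextSetBit(0); s >= 0; s = matches.nextSetBit(s + 1)) {
            String nonIndexData = externalStore.get(s); // the subscript is the lookup key
            rows.add(indexData.get(s) + " | " + nonIndexData);
        }
        return rows;
    }
}
```

Only the targets selected by the bitmap trigger an external read, so the slower storage layer is touched once per matching target rather than once per stored target.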
Fig. 4 is a schematic flowchart of an implementation of the data indexing method according to an embodiment of the present invention, and specific steps thereof are described above and are not repeated herein. In addition, the memory caching system and the data indexing method described above can be applied to various electronic devices and applicable scenes, and are not limited to the fields of servers and server memory indexes.
In the technical scheme of the embodiment of the invention, a data container such as a dynamic array is used to store the multi-dimensional data of the targets, and bitmap data representing the value of each target in each sub-dimension is generated for the sub-dimensions of each index dimension. When data is indexed, bit operations are performed on the bitmap data corresponding to the index conditions to determine the required targets, and the required data is then read from the data container. With this arrangement, the invention achieves, simply and at low cost, efficient data indexing in the memory cache that supports combinations of multiple conditions and complex index formats. Index statements can be written in a manner similar to the SQL (Structured Query Language) specification, so the required data is obtained quickly from the memory cache and the latency of disk or network input/output is avoided. Meanwhile, because the number of index dimensions and of their sub-dimensions is large, the amount of bitmap data is large and individual bitmaps are hard to locate. To solve this problem, the invention constructs a hash table that stores the hash value of each sub-dimension together with the storage location flag of the bitmap data corresponding to that sub-dimension, so that a bitmap can be located quickly through its storage location flag when queried. In addition, when the volume of service data is small, all of it can be stored in the memory cache to improve indexing efficiency; when the volume is large, the index dimension data can be stored in the memory cache and the remaining data in a disk or a third-party storage system, which makes efficient use of memory.
It should be noted that, for the convenience of description, the foregoing method embodiments are expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the present invention is not limited by the described order of actions, and some steps may actually be performed in other order or simultaneously. Moreover, those skilled in the art will appreciate that the embodiments described in the specification are presently preferred embodiments, and that the acts and modules referred to are not necessarily required to practice the invention.
FIG. 5 illustrates an exemplary system architecture 500 in which the data indexing method of embodiments of the present invention may be applied.
As shown in fig. 5, a system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505 (this architecture is merely an example, and the components contained in a particular architecture may be tailored to the application specific case). The network 504 is used as a medium to provide communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
A user may use the terminal devices 501, 502, 503 to interact with the server 505 via the network 504, for example to receive or send messages. Various client applications, such as a search application (by way of example only), may be installed on the terminal devices 501, 502, 503.
The terminal devices 501, 502, 503 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 505 may be a server providing various services, such as a metadata server (by way of example only) that supports search applications operated by users on the terminal devices 501, 502, 503. The metadata server may process a received index request and feed back the processing result (e.g., the data responsive to the index request, by way of example only) to the terminal devices 501, 502, 503.
It should be noted that the data indexing method provided by the embodiment of the present invention is generally performed by the server 505.
It should be understood that the number of terminal devices, networks and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The invention also provides an electronic device. The electronic device of the embodiment of the invention comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data indexing method provided by the invention.
Referring now to FIG. 6, there is illustrated a schematic diagram of a computer system 600 suitable for use in implementing an electronic device of an embodiment of the present invention. The electronic device shown in fig. 6 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the invention.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the computer system 600 are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, mouse, etc.; an output portion 607 including a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, according to the disclosed embodiments of the invention, the processes described in the flowcharts above may be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. The above-described functions defined in the system of the present invention are performed when the computer program is executed by the central processing unit 601.
The computer readable medium shown in the present invention may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, a computer readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the device described in the above embodiments or may exist alone without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to perform steps comprising: receiving an index request and acquiring at least one index condition carried in the index request; determining, in a memory cache, at least one bitmap data corresponding to the index condition, and determining the sequence number of at least one target that satisfies the index condition by using the bitmap data or the result of a bit operation on the bitmap data, wherein the bit operation is determined by the logic state of the index conditions in the index request; obtaining the subscript of the target in the data container according to the sequence number; and returning the raw data pointed to by the subscript in the data container in response to the index request.
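The steps listed above can be sketched as follows. This is an illustrative Python outline under stated assumptions, not the implementation of the invention: the request format (a list of `(operator, sub-dimension)` pairs), the helper names `evaluate`, `sequence_numbers`, and `handle_request`, and the small hard-coded bitmaps are all hypothetical.

```python
# Hypothetical container and prebuilt bitmaps; bit i corresponds to subscript i.
container = ["row0", "row1", "row2", "row3"]
bitmaps = {
    ("color", "red"): 0b0101,   # targets 0 and 2
    ("color", "blue"): 0b1010,  # targets 1 and 3
    ("size", "M"): 0b0110,      # targets 1 and 2
}
ALL = (1 << len(container)) - 1  # mask with one bit per stored target


def evaluate(conditions):
    """Combine bitmaps left to right according to each condition's logic state.

    `conditions` is a list of (op, key) pairs with op in {"and", "or", "not"};
    "not" is treated as AND with the complement of the bitmap.
    """
    result = None
    for op, key in conditions:
        if op not in ("and", "or", "not"):
            raise ValueError(f"unsupported operator: {op}")
        bm = bitmaps.get(key, 0)
        if op == "not":
            bm, op = ~bm & ALL, "and"
        if result is None:
            result = bm          # the first condition seeds the result
        elif op == "and":
            result &= bm
        else:
            result |= bm
    # Assumption of this sketch: with no condition, every target matches.
    return ALL if result is None else result


def sequence_numbers(bitmap):
    """Return the positions of the set bits, lowest first."""
    out, i = [], 0
    while bitmap:
        if bitmap & 1:
            out.append(i)
        bitmap >>= 1
        i += 1
    return out


def handle_request(conditions):
    """Return the raw data of every target that satisfies the conditions."""
    # Sequence numbers equal subscripts here (ascending-subscript ordering).
    return [container[i] for i in sequence_numbers(evaluate(conditions))]


print(handle_request([("and", ("color", "red")), ("and", ("size", "M"))]))  # ['row2']
print(handle_request([("and", ("size", "M")), ("not", ("color", "red"))]))  # ['row1']
```

Because the bitmaps are ordered by ascending subscript in this sketch, the sequence number of a set bit can be used directly as the subscript into the container, with no separate mapping step.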
The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives can occur depending upon design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims (10)

1. A memory cache data system, comprising: raw data of a plurality of targets stored in a data container preset in a memory cache; wherein,
the raw data includes data of at least one index dimension; the raw data of each of the plurality of targets is an element of the data container, and each target has a unique subscript in the data container;
the system further comprises: bitmap data, corresponding to each sub-dimension of the index dimension, for indexing the raw data; each bitmap data comprises the values of the plurality of targets in the sub-dimension corresponding to that bitmap data, arranged in a preset order based on the subscripts.
2. The system of claim 1, wherein the system further comprises:
a hash table for storing the hash value of each sub-dimension of the index dimension and the storage position mark of the bitmap data corresponding to that sub-dimension.
3. The system of claim 1, wherein the raw data further comprises: data of at least one non-index dimension.
4. The system of claim 1, wherein the order is the ascending order of the subscripts, and wherein the sequence number of any one of the plurality of targets in the bitmap data is the same as the subscript of that target in the data container.
5. The system of any of claims 1-4, wherein the data container is a dynamic array.
6. A method for indexing data based on the memory cache data system according to any one of claims 1 to 5, comprising:
receiving an index request, and acquiring at least one index condition carried in the index request;
determining, in a memory cache, at least one bitmap data corresponding to the index condition, and determining a sequence number of at least one target satisfying the index condition by using the bitmap data or a result of a bit operation on the bitmap data; obtaining the subscript of the target in the data container according to the sequence number; wherein the bit operation is determined by a logic state of the index conditions in the index request; and
returning the raw data pointed to by the subscript in the data container in response to the index request.
7. The method according to claim 6, wherein the method further comprises:
when the data requested by the index request is not stored in the memory cache, returning the requested data from a disk or a third-party storage system according to the obtained subscript of the target in the data container.
8. The method of claim 6 or 7, wherein the bit operation comprises at least one of: an OR operation, an AND operation, a NOT operation, and an exclusive-OR (XOR) operation.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 6-8.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 6-8.
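
By way of illustration only, the fallback recited in claim 7 might be sketched in Python as below. The in-memory rows, the `external_store` dictionary standing in for a disk or third-party storage system, and the function names `fetch` and `query` are hypothetical and are not taken from the specification.

```python
# In-memory cache: only index-dimension data, one element per target.
memory_rows = [
    {"color": "red"},
    {"color": "blue"},
    {"color": "red"},
]
# Stand-in for a disk or third-party store, keyed by the same subscripts.
external_store = {
    0: {"description": "first item"},
    1: {"description": "second item"},
    2: {"description": "third item"},
}
red_bitmap = 0b101  # targets 0 and 2 have color == "red"


def fetch(subscript, fields):
    """Return the requested fields, reading external storage only if needed."""
    row = dict(memory_rows[subscript])
    if any(f not in row for f in fields):
        # Requested data is not cached: look it up elsewhere by subscript.
        row.update(external_store[subscript])
    return {f: row[f] for f in fields}


def query(bitmap, fields):
    """Resolve the set bits to subscripts, then fetch each matching target."""
    subscripts = [i for i in range(len(memory_rows)) if (bitmap >> i) & 1]
    return [fetch(i, fields) for i in subscripts]


cached_only = query(red_bitmap, ["color"])
with_fallback = query(red_bitmap, ["color", "description"])
```

Here the bitmap lookup always runs in memory, and the external store is consulted only for the targets that survive it and only when a requested field is not cached.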
CN201910397340.5A 2019-05-14 2019-05-14 Memory data caching system and data indexing method Active CN111949648B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910397340.5A CN111949648B (en) 2019-05-14 2019-05-14 Memory data caching system and data indexing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910397340.5A CN111949648B (en) 2019-05-14 2019-05-14 Memory data caching system and data indexing method

Publications (2)

Publication Number Publication Date
CN111949648A CN111949648A (en) 2020-11-17
CN111949648B CN111949648B (en) 2024-03-01

Family

ID=73335385

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910397340.5A Active CN111949648B (en) 2019-05-14 2019-05-14 Memory data caching system and data indexing method

Country Status (1)

Country Link
CN (1) CN111949648B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant