CN105786918B - Data query method and device based on data loading storage space - Google Patents

Publication number
CN105786918B
Authority
CN
China
Prior art keywords
data
storage
task
data acquisition
loading
Prior art date
Legal status
Active
Application number
CN201410828712.2A
Other languages
Chinese (zh)
Other versions
CN105786918A (en)
Inventor
李艳平
王春生
Current Assignee
Bright Oceans Inter Telecom Co Ltd
Original Assignee
Bright Oceans Inter Telecom Co Ltd
Priority date
Filing date
Publication date
Application filed by Bright Oceans Inter Telecom Co Ltd
Priority to CN201410828712.2A
Publication of CN105786918A
Application granted
Publication of CN105786918B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data query method based on data loading storage space. The method comprises: dividing the storage space into storage blocks, configuring a data acquisition task, and configuring the correspondence between the data acquisition task and the storage blocks; creating a data storage array in the storage block corresponding to the data acquisition task according to the configured task; executing the configured data acquisition task to acquire data, loading the acquired data into the corresponding storage array in the corresponding storage block according to the pattern configured in the task, and updating the data loading state; and, depending on the data loading state, either executing the data query task directly or querying the data storage array in the storage block to obtain the data. The invention enables fast and efficient data queries. The invention also discloses a data query device based on data loading storage space.

Description

Data query method and device based on data loading storage space
Technical Field
The invention relates to the field of computers, in particular to a data query technology.
Background
With the continuous development of the IT field, more and more domains use large-scale software systems to manage resource data of all kinds. When a conventional management system processes certain data, it must acquire a large amount of resource information, such as geographical location information in a specific application scenario, as well as other information such as standardized information, engineering reservation information, and grouping information.
The usual way to acquire this information is to query a database through SQL. A single piece of data may require from a dozen to several dozen queries, which span multiple database tables and therefore a huge volume of data, so queries against the table structure seriously degrade query efficiency. The database is a vital component of any software system. As the volume of data being processed grows, the database is queried more frequently, problems such as excessive CPU utilization appear, the database comes under heavy pressure, and its overall processing capacity drops. Once the database has a problem, the operation of the whole system is affected.
The prior art also places cached data in the Java heap. This approach has its own problem: once the JVM (Java Virtual Machine) heap grows beyond a certain size, GC (garbage collection) pauses usually become too long, which seriously degrades the performance of the Java application and, in turn, of the whole system.
Therefore, an efficient data query technique that improves system performance is urgently needed.
Disclosure of Invention
The invention provides a data query method based on data loading storage space, which comprises the following steps:
dividing a storage space into storage blocks, configuring a data acquisition task and configuring the corresponding relation between the data acquisition task and the storage blocks;
creating a data storage array in a storage block corresponding to the data acquisition task according to the configured data acquisition task;
executing the configured data acquisition task to acquire data, loading the acquired data into the corresponding storage array in the corresponding storage block according to the pattern configured in the data acquisition task, and updating the data loading state;
and, depending on the data loading state, either executing the data query task directly or querying the data storage array in the storage block to obtain the data.
Preferably, the method further comprises:
persistently storing the data in each storage block, and loading the persisted data into the storage in a single pass when the storage is restarted.
Preferably:
dividing the storage blocks according to the service attributes, and identifying the service types of the data stored in the storage blocks;
and dividing the storage blocks into three levels according to application requirements, specifically an in-heap cache, an out-of-heap cache, and disk storage, and loading data into storage blocks of different levels according to how frequently the acquired data is used (its heat).
In detail, configuring the data acquisition task specifically includes:
configuring the name of the data acquisition task, storage array information, execution command information, the execution period, the execution times, the execution mode and the data volume of the execution operation of the data acquisition task;
and configuring the data acquisition task group according to the relevance of the data acquisition tasks, wherein the data acquisition tasks in the same group correspond to the same storage block.
In more detail:
creating a data storage array in the corresponding storage block according to the configured storage array information and the corresponding relation between the data acquisition task and the storage block;
executing the data acquisition task according to the configured execution command information, the execution period, the execution times, the execution mode and the data volume of the execution operation to acquire target data;
and loading the acquired target data into the corresponding storage array created in the storage block corresponding to the data acquisition task, according to the pattern configured in the data acquisition task.
Further, directly executing the data query task according to the data loading state, or querying the data storage array in the storage block to obtain the data, specifically comprises:
when the data loading state is incomplete, executing the query task directly on the target database, and loading the data obtained by the query into the storage block;
and when the data loading state is complete, querying the data storage array in the storage block to obtain the data.
The invention also discloses a data query device based on the data loading storage space, which comprises:
the storage space comprises a plurality of storage blocks and is used for storing data;
the storage space management unit is used for managing the storage space and dividing the storage space into storage blocks;
the task configuration unit is used for configuring a data acquisition task and configuring the corresponding relation between the data acquisition task and the storage block divided by the storage space management unit;
the task execution unit is used for creating a data storage array in a storage block corresponding to the data acquisition task according to the data acquisition task configured by the task configuration unit; executing the data acquisition task to acquire data, and loading the acquired data into the corresponding storage array in the corresponding storage block according to the pattern configured in the data acquisition task; updating the data loading state;
and the data query unit is used for querying the data loading state and, depending on that state, either executing the data query task directly or querying the data storage array in the storage block to obtain the data.
Preferably, the apparatus further comprises:
the data persistence unit is used for persistently storing the data in each storage block divided by the storage space management unit, and for loading the persisted data into the storage in a single pass when the storage is restarted.
In detail, the storage space management unit further includes:
the storage block dividing module is used for dividing the storage blocks according to the service attributes and identifying the service types of the storage data of the storage blocks;
the storage level dividing module is used for dividing the storage blocks divided by the storage block dividing module into three layers according to the application requirements, specifically, an in-heap cache, an out-of-heap cache and a disk storage;
and the storage space management module is used for managing the storage blocks divided by the storage block dividing module according to the relationship between the data acquisition task configured by the task configuration unit and the storage blocks.
In detail, the method for the task configuration unit to configure the data acquisition task specifically includes:
configuring the name of the data acquisition task, storage array information, execution command information, the execution period and times of the data acquisition task and the data volume of the execution operation;
and configuring the data acquisition task group according to the relevance of the data acquisition tasks, wherein the data acquisition tasks in the same group correspond to the same storage block.
In detail, the task execution unit further includes:
the storage array creating module is used for creating a data storage array in the corresponding storage block according to the storage array information configured by the task configuration unit and the corresponding relation between the data acquisition task and the storage block in the storage space;
the data acquisition module is used for executing the data acquisition task according to the execution command information configured by the task configuration unit, the execution period, the execution times, the execution mode and the data volume of the execution operation to acquire target data;
the data loading module is used for loading the data acquired by the data acquisition module into a corresponding storage array in the corresponding storage block created by the storage array creation module according to the configuration style configured by the task configuration unit;
and the data loading module further loads data into the storage blocks of the different levels divided by the storage level dividing module according to the heat of the acquired data.
In detail, the data query unit further includes:
the data loading state query module is used for querying the data loading state;
the query task execution module is used for executing the data query task directly on the target database of the query task when the data loading state query module reports that data loading is incomplete;
and the data query module is used for querying the data storage array in the storage block to obtain the data when the data loading state query module reports that data loading is complete.
According to the invention, the storage space is divided into storage blocks, and storage arrays are created in the storage blocks to hold the corresponding query target data, so a data user obtains the target data simply by querying the arrays, which improves query efficiency. The correspondence between data acquisition tasks and storage blocks is configured in advance, so the storage configuration can be scaled out simply and the storage blocks hold data more flexibly. The data in each storage block is persisted, so it can be loaded into the storage in a single pass on restart, which avoids re-acquiring and reloading the data after a restart. The storage is further divided into three levels, and data is loaded into storage blocks of different levels according to its heat; frequently used data is placed in the higher-level in-heap cache, which further improves data acquisition efficiency during queries. To keep queries running smoothly, when data has not finished loading, the data query task can be executed directly on the database, and the data obtained is loaded into the storage block for later queries. In summary, the invention provides an efficient data query method based on data loading storage.
Drawings
FIG. 1 is a schematic flow chart of a data query method based on data loading storage space according to a first embodiment of the present invention;
FIG. 1-1 is a schematic diagram of the storage blocks corresponding to data acquisition task groups according to the first embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method according to a second embodiment of the present invention;
FIG. 2-1 is a schematic diagram of the hierarchical management of a storage space according to the second embodiment of the present invention;
FIG. 3 is a schematic flow chart of a method according to a third embodiment of the present invention;
FIG. 3-1 is a schematic diagram of the conversion of table data to cache data according to the third embodiment of the present invention;
FIG. 4 is a flowchart illustrating a method according to a fourth embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a data query apparatus based on data loading storage space according to a fifth embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an apparatus according to a sixth embodiment of the present invention.
Detailed Description
The embodiments of the present invention are described in detail below with reference to the drawings and examples, so that the way the invention applies technical means to solve the technical problems and achieve the technical effects can be fully understood and put into practice.
Referring to FIG. 1, a data query method based on data loading storage space according to a first embodiment of the present invention is described. The method includes:
Step S101: dividing the storage space into storage blocks, configuring a data acquisition task, and configuring the correspondence between the data acquisition task and the storage blocks.
The storage space referred to here is a temporary exchange area (a cache). The computer temporarily pulls the most frequently used files out of storage and keeps them in the cache, much as tools and materials are moved onto a workbench because that is more convenient than fetching them from the warehouse each time they are needed. Because the temporary exchange area usually uses non-permanent storage that is lost on power-off, data is stored in Key-Value form.
The storage blocks may be divided according to service type. One storage block stores data of one or several service types.
The data acquisition tasks can also be grouped according to the relevance of the data: data from the same table can be placed in the same data acquisition task group, and data from associated tables can also be placed in the same group. The data acquisition tasks of the same data acquisition task group correspond to the same storage block, as shown in FIG. 1-1.
In the figure, DB is the database storage, Table1 ... n denote database tables, task is a data acquisition task, group is a data acquisition task group, and cache1 ... n denote storage blocks.
Step S102: creating a data storage array in the storage block corresponding to the data acquisition task according to the configured data acquisition task.
The configuration covers the name of the data acquisition task, the storage array information, the execution command information, and the task's execution period, execution times, execution mode, and the data volume of each execution operation.
Each storage block of the storage space (the temporary exchange area) stores data in Key-Value form, so the data is held in the storage block simply as arrays; the keys of the arrays and the commands for acquiring the data can be configured when the data acquisition task is configured.
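As a non-limiting illustration, steps S101 and S102 might be sketched in Java as follows. The class and method names (StorageBlock, StorageSpace, bindGroup, blockForGroup) are hypothetical and do not appear in the original disclosure; the sketch only assumes that each storage block holds Object[] arrays under string keys and that each task group is bound to one block.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; this is a sketch, not the disclosed implementation.
final class StorageBlock {
    final String name; // e.g. "resourceCache"
    private final Map<String, Object[]> arrays = new HashMap<>();

    StorageBlock(String name) {
        this.name = name;
    }

    // Store one storage array under its key (Key-Value form).
    void put(String key, Object[] row) {
        arrays.put(key, row);
    }

    // Returns the stored array, or null when nothing was loaded under the key.
    Object[] get(String key) {
        return arrays.get(key);
    }
}

final class StorageSpace {
    private final Map<String, StorageBlock> blocks = new HashMap<>();
    private final Map<String, String> groupToBlock = new HashMap<>();

    // Divide the space: create the named block if it does not exist yet.
    StorageBlock createBlock(String blockName) {
        return blocks.computeIfAbsent(blockName, StorageBlock::new);
    }

    // Configure the correspondence: all tasks of one group share one block.
    void bindGroup(String taskGroup, String blockName) {
        createBlock(blockName);
        groupToBlock.put(taskGroup, blockName);
    }

    // Returns the block bound to the group, or null for an unknown group.
    StorageBlock blockForGroup(String taskGroup) {
        String blockName = groupToBlock.get(taskGroup);
        return blockName == null ? null : blocks.get(blockName);
    }
}
```

In this sketch a caller would bind a group once, for example bindGroup("gcity", "resourceCache"), and every task of that group would then write into the block returned by blockForGroup("gcity").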
Step S103: executing the configured data acquisition task to acquire data, loading the acquired data into the corresponding storage array in the corresponding storage block according to the pattern configured in the data acquisition task, and updating the data loading state.
Step S104: depending on the data loading state, either executing the data query task directly or querying the data storage array in the storage block to obtain the data.
When a data query task is received while the requested data is still being loaded into the storage block, meaning the data acquisition task has not completed, the data cannot yet be found in the storage block. In that case the data query task can be issued directly to the target database table and its query command executed to obtain the target data. Once the query completes, the query result and the data query task can together be used to generate a data acquisition task, and the result is loaded into the specified storage block for the next query. If the data acquisition task has finished executing, the data can be obtained directly from the storage array in the storage block.
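The branch described for step S104 can be sketched as below. This is a minimal illustration under stated assumptions: the names LoadState and LoadAwareQuery are hypothetical, the database is stood in for by a Function from key to row rather than a real SQL connection, and a single loading state covers the whole block.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical names; the database is modelled as a plain function for illustration.
enum LoadState { INCOMPLETE, COMPLETE }

final class LoadAwareQuery {
    private final Map<String, Object[]> block = new ConcurrentHashMap<>();
    private final Function<String, Object[]> database; // stand-in for an SQL query
    private volatile LoadState state = LoadState.INCOMPLETE;

    LoadAwareQuery(Function<String, Object[]> database) {
        this.database = database;
    }

    // Called by the data acquisition task while it loads rows.
    void load(String key, Object[] row) {
        block.put(key, row);
    }

    // Called by the data acquisition task once loading has finished.
    void markLoaded() {
        state = LoadState.COMPLETE;
    }

    Object[] query(String key) {
        if (state == LoadState.COMPLETE) {
            // Loading finished: answer from the storage array only.
            return block.get(key);
        }
        // Loading not finished: query the database directly and keep the
        // result in the storage block for the next query.
        Object[] row = database.apply(key);
        if (row != null) {
            block.put(key, row);
        }
        return row;
    }
}
```

The design choice here is that a query never waits for loading to finish; it pays the cost of one database round trip instead, and the result it fetched is reused afterwards.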
There are generally two ways to recover data after a restart. The first is to reload the storage space directly from the database; the time required depends on database performance, and when the data volume is large, recovery can take tens of minutes, which is slow.
Preferably, the present invention instead restores the storage space by means of persistence, and therefore further includes step S105.
Step S105: persistently storing the data in each storage block, and loading the persisted data into the storage in a single pass when the storage is restarted.
Data persistence means saving data (for example, objects in memory) to a storage device where it can be kept permanently (for example, a disk). The main application of persistence is storing in-memory objects in a relational database; they can of course also be stored in disk files, XML data files, and the like.
Because the storage space is temporary, non-permanent storage, the data can be saved by means of persistence and loaded into the storage in a single pass when power is restored.
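One possible reading of step S105 is sketched below using standard Java object serialization to a disk file. The class name BlockSnapshot and the choice of serialization are assumptions made for illustration; the disclosure does not say which persistence mechanism is used, and the sketch assumes every value stored in the arrays is Serializable.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;

// Hypothetical helper; Java serialization is an assumed persistence mechanism.
final class BlockSnapshot {
    // Persist the whole storage block as one snapshot file.
    static void save(HashMap<String, Object[]> block, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(block);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Load the whole block back in a single pass, e.g. after a restart.
    @SuppressWarnings("unchecked")
    static HashMap<String, Object[]> restore(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<String, Object[]>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("snapshot refers to an unknown class", e);
        }
    }
}
```

Restoring from a local snapshot avoids re-running every acquisition query against the database, which is the slow recovery path described above.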
When an enterprise grows, the traditional remedy was to add management levels, whereas the remedy that works today is to widen the span of management. As management levels decrease and the span of management widens, the pyramid-shaped organization is "compressed" into a flat one. Likewise, in this method the storage space is divided into individual storage blocks, each of which stores its corresponding target data; that is, data originally stored in tables is moved into the storage blocks, so fast data queries can rely on the fast processing speed of the storage blocks. Loading data into the storage blocks and expanding the storage blocks are both flat operations, which allows more effective storage management and thus efficient, high-speed data queries.
To better illustrate the present invention, a second embodiment is given below, as shown in FIG. 2:
Step S201: dividing the storage space into storage blocks according to service attributes.
The storage blocks are divided according to service attributes, and the service type of the data stored in each storage block is identified.
Service attributes include, for example, transmission service and data service; the size of each storage block is set according to the actual application.
The storage blocks may also be divided into three levels according to application requirements, as shown in FIG. 2-1, specifically an in-heap cache, an out-of-heap cache, and disk storage, and data is loaded into storage blocks of different levels according to the heat of the acquired data.
Data is managed with a simple layered storage scheme, and data is moved between layers automatically according to the needs of the running application.
The first layer is the in-heap cache (Heap), which may be the JVM heap memory and can achieve nanosecond-level data access.
The second layer is the out-of-heap cache (Off-Heap), which may be a memory region embedded in the process. It also resides in local hardware RAM, provides microsecond-level data access for up to hundreds of GB, and avoids the response delays caused by Java GC.
The third layer is the disk area, which provides microsecond-level data access for hundreds of GB up to hundreds of TB.
The cache can schedule data to different layers according to the actual runtime conditions of the application and the heat of the data (that is, how frequently the data is used: the higher the frequency, the higher the heat, and the shorter the time in which access should complete). The most frequently used data can be placed in the first layer, where it is fastest to obtain, and so on for the lower layers, giving the application the best throughput.
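The heat-based scheduling described above might be approximated as follows. This is only a sketch: the names Tier and HeatTiering are hypothetical, heat is modelled as a simple access counter, and the two thresholds are invented parameters, since the disclosure does not state how heat is measured or where the boundaries between layers lie.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names; heat is approximated by an access counter.
enum Tier { HEAP, OFF_HEAP, DISK }

final class HeatTiering {
    private final Map<String, Integer> hits = new HashMap<>();
    private final int heapThreshold;    // assumed: minimum hits for the in-heap layer
    private final int offHeapThreshold; // assumed: minimum hits for the out-of-heap layer

    HeatTiering(int heapThreshold, int offHeapThreshold) {
        if (heapThreshold < offHeapThreshold) {
            throw new IllegalArgumentException("heap threshold must not be below off-heap threshold");
        }
        this.heapThreshold = heapThreshold;
        this.offHeapThreshold = offHeapThreshold;
    }

    // Record one use of the key; more uses mean higher heat.
    void recordAccess(String key) {
        hits.merge(key, 1, Integer::sum);
    }

    // Hottest data goes to the heap, warm data off-heap, everything else to disk.
    Tier tierFor(String key) {
        int heat = hits.getOrDefault(key, 0);
        if (heat >= heapThreshold) {
            return Tier.HEAP;
        }
        if (heat >= offHeapThreshold) {
            return Tier.OFF_HEAP;
        }
        return Tier.DISK;
    }
}
```

A real cache would also demote data as its heat decays and would respect the configured capacity of each layer; both are omitted here to keep the sketch short.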
The storage space size of each layer can be set according to actual conditions, and how data is stored can likewise be configured. For example, the storage space sizes may be set as:
maxElementsInMemory="10000"
which configures the in-heap cache to hold 10,000 data elements;
maxMemoryOffHeap="4G"
which configures the out-of-heap cache size as 4G.
Step S202: configuring the name of the data acquisition task, the storage array information, the execution command information, and the task's execution period, execution times, execution mode, and the data volume of each execution operation.
The data acquisition task groups are configured according to the relevance of the data acquisition tasks, and the data acquisition tasks in the same group correspond to the same storage block.
The storage array information mainly specifies the keys of the array, and the execution command information is the SQL statement that acquires data from the target table. Once the task's period, execution times, execution mode, and data volume per execution are configured, the task can be executed according to those parameters; further parameters can be added as conditions require, so that the data acquisition task meets the user's needs.
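For illustration, the configuration in step S202 could be held in a plain value class, with a registry enforcing that tasks of the same group correspond to the same storage block. The names AcquisitionTask and TaskRegistry and the particular fields chosen are hypothetical; only a subset of the configurable parameters is shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value class covering a subset of the configurable parameters.
final class AcquisitionTask {
    final String name;     // name of the data acquisition task
    final String group;    // task group, e.g. "gcity"
    final String block;    // storage block (cache) name, e.g. "resourceCache"
    final String arrayKey; // storage array information: the key pattern
    final String command;  // execution command information: the SQL statement

    AcquisitionTask(String name, String group, String block, String arrayKey, String command) {
        this.name = name;
        this.group = group;
        this.block = block;
        this.arrayKey = arrayKey;
        this.command = command;
    }
}

final class TaskRegistry {
    private final Map<String, String> blockOfGroup = new HashMap<>();
    private final List<AcquisitionTask> tasks = new ArrayList<>();

    // Rejects a task whose group is already bound to a different storage block.
    void register(AcquisitionTask task) {
        String bound = blockOfGroup.putIfAbsent(task.group, task.block);
        if (bound != null && !bound.equals(task.block)) {
            throw new IllegalArgumentException(
                "group " + task.group + " is already bound to block " + bound);
        }
        tasks.add(task);
    }

    String blockFor(String group) {
        return blockOfGroup.get(group);
    }

    int size() {
        return tasks.size();
    }
}
```

Checking the group-to-block rule at registration time means a misconfigured task fails early, before any data is acquired.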
Step S203: creating a data storage array in the corresponding storage block according to the configured storage array information and the correspondence between the data acquisition task and the storage block.
In this step, a storage array is created in the storage block corresponding to the data acquisition task, and the key of the storage array is the key configured in the data acquisition task.
The array type is typically an object array.
Step S204: executing the data acquisition task according to the configured execution command information, execution period, execution times, execution mode, and data volume of the execution operation, to acquire the target data.
Step S205: loading the acquired target data into the corresponding storage array created in the storage block corresponding to the data acquisition task, according to the pattern configured in the data acquisition task, and updating the data loading state.
Step S206: receiving a data query task and obtaining the data loading state; if the data loading state is complete, proceeding to step S207, and if it is incomplete, proceeding to step S208.
Step S207: querying the data storage array in the storage block to obtain the data.
Step S208: executing the data query command of the data query task directly on the target database.
Step S209: loading the data obtained by executing the query command into the storage block.
Step S210: persistently storing the data in each storage block, and loading the persisted data into the storage in a single pass when the storage is restarted.
The invention can raise the data query rate from 100 queries/s to 400,000 queries/s, with low latency, high throughput, and linearly scalable performance, which greatly speeds up subsequent data processing. Storing cached data outside the heap (off-heap) resolves the long-standing performance bottleneck of the Java garbage collection mechanism, removes the memory size limitation, and prevents business processing from stalling during a Full GC of the Java process. Layering the data storage addresses high-speed access to hot data. The approach suits all mainstream JVMs and general-purpose, low-cost hardware.
To better illustrate the method of the present invention, a third embodiment is given below with an example; the process of configuring a data acquisition task, acquiring data, and loading it into the cache is shown in FIG. 3:
The storage space and the storage blocks are divided in advance according to actual conditions; the cache used in this embodiment is named "resourceCache".
Step S301: configuring a data acquisition task.
The data acquisition task is configured in the form of a configuration file, as follows:
[Figure: configuration file listing (images BDA0000645388450000091 and BDA0000645388450000101)]
Description of the configuration:
Group: the task group name, gcity.
Cache: the cache name, resourceCache.
Two keys are configured: one uses region_city_local:{0}, meaning that the content at index 0 of the created array forms the key, and the other uses region_city_local:{5}, meaning that the content at index 5 of the created array forms the key.
PageSize: how many rows make up one page when querying the database.
PageSum: how many pages of data are committed to the cache in one batch.
SQL: the source of the cached data; the data is retrieved by executing SQL against the table region_city_local.
Scheduler: the periodic scheduling time; "0305?" means loading begins at 5:30.
Warmup: 1 denotes asynchronous loading and 2 denotes synchronous loading.
Repeatinterval: the interval between retries after a load failure, 60 seconds.
Repeat count: the number of retries after a load failure, 2.
The loading process resolves the configuration into two cache loading tasks, and the content is stored in the cache as arrays. One task uses the key region_city_local:{0} and the other uses region_city_local:{5}, where {0} refers to index 0 of the array in the SQL result. At runtime the placeholders are replaced with the values of the city_id and city_name fields, giving keys such as region_city_local:1 and region_city_local:Beijing. The results are stored in the cache named resourceCache, and the data is refreshed at 5:30 every day; if loading fails, it is retried twice at intervals of 60 seconds.
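The retry behaviour governed by the retry interval and retry count might look like the sketch below. The name RetryingLoader is hypothetical, the load itself is modelled as a BooleanSupplier that reports success or failure, and the sketch assumes that the retry count means additional attempts after the first failed one.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper; assumes repeatCount counts retries after the first attempt.
final class RetryingLoader {
    // Returns true as soon as one attempt succeeds, false when all attempts fail.
    static boolean runWithRetry(BooleanSupplier load, int repeatCount, long repeatIntervalMillis) {
        for (int attempt = 0; attempt <= repeatCount; attempt++) {
            if (attempt > 0) {
                try {
                    // Wait for the configured interval before retrying.
                    Thread.sleep(repeatIntervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            if (load.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }
}
```

With the values from the example configuration, a call would pass a retry count of 2 and an interval of 60,000 milliseconds, giving at most three attempts in total.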
Step S302: creating data storage arrays in the corresponding cache according to the storage array information in the data acquisition task.
Here an Object array, i.e. Object[], is created without any need to configure the array, which preserves the original database field types. This is more convenient for developers who use the cache: they need not reference other objects and only have to find the corresponding index.
Here, two arrays are created in the resourceCache cache according to the configuration in the data acquisition task.
One has the key region_city_local:{0} and the other has the key region_city_local:{5}.
Step S303: executing the execution command information in the data acquisition task to acquire the data in the target table.
The execution command information in the data acquisition task is: select city_id, progress_name, progress_id, region_name, region_id, city_name from region_city_local
Executing the command retrieves information such as city_id 13, country_name 1, 1 sea area, Beijing, and the like from the region_city_local table.
Step S304: filling the acquired data into the data storage arrays created in the cache.
The acquired data is stored in the following arrays, respectively:
the array whose key is region_city_local:{13}
Table 1: the array whose key is region_city_local:{13}
[Table 1 image: BDA0000645388450000111]
and the array whose key is region_city_local:{Beijing}.
Table 2: the array whose key is region_city_local:{Beijing}
[Table 2 image: BDA0000645388450000112]
The conversion process from table data to cached data is shown in FIG. 3-1.
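The conversion from a table row to cached arrays can be illustrated with the sketch below, which expands key patterns such as region_city_local:{0} against one result row. The class name KeyTemplate is hypothetical, and the sample row values used in the usage note are invented for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that expands {n} placeholders using the row's column values.
final class KeyTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\d+)\\}");

    // Replaces each {n} with the value at index n of the row.
    static String expand(String template, Object[] row) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            int index = Integer.parseInt(matcher.group(1));
            String value = String.valueOf(row[index]);
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    // Stores the same row once per configured key pattern.
    static void loadRow(Map<String, Object[]> block, String[] templates, Object[] row) {
        for (String template : templates) {
            block.put(expand(template, row), row);
        }
    }
}
```

For a row whose index 0 holds 13 and whose index 5 holds "Beijing", the two configured patterns would expand to region_city_local:13 and region_city_local:Beijing, so the same Object[] can be found either by city id or by city name.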
If a request to query the data arrives while the cache is still loading it, the problem can be solved by an asynchronous loading mechanism.
When a data request is received, the data loading state is first checked to see whether the corresponding data loading task has completed. The check is per task: not all tasks need to finish loading, and whichever task has finished loading can be used. While loading is incomplete, the data loading state obtained is "incomplete"; the cache query is then converted into an SQL query, and the result set from the database is returned. Once cache loading is complete, requests are automatically served from the cache. This means business processing can proceed during startup without waiting for cache loading to finish. Depending on application requirements, whether data is loaded asynchronously can be controlled through the execution mechanism parameter in the configured data acquisition task.
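A minimal sketch of the asynchronous mechanism with a per-task loading state is given below. The class name AsyncWarmCache is hypothetical, the SQL query is stood in for by a Function, and CompletableFuture.runAsync is used as one possible way to load in the background; the disclosure does not specify the threading model.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical class; background loading via CompletableFuture is an assumption.
final class AsyncWarmCache {
    private final Map<String, Object[]> cache = new ConcurrentHashMap<>();
    private final Map<String, Boolean> taskDone = new ConcurrentHashMap<>();
    private final Function<String, Object[]> sqlQuery; // stand-in for the database

    AsyncWarmCache(Function<String, Object[]> sqlQuery) {
        this.sqlQuery = sqlQuery;
    }

    // Starts one loading task; its state becomes complete only after all rows are stored.
    CompletableFuture<Void> startLoad(String task, Map<String, Object[]> rows) {
        taskDone.put(task, Boolean.FALSE);
        return CompletableFuture.runAsync(() -> {
            cache.putAll(rows);
            taskDone.put(task, Boolean.TRUE);
        });
    }

    boolean isLoaded(String task) {
        return Boolean.TRUE.equals(taskDone.get(task));
    }

    // The state is checked per task, so one finished task is usable
    // even while other tasks are still loading.
    Object[] query(String task, String key) {
        if (isLoaded(task)) {
            return cache.get(key);
        }
        return sqlQuery.apply(key);
    }
}
```

Because the state is tracked per task, a slow task only affects the queries that depend on it; queries for already loaded tasks are served from the cache during startup.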
To explain the above process in more detail, a fourth embodiment of the present invention is given as shown in fig. 4.
Step S401: The data acquisition task acquires data, loads it into the cache corresponding to the data acquisition task, and updates the data loading state.
Step S402: Receive a data query task from a user and query the data loading state; if the data loading state is incomplete, go to step S403; if it is complete, go to step S405.
Step S403: Convert the data query task into an SQL request and execute the SQL on the target database.
Step S404: Return the result of executing the SQL on the target database to the user, and load the result into the cache.
Step S405: Obtain the data queried by the user from the storage array in the cache through its key.
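Steps S401 to S405 can be sketched roughly as below. This is a simplified, single-threaded illustration that uses Python's built-in `sqlite3` as a stand-in target database; the class `CachedQuery`, the task name `load_city`, the column `city_name`, and the state strings are assumptions made for the example and do not come from the patent.

```python
import sqlite3


class CachedQuery:
    def __init__(self, conn):
        self.conn = conn
        self.cache = {}       # key -> list of rows (the storage arrays)
        self.load_state = {}  # task name -> "incomplete" or "complete"

    def run_task(self, task, key, sql, params=()):
        # S401: acquire data, load it into the cache, update the loading state.
        self.load_state[task] = "incomplete"
        self.cache[key] = self.conn.execute(sql, params).fetchall()
        self.load_state[task] = "complete"

    def query(self, task, key, sql, params=()):
        # S402: check the loading state of this one task only.
        if self.load_state.get(task) == "complete":
            # S405: serve the request from the cached array by key.
            return self.cache.get(key, []), "cache"
        # S403: fall back to executing the SQL on the target database.
        rows = self.conn.execute(sql, params).fetchall()
        # S404: return the result and also place it in the cache.
        self.cache[key] = rows
        return rows, "database"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE region_city_local (city_id INTEGER, city_name TEXT)")
conn.execute("INSERT INTO region_city_local VALUES (13, 'Beijing')")

store = CachedQuery(conn)
sql = "SELECT city_id, city_name FROM region_city_local WHERE city_id = ?"
key = "region_city_local:{13}"

before = store.query("load_city", key, sql, (13,))  # task not yet run
store.run_task("load_city", key, sql, (13,))
after = store.query("load_city", key, sql, (13,))   # task complete
print(before[1], after[1])
```

In a real deployment the acquisition task would run on a separate thread or scheduler; the sketch keeps it synchronous so that the state transition is easy to follow.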
The present invention further provides a data query apparatus based on data loading storage space for implementing a data query method based on data loading storage space, and the following provides a fifth embodiment of the present invention to illustrate a specific structure of the apparatus, as shown in fig. 5.
The data query device based on the data loading storage space comprises:
A storage space 1, comprising a plurality of storage blocks for storing data.
The storage space referred to here is a temporary file exchange area: the computer copies the most frequently used files out of storage and holds them temporarily in the cache, much as tools and materials are moved onto a workbench, which is more convenient than fetching them from the warehouse each time they are needed. Because the temporary file exchange area usually uses volatile storage, whose contents are lost when power is cut, data is stored in Key-Value form.
A storage space management unit 2, configured to divide the storage space into storage blocks.
The storage blocks may be divided according to service type. One storage block stores data of one or several service types.
The data acquisition tasks can also be grouped according to the relevance of the data: data from tables of the same kind may be placed in the same data acquisition task group, and data from associated tables may likewise be placed in the same data acquisition task group. The data acquisition tasks of one data acquisition task group correspond to the same storage block.
A task configuration unit 3, configured to configure the data acquisition tasks and the correspondence between each data acquisition task and the storage blocks divided by the storage space management unit.
The method for the task configuration unit to configure the data acquisition task specifically comprises the following steps:
configuring the name of the data acquisition task, storage array information, execution command information, the execution period and times of the data acquisition task and the data volume of the execution operation;
and configuring the data acquisition task group according to the relevance of the data acquisition tasks, wherein the data acquisition tasks in the same group correspond to the same storage block.
Each storage block of the storage space, that is, of the temporary file exchange area, stores data in Key-Value form. Data is therefore stored in a storage block simply as arrays, and both the key of each array and the command used to acquire its data can be specified when the data acquisition task is configured.
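A data acquisition task configuration of the kind described above might look like the following sketch. All field names, units, and example tasks (including the `alarm_level` table) are hypothetical; the patent lists the configuration items but does not prescribe a concrete format, and naming each storage block after its task group is an assumption of this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquisitionTask:
    name: str            # name of the data acquisition task
    array_key: str       # key of the storage array to fill
    command: str         # command executed to acquire the data
    period_seconds: int  # execution period
    times: int           # number of executions
    batch_size: int      # data volume handled per execution
    group: str           # task group; one group maps to one storage block


def map_groups_to_blocks(tasks):
    """Tasks in the same group share one storage block (named after the group)."""
    blocks = defaultdict(list)
    for task in tasks:
        blocks[task.group].append(task.name)
    return dict(blocks)


tasks = [
    AcquisitionTask("load_city_by_id", "region_city_local:{13}",
                    "SELECT * FROM region_city_local WHERE city_id = 13",
                    3600, 1, 1000, "region"),
    AcquisitionTask("load_city_by_name", "region_city_local:{Beijing}",
                    "SELECT * FROM region_city_local WHERE city_name = 'Beijing'",
                    3600, 1, 1000, "region"),
    AcquisitionTask("load_alarm_levels", "alarm_level:{all}",
                    "SELECT * FROM alarm_level",
                    600, 1, 500, "alarm"),
]

blocks = map_groups_to_blocks(tasks)
print(blocks)
```

Grouping by a single `group` field keeps the task-to-block correspondence explicit, so related table data ends up in the same storage block.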
A task execution unit 4, configured to create a data storage array in the storage block corresponding to the data acquisition task, according to the data acquisition task configured by the task configuration unit; to execute the data acquisition task to acquire data and load the acquired data into the corresponding storage array in the corresponding storage block according to the pattern configured in the data acquisition task; and to update the data loading state.
A data query unit 5, configured to query the data loading state and, depending on that state, either execute the data query task directly or obtain the data from a data storage array in a storage block of the storage space.
When a data query task is received, if the data to be queried has not yet been loaded into a storage block, that is, the data acquisition task has not completed, the data cannot be found in the storage block. In that case the data query task can be issued directly to the target database table and the data query command executed to obtain the target data. Once the query has completed and the target data has been obtained, the query result and the data query task can together be used to generate a data acquisition task, and the result is loaded into a specified storage block for the next query. If the data acquisition task has completed, the data can be obtained directly from the storage array in the storage block. Whether this query mode is used can be configured through the "execution mode" item of the data acquisition task.
A data persistence unit 6, configured to persistently store the data in each storage block divided by the storage space management unit, and to load the persisted data back into the storage in a single pass when the storage is restarted.
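The behaviour of the data persistence unit can be sketched as below. The JSON file format, the file name `blocks.json`, and the helper names are assumptions for illustration only; the patent does not specify how the storage blocks are persisted.

```python
import json
import os
import tempfile


def persist_blocks(blocks, path):
    """Write all storage blocks to disk, swapping the file in atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(blocks, f)
    os.replace(tmp, path)


def restore_blocks(path):
    """Load every persisted storage block in one pass; empty if nothing was saved."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical content: block name -> {array key -> rows}.
blocks = {"region": {"region_city_local:{13}": [[13, "Beijing"]]}}

path = os.path.join(tempfile.mkdtemp(), "blocks.json")
persist_blocks(blocks, path)
restored = restore_blocks(path)
print(restored == blocks)
```

Writing to a temporary file and renaming it avoids leaving a half-written snapshot behind if the process stops during the write.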
The working principle of the invention is as follows: the storage space management unit 2 divides the storage space into storage blocks, and divides the storage blocks into levels, according to service attributes; the task configuration unit 3 configures the data acquisition tasks; the task execution unit 4 executes the data acquisition tasks, creates the storage arrays, and loads the acquired data into the storage arrays in the storage blocks as configured. A data user can then query directly by the key of an array and quickly obtain the required data, which achieves high-speed, efficient data query.
To describe the structure of each part of the data query device based on loading data into the storage space in detail, the following provides a sixth embodiment of the present invention, as shown in fig. 6.
A storage space 1, comprising a plurality of storage blocks for storing data.
The storage space management unit 2 further includes:
A storage block dividing module 21, configured to divide the storage blocks according to service attributes and to identify the service class of the data stored in each storage block.
A storage level dividing module 22, configured to divide the storage blocks produced by the storage block dividing module into three levels according to application needs, namely an in-heap cache, an out-of-heap cache, and disk storage, and to load data into storage blocks of different levels according to the usage heat of the acquired data.
A task configuration unit 3, configured to configure the data acquisition tasks and the correspondence between each data acquisition task and the storage blocks divided by the storage space management unit.
The task execution unit 4 further includes:
A storage array creating module 41, configured to create a data storage array in the corresponding storage block according to the storage array information configured by the task configuration unit and the correspondence between the data acquisition task and the storage blocks in the storage space.
A data obtaining module 42, configured to execute the data acquisition task according to the execution command information, execution period, number of executions, execution mode, and data volume per execution configured by the task configuration unit, so as to obtain the target data.
A data loading module 43, configured to load the data acquired by the data obtaining module into the corresponding storage array, created by the storage array creating module, in the corresponding storage block, according to the pattern configured by the task configuration unit.
The data query unit 5 further includes:
A data loading state query module 51, configured to query the data loading state.
A query task execution module 52, configured to execute the data query task directly on the target database of the query task when the data loading state query module reports that data loading is incomplete.
A data query module 53, configured to obtain the data from the data storage array in the storage block of the storage space when the data loading state query module reports that data loading is complete.
A data persistence unit 6, configured to persistently store the data in each storage block divided by the storage space management unit, and to load the persisted data back into the storage in a single pass when the storage is restarted.
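The level selection performed by the storage level dividing module 22 described above can be illustrated with the following sketch. The patent does not say how usage heat is measured or where the level boundaries lie, so the access-count metric and the thresholds `hot` and `warm` are purely hypothetical; the code models only the placement decision, not actual in-heap or out-of-heap memory management.

```python
from enum import Enum


class Level(Enum):
    IN_HEAP = "in-heap cache"
    OUT_OF_HEAP = "out-of-heap cache"
    DISK = "disk storage"


def choose_level(access_count, hot=100, warm=10):
    """Pick a storage level from an access count (thresholds are assumptions)."""
    if access_count >= hot:
        return Level.IN_HEAP
    if access_count >= warm:
        return Level.OUT_OF_HEAP
    return Level.DISK


def place(usage):
    """Assign each array key to a level based on its observed usage heat."""
    levels = {level: [] for level in Level}
    for key, count in usage.items():
        levels[choose_level(count)].append(key)
    return levels


# Hypothetical access counts per storage array key.
usage = {
    "region_city_local:{13}": 250,
    "region_city_local:{Beijing}": 40,
    "alarm_level:{all}": 2,
}
placement = place(usage)
for level, keys in placement.items():
    print(level.value, keys)
```

Keeping the thresholds as parameters lets the boundaries be tuned to application needs, in line with the description's statement that the levels are divided according to application requirements.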
The device of the present invention implements the query method based on data loading storage space, so the working principle of each part is similar to that of the method. Reference may be made to the description of the method and of the application example, which is not repeated here.
Although the embodiments of the present invention have been described, the description is not intended to limit the scope of the invention. Workers skilled in the art will recognize that changes may be made in form and detail without departing from the spirit and scope of the disclosure. The scope of the present invention is defined by the appended claims.

Claims (10)

1. The data query method based on the data loading storage space is characterized by comprising the following steps:
dividing a storage space into storage blocks, wherein the storage space is a temporary file exchange area, configuring a data acquisition task and configuring the corresponding relation between the data acquisition task and the storage blocks; dividing the storage blocks according to the service attributes, and identifying the service types of the data stored in the storage blocks; dividing the storage blocks into three layers according to application requirements, specifically, an in-heap cache, an out-of-heap cache and a disk storage, and loading data into the storage blocks of different levels according to the use heat of the acquired data;
creating a data storage array in a storage block corresponding to the data acquisition task according to the configured data acquisition task;
executing the configured data acquisition task to acquire data, loading the acquired data into a corresponding storage array in a corresponding storage block according to a pattern configured in the data acquisition task, and updating a data loading state;
and directly executing a data query task according to the data loading state, or querying to obtain data in a data storage array in the storage space storage block.
2. The method of claim 1, further comprising:
and carrying out persistent storage on the data in each storage block, and loading the data in the persistent storage into the storage once when the storage is started again.
3. The method according to claim 2, wherein the configuration data acquisition task is specifically:
configuring the name of the data acquisition task, storage array information, execution command information, the execution period, the execution times, the execution mode and the data volume of the execution operation of the data acquisition task;
and configuring the data acquisition task group according to the relevance of the data acquisition tasks, wherein the data acquisition tasks in the same group correspond to the same storage block.
4. The method of claim 3, wherein:
creating a data storage array in the corresponding storage block according to the configured storage array information and the corresponding relation between the data acquisition task and the storage block;
executing the data acquisition task according to the configured execution command information, the execution period, the execution times, the execution mode and the data volume of the execution operation to acquire target data;
and loading the acquired target data into a corresponding storage array in the created storage block corresponding to the data acquisition task according to the configuration style of the data acquisition task.
5. The method according to claim 4, wherein the method for directly performing a data query task according to the data loading status or obtaining data in the data storage array in the storage space storage block by querying specifically comprises:
when the data loading state is incomplete, directly executing a query task on a target database, and loading data acquired by executing the query task into a storage block;
and when the data loading state is complete, querying and obtaining the data in the data storage array in the storage space storage block.
6. A data query apparatus for loading data into a storage space, the apparatus comprising:
the storage space is a temporary file exchange area and comprises a plurality of storage blocks for storing data;
the storage space management unit is used for managing the storage space and dividing the storage space into storage blocks;
the task configuration unit is used for configuring a data acquisition task and configuring the corresponding relation between the data acquisition task and the storage block divided by the storage space management unit;
the task execution unit is used for creating a data storage array in a storage block corresponding to the data acquisition task according to the data acquisition task configured by the task configuration unit; executing the data acquisition task to acquire data, and loading the acquired data into a corresponding storage array in a corresponding storage block according to a pattern configured in the data acquisition task; updating the data loading state;
the data query unit is used for querying the data loading state, directly executing a data query task according to the data loading state, or querying and obtaining data in a data storage array in the storage space storage block;
the storage space management unit further includes:
the storage block dividing module is used for dividing the storage blocks according to the service attributes and identifying the service types of the storage data of the storage blocks;
the storage level dividing module is used for dividing the storage blocks divided by the storage block dividing module into three layers according to the application requirements, specifically, an in-heap cache, an out-of-heap cache and a disk storage;
and the storage space management module is used for managing the storage blocks divided by the storage block dividing module according to the relationship between the data acquisition task configured by the task configuration unit and the storage blocks.
7. The apparatus of claim 6, further comprising:
and the data persistence unit is used for persistently storing the data in each storage block divided by the storage space management unit and loading the persistently stored data into the storage at one time when the storage is restarted.
8. The apparatus according to claim 7, wherein the method for the task configuration unit to configure the data acquisition task specifically is:
configuring the name of the data acquisition task, storage array information, execution command information, the execution period and times of the data acquisition task and the data volume of the execution operation;
and configuring the data acquisition task group according to the relevance of the data acquisition tasks, wherein the data acquisition tasks in the same group correspond to the same storage block.
9. The apparatus of claim 8, wherein the task execution unit further comprises:
the storage array creating module is used for creating a data storage array in the corresponding storage block according to the storage array information configured by the task configuration unit and the corresponding relation between the data acquisition task and the storage block in the storage space;
the data acquisition module is used for executing the data acquisition task according to the execution command information configured by the task configuration unit, the execution period, the execution times, the execution mode and the data volume of the execution operation to acquire target data;
the data loading module is used for loading the data acquired by the data acquisition module into a corresponding storage array in the corresponding storage block created by the storage array creation module according to the configuration style configured by the task configuration unit;
and the data loading module loads data into the storage blocks of different grades divided by the storage grade division module according to the use heat of the acquired data.
10. The apparatus of claim 9, wherein the data query unit further comprises:
the data loading state query module is used for querying the data loading state;
the query task execution module is used for directly executing a data query task to a target database of the query task according to the data loading uncompleted state obtained by the query of the data loading state query module;
and the data query module is used for querying and obtaining the data in the data storage array in the storage space storage block according to the data loading completion state obtained by querying of the data loading state query module.
CN201410828712.2A 2014-12-26 2014-12-26 Data query method and device based on data loading storage space Active CN105786918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410828712.2A CN105786918B (en) 2014-12-26 2014-12-26 Data query method and device based on data loading storage space

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410828712.2A CN105786918B (en) 2014-12-26 2014-12-26 Data query method and device based on data loading storage space

Publications (2)

Publication Number Publication Date
CN105786918A CN105786918A (en) 2016-07-20
CN105786918B true CN105786918B (en) 2020-08-04

Family

ID=56388624

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410828712.2A Active CN105786918B (en) 2014-12-26 2014-12-26 Data query method and device based on data loading storage space

Country Status (1)

Country Link
CN (1) CN105786918B (en)


Also Published As

Publication number Publication date
CN105786918A (en) 2016-07-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant