CN114372077A - Performance index data retrieval method and device, electronic equipment and storage medium - Google Patents

Publication number
CN114372077A
Authority
CN
China
Prior art keywords
target
data
compressed
index data
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111681305.XA
Other languages
Chinese (zh)
Inventor
黄践焜
卢国庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Communication Information System Co Ltd
Original Assignee
Inspur Communication Information System Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Communication Information System Co Ltd filed Critical Inspur Communication Information System Co Ltd
Priority to CN202111681305.XA priority Critical patent/CN114372077A/en
Publication of CN114372077A publication Critical patent/CN114372077A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/245Query processing

Abstract

The invention provides a performance index data retrieval method, a device, electronic equipment and a medium. The method comprises the following steps: determining an SQL statement input by a target user; parsing the SQL statement with a preset structured query language execution engine to obtain a target field to be retrieved; determining, by the structured query language execution engine, the compressed field name and position information corresponding to the target field in a preset metadata table, and replacing the target field with a calling function; and extracting decompressed target data according to the calling function, the compressed field name and the position information, and returning the target data to the target user. In the data retrieval method provided by the invention, the performance index data that is not used in predicate filtering is compressed into multiple columns, so that data retrieval can be completed within a single data table, which improves data retrieval efficiency and user experience.

Description

Performance index data retrieval method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data retrieval technologies, and in particular, to a performance index data retrieval method and apparatus, an electronic device, and a storage medium.
Background
With the continuous development and perfection of big data ecosystems, the processing efficiency of data retrieval is more and more important.
At present, databases with high concurrency and low latency are mostly used to improve the efficiency of data retrieval. For example, the Voltdb database is a memory-based relational database: data is stored in memory, and under high concurrency (on the order of 100 concurrent requests) it can return the required data with millisecond-level response times. However, when the data size is large, for example 1 billion rows with 5000 columns, retrieving the data directly with Voltdb runs into two problems. First, a single Voltdb data table cannot have more than 1024 columns, so 5000 columns of data are usually split across several data tables and retrieved by associated (join) queries, which significantly reduce the query efficiency of Voltdb. Second, Voltdb does not compress stored data by default, so memory consumption is large. As a result, processing efficiency is low and user experience suffers.
Disclosure of Invention
The invention provides a performance index data retrieval method and device, electronic equipment and a storage medium, which are used for solving the technical problem in the prior art that, when the data volume is large, splitting the data across multiple tables seriously reduces retrieval and query efficiency and leads to a poor user experience.
In a first aspect, the present invention provides a performance index data retrieval method, including:
determining an SQL statement input by a target user;
analyzing the SQL statement according to a preset structured query language execution engine to acquire a target field to be retrieved;
determining a compressed field name and position information corresponding to the target field to be retrieved through the structured query language execution engine in a preset metadata table, and changing the target field to be retrieved into a calling function;
and extracting decompressed target data and returning the target data to the target user according to the calling function, the compressed field name and the position information.
Further, according to the performance index data retrieval method provided by the present invention, the structured query language execution engine is composed of a user-defined function and a Java driver,
correspondingly, the extracting decompressed target data according to the calling function, the compressed field name and the position information and returning the decompressed target data to the target user includes:
the Java driver program determines target data corresponding to a target field to be retrieved according to the compressed field name and the position information;
decompressing the target data, extracting the decompressed target data from a performance index data table according to the calling function, and returning the decompressed target data to the target user;
wherein the calling function is determined by the user-defined function.
Further, according to the performance index data retrieval method provided by the present invention, the structured query language execution engine is composed of a user-defined function and a Java driver,
correspondingly, the decompressing the target data, extracting the decompressed target data from the performance index data table according to the call function, and returning the extracted target data to the target user includes:
updating the target data, decompressing the updated target data, and returning the decompressed target data to the Java driver;
and the Java driver returns the decompressed target data to the target user.
Further, according to the performance index data retrieval method provided by the present invention, before the determining the SQL statement input by the target user, the method includes:
determining a data table to be processed;
and determining a plurality of pieces of target index data according to the frequency of data query in the to-be-processed data table, and compressing and storing the plurality of pieces of target index data according to a preset storage mode.
Further, according to the performance index data retrieval method provided by the present invention, the compressing and storing the plurality of pieces of target index data according to a preset storage manner includes:
grouping the target index data according to set grouping information to obtain a plurality of groups;
determining a plurality of index values for the groups, and sequentially compressing target index data in the groups according to the determined index values to obtain a plurality of compressed data;
storing the plurality of compressed data into the performance index data table as a plurality of columns.
Further, according to the performance index data retrieval method provided by the present invention, after the target index data are compressed and stored in a preset storage manner, the method includes:
determining a compression field name of the plurality of compressed data;
and determining the metadata structure information of the plurality of compressed field names according to the compressed field names and the set arrangement sequence, and storing the metadata structure information into a metadata table.
Further, according to the performance index data retrieval method provided by the present invention, after determining the metadata structure information of the plurality of compressed field names, the method includes:
under the condition that new target index data are added at the tail ends of a plurality of target index data corresponding to a first compressed field name, adding one to the metadata structure information of the first compressed field name;
under the condition that target index data corresponding to the first compressed field name is deleted, determining the target index data as junk information;
wherein the first compressed field name is any one of the plurality of compressed field names.
In a second aspect, the present invention further provides a performance index data retrieving apparatus, including:
the determining module is used for determining the SQL statement input by the target user;
the analysis module is used for analyzing the SQL statement according to a preset structured query language execution engine to acquire a target field to be retrieved;
the determining and changing module is used for determining the compressed field name and the position information corresponding to the target field to be retrieved through the structured query language execution engine in a preset metadata table, and changing the target field to be retrieved into a calling function;
and the extraction module is used for extracting decompressed target data and returning the decompressed target data to the target user according to the calling function, the compressed field name and the position information.
In a third aspect, the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of any of the above performance index data retrieval methods when executing the program.
In a fourth aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the performance index data retrieval method as described in any of the above.
In a fifth aspect, the present invention also provides a computer program product comprising computer executable instructions for implementing the steps of the performance indicator data retrieval method according to any one of the above.
The invention provides a performance index data retrieval method and device, electronic equipment and a medium. An SQL statement input by a target user is determined; the SQL statement is parsed by a preset structured query language execution engine to obtain a target field to be retrieved; the structured query language execution engine determines the compressed field name and position information corresponding to the target field in a preset metadata table and replaces the target field with a calling function; and decompressed target data is extracted according to the calling function, the compressed field name and the position information and returned to the target user. In the data retrieval method provided by the invention, the performance index data that is not used in predicate filtering is compressed into multiple columns, so that data retrieval can be completed within a single data table, which improves data retrieval efficiency and user experience.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a performance index data retrieval method provided by the present invention;
FIG. 2 is a schematic overall flow chart of a performance index data retrieval method for use with the present invention;
FIG. 3 is a schematic structural diagram of a performance index data retrieval apparatus according to the present invention;
FIG. 4 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a schematic flow chart of a performance index data retrieval method provided by the present invention, and as shown in fig. 1, the performance index data retrieval method provided by the present invention specifically includes the following steps:
Step 101: determining the SQL statement input by the target user.
In this embodiment, a search keyword input by a target user at a client needs to be obtained, and then a corresponding SQL statement is determined according to the input search keyword. The search keywords may be keywords such as "payroll", "XX bill", and the like, where the client may be a mobile terminal used by the target user, or may also be a PC client used by the target user, and are not limited specifically herein.
It should be noted that an SQL (Structured Query Language) statement is a database Query and programming Language for accessing data and querying, updating and managing a relational database system, and is essentially a Language for operating on a database.
Step 102: analyzing the SQL statement according to a preset structured query language execution engine to obtain a target field to be retrieved.
In this embodiment, the obtained SQL statement needs to be parsed by a preset structured query language execution engine, and a target field to be retrieved is determined from a parsing result. Wherein, a field refers to a certain characteristic in the document, namely a data item, and has a unique field identifier for computer identification. It should be noted that, the SQL statement is analyzed by a mature analysis method in the prior art, which is not described in detail here.
It should be noted that, the Structured Query Language (Structured Query Language) execution engine is a layer of SQL execution engine added in front of the Voltdb database, and processes the SQL statement to be executed and translates the SQL statement into an SQL statement that can be correctly recognized by the underlying Voltdb database.
It should be noted that, according to the determined target field, a target data table related to the content of the SQL statement is further determined from the database. If the data corresponding to the target field to be retrieved is compressed data, its compressed field name and position information are recorded in the metadata table; if it is uncompressed data, it is stored directly in a column of the performance index data table. The database may be, for example, a Voltdb database, a conventional RDBMS database, or an HBase database. This embodiment uses the Voltdb database, a memory-based relational database; the database may be selected according to the actual needs of the user and is not specifically limited here.
Step 103: determining the compressed field name and the position information corresponding to the target field to be retrieved through the structured query language execution engine in a preset metadata table, and changing the target field to be retrieved into a calling function.
In this embodiment, a structured query language execution engine is required to determine a compressed field name and location information corresponding to a target field to be retrieved in a preset metadata table, and then the structured query language execution engine is used to change the target field to be retrieved into a calling function for extracting target data. It should be noted that the determination manner of the calling function is determined by using a user-defined function, and the following embodiments can be specifically seen, which are not described in detail herein.
It should be noted that the metadata table is constructed in advance, when multiple columns of data in the performance index data table are compressed. It gives, for each target field to be retrieved, the compressed field name and the position information within the performance index data table. For example, if the compressed field name is index array 4 and the position information is 16, the target field is stored in the 4th compressed index array of the performance index data table and is the 16th entry of that array, so the data at position 16 of compressed index array 4 is determined to be the target data.
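The lookup described above can be sketched as follows. This is only an illustration in Python, not the Java driver of the embodiment; the field names, the dictionary layout and the 1-based position convention are assumptions introduced for the example.

```python
# Hypothetical metadata table: original field -> (compressed field name, position).
# The field names and the 1-based position convention are assumptions.
METADATA = {
    "rx_bytes": ("index_array_4", 16),
    "tx_bytes": ("index_array_4", 17),
}


def locate(field):
    """Return (compressed_field_name, position), or None for an uncompressed field."""
    return METADATA.get(field)


def extract(row, field):
    """Pick the value of `field` from a row whose compressed columns are
    already decoded into Python lists."""
    entry = locate(field)
    if entry is None:
        return row[field]  # stored as an ordinary column
    compressed_name, position = entry
    return row[compressed_name][position - 1]  # convert 1-based to 0-based


# A row with one ordinary column and one decoded compressed column of 20 entries.
row = {"object_id": "cell-001", "index_array_4": [str(i) for i in range(1, 21)]}
print(extract(row, "rx_bytes"))   # the 16th entry of index_array_4
print(extract(row, "object_id"))  # an ordinary, uncompressed column
```

Keeping the position in the metadata table rather than in the data lets the same compressed column serve many original fields without any change to the physical table structure.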
Step 104: extracting decompressed target data according to the calling function, the compressed field name and the position information, and returning the target data to the target user.
In this embodiment, it is necessary to determine target data according to the calling function, the compression field name, and the location information determined in step 103, decompress the target data, extract the target data according to the calling function, and return the target data to the target user. It should be noted that, in this embodiment, the decompressed target data is sent to the target user through a preset structured query language execution engine.
It should be noted that the target data is decompressed by means of a user-defined function. For example, the decompression function decompress converts compressed data of type varbinary into varchar data in which the individual values are separated by "|". Similarly, when the target data is compressed, the compression function compress may be used to convert varchar data, in which the values are separated by "|", into the varbinary type.
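A minimal sketch of such a compress/decompress pair is given below in Python. The source does not name the compression algorithm, so zlib is used here purely as an assumption, and Python bytes and str stand in for the varbinary and varchar types.

```python
import zlib


def compress(values_text: str) -> bytes:
    """Turn a '|'-separated string of values into a binary blob.
    zlib is an assumed codec; the source does not specify one."""
    return zlib.compress(values_text.encode("utf-8"))


def decompress(blob: bytes) -> str:
    """Inverse of compress: binary blob back to the '|'-separated string."""
    return zlib.decompress(blob).decode("utf-8")


packed = compress("12|3.5|0|87")
values = decompress(packed).split("|")
print(values)
```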
It should be noted that the target data is the performance index data after decompression. Because the performance index data table contains compressed columns, an application that operates on the data table in the Voltdb database directly with SQL cannot obtain the expected result. The application would first have to fetch the compressed field data, decompress it in its own memory space, pick out the required original index field data, and finally perform filtering, sorting, calculation and similar operations on that data itself. This sharply increases the implementation complexity of the application and reduces performance: where only a single column of results would need to be returned to the application, the data of the whole compressed field is transmitted instead.
It should be noted that the SQL execution engine is divided into two parts, namely a User-Defined Function (UDF) in the Voltdb database and a customized Java driver. The Java driver is responsible for translating DDL, DML and DQL statements, and it extends the SQL grammar supported by the Voltdb database. The statement types are explained as follows:
a) CREATE TABLE of DDL statement
When a table creation statement gives a field definition, a COMPRESS INTO compressed-column-name clause may be added at the end of the field definition, where compressed-column-name is the compressed field name. Fields that specify the same compressed-column-name are stored in that compressed field in the order in which they are declared in the DDL statement, and the information of the compressed field is written into the metadata table according to the metadata structure information.
It should be noted that a field that specifies the COMPRESS INTO clause cannot be used to create an index, as a primary key, as a partition field, or as a TTL field.
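The translation of field definitions that carry a COMPRESS INTO clause might look like the sketch below. It is written in Python for illustration only; the metadata keys other than org_col_stat, the meaning "1 = active" for org_col_stat, the 1-based index and the VARBINARY size are assumptions, and the input is a pre-split list of field definitions rather than a full DDL parse.

```python
import re

# Matches "<name> <type> COMPRESS INTO <compressed-column-name>".
COMPRESS_RE = re.compile(
    r"^(?P<name>\w+)\s+(?P<type>.+?)\s+COMPRESS\s+INTO\s+(?P<target>\w+)$",
    re.IGNORECASE,
)


def translate_columns(column_defs, varbinary_size=4096):
    """Split field definitions into real table columns and metadata records.
    varbinary_size is an arbitrary illustrative value."""
    real_columns = []
    metadata = []
    next_index = {}  # compressed column -> last position handed out
    for definition in column_defs:
        definition = definition.strip()
        match = COMPRESS_RE.match(definition)
        if match is None:
            real_columns.append(definition)  # ordinary column, kept as-is
            continue
        target = match.group("target")
        if target not in next_index:
            next_index[target] = 0
            # The physical table only gets one binary column per group.
            real_columns.append(f"{target} VARBINARY({varbinary_size})")
        next_index[target] += 1
        metadata.append({
            "org_col_name": match.group("name"),
            "org_col_type": match.group("type"),
            "compressed_col": target,
            "org_col_index": next_index[target],  # declaration order
            "org_col_stat": 1,  # assumed: 1 = active, 0 = deleted
        })
    return real_columns, metadata


columns = [
    "object_id VARCHAR(64) NOT NULL",
    "sample_time TIMESTAMP",
    "kpi_a INTEGER COMPRESS INTO c1",
    "kpi_b FLOAT COMPRESS INTO c1",
]
real, meta = translate_columns(columns)
print(real)
print([(m["org_col_name"], m["org_col_index"]) for m in meta])
```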
b) ALTER TABLE of DDL statement
When the metadata table structure information needs to be modified, only the information of fields that need to be compressed is added, modified or deleted; other field information does not need to be processed. When a field to be compressed is added, the information of the corresponding column is appended to the metadata structure information according to the compressed-column-name specified in the COMPRESS INTO clause; the real data table structure does not need to be modified, and the BEFORE keyword is not supported in this case. When the data type of a field to be compressed is modified, the data in the metadata structure information is modified directly. When a field to be compressed is deleted, the org_col_stat field of the corresponding record in the metadata structure information is set to 0.
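The three metadata operations can be sketched as below, in Python for illustration. Apart from org_col_stat, the record keys are hypothetical, as is the convention that a deleted field keeps its position so that the positions of the remaining fields do not shift.

```python
def add_field(metadata, name, col_type, compressed_col):
    """Append a new original field at the end of a compressed column."""
    used = [m["org_col_index"] for m in metadata
            if m["compressed_col"] == compressed_col]
    entry = {
        "org_col_name": name,
        "org_col_type": col_type,
        "compressed_col": compressed_col,
        "org_col_index": max(used, default=0) + 1,  # next free position
        "org_col_stat": 1,  # assumed: 1 = active
    }
    metadata.append(entry)
    return entry


def _find(metadata, name):
    for record in metadata:
        if record["org_col_name"] == name:
            return record
    raise KeyError(name)


def modify_type(metadata, name, new_type):
    """Only the metadata changes; the physical table is untouched."""
    _find(metadata, name)["org_col_type"] = new_type


def drop_field(metadata, name):
    """Mark the field as deleted by setting org_col_stat to 0."""
    _find(metadata, name)["org_col_stat"] = 0


meta = []
add_field(meta, "kpi_a", "INTEGER", "c1")
add_field(meta, "kpi_b", "FLOAT", "c1")
modify_type(meta, "kpi_b", "BIGINT")
drop_field(meta, "kpi_a")
new_entry = add_field(meta, "kpi_c", "INTEGER", "c1")
print(new_entry["org_col_index"])
```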
c) CREATE VIEW of DDL statement
CREATE VIEW mainly processes the SELECT clause in the statement. The processing follows that of the DQL statement and is not described separately.
d) SELECT of DQL statements
The processing of the SELECT statement is described in several cases:
i. For the case of SELECT *, that is, retrieving all compressed fields, the compressed data is returned directly to the Java database connectivity client, and the client decompresses it in a single pass, which relieves the server of decompressing the individual fields one after another.
ii. For the case of SELECT column-name, the compressed data is decompressed by a fetch_as_xxx function and the value corresponding to the original field is extracted. The rewritten SQL statement has the form SELECT fetch_as_int(compressed-column-name, org-column-index) AS column-name.
iii. For the WHERE clause, the ORDER BY clause, the GROUP BY clause and the HAVING clause, a fetch_as_xxx function is used to convert the compressed data into a basic type, which is then operated on.
iv. If table-reference is a subquery, the SQL in the subquery is processed first.
v. If several data tables are associated, the SQL statements of the individual data tables are processed separately.
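Case ii above can be sketched as follows. The example is in Python for illustration and rewrites only a simple select list rather than parsing arbitrary SQL; the metadata entries, the table name and the fetch_as_float variant (one possible instance of the fetch_as_xxx pattern) are assumptions.

```python
# Hypothetical metadata: original field -> (compressed column, position, type suffix).
METADATA = {
    "kpi_a": ("c1", 1, "int"),
    "kpi_b": ("c1", 2, "float"),
}


def rewrite_select_item(column):
    """Replace a compressed original field with a fetch_as_xxx call."""
    entry = METADATA.get(column)
    if entry is None:
        return column  # ordinary column, left unchanged
    compressed_col, index, kind = entry
    return f"fetch_as_{kind}({compressed_col}, {index}) AS {column}"


def rewrite_select(columns, table):
    """Build the translated SELECT statement for a plain column list."""
    items = ", ".join(rewrite_select_item(c) for c in columns)
    return f"SELECT {items} FROM {table}"


sql = rewrite_select(["object_id", "kpi_a"], "perf_index")
print(sql)
```

The alias keeps the result set column named after the original field, so the caller does not need to know that the value came out of a compressed column.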
e) INSERT of DML statement
<1> For the column-name list in the INTO clause, the fields to be compressed need to be replaced by the compressed field name.
<2> For the first form, which uses a value-expression, the Java database connectivity driver on the client side fills the corresponding data into an array in the order of the original fields within the compressed data, converts the array into a character string, compresses it, and fills the compressed data into the translated value-expression clause as the value of the compressed field.
<3> For the second form, which uses a select-expression, reference is made to the description of the DQL statement.
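Rule <2> can be sketched as follows, in Python for illustration. The layout dictionary, the column names, the use of zlib and the choice of an empty string for a missing value are all assumptions; the embodiment performs this step inside the Java driver.

```python
import zlib

# Hypothetical layout: compressed column -> original fields in declared order.
LAYOUT = {"c1": ["kpi_a", "kpi_b", "kpi_c"]}


def translate_insert(row):
    """Return the column -> value mapping actually sent to the database."""
    translated = {}
    consumed = set()
    for compressed_col, fields in LAYOUT.items():
        # Fill the values in declared order; a missing value becomes "".
        values = ["" if row.get(f) is None else str(row[f]) for f in fields]
        text = "|".join(values)
        translated[compressed_col] = zlib.compress(text.encode("utf-8"))
        consumed.update(fields)
    # Ordinary columns pass through unchanged.
    result = {k: v for k, v in row.items() if k not in consumed}
    result.update(translated)
    return result


out = translate_insert(
    {"object_id": "cell-001", "kpi_a": 7, "kpi_b": 1.5, "kpi_c": None}
)
print(sorted(out))
print(zlib.decompress(out["c1"]).decode("utf-8"))
```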
f) UPDATE of DML statement
The processing rules for the column-name = value-expression clause are as follows:
<1> If only one original field in a compressed data field is updated, column-name is replaced with compressed-column-name, and the value-expression after the equal sign is replaced with update_xxx_value(compressed-column-name, org-column-index, value-expression).
<2> If several original fields in one compressed data field are updated, the clauses belonging to the same compressed data field need to be merged into a single update of the following form:
compressed-column-name = compress(replace_string_value(replace_string_value(decompress(compressed-column-name), org-column-index1, value1), org-column-index2, value2)). That is, the compressed data field is first decompressed into a character string by the decompression function decompress, the values are replaced one by one through nested calls of the replace_string_value function, and the character string is finally compressed back into binary data by the compression function compress.
It should be noted that, for the processing of the WHERE clause, reference is made to the processing of the WHERE clause in the DQL statement, and details are not repeated here.
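The two UPDATE rules can be sketched as a small string-building helper, shown here in Python for illustration. The helper name and the tuple layout of the updates are hypothetical, and the name decompress is assumed for the decompression UDF.

```python
def rewrite_set_clause(compressed_col, updates):
    """Build the translated SET clause for one compressed column.

    updates: list of (org_column_index, value_sql, type_suffix) tuples,
    where value_sql is already a valid SQL literal or expression.
    """
    if len(updates) == 1:
        # Rule <1>: a single field uses the typed update_xxx_value UDF.
        index, value, kind = updates[0]
        return (f"{compressed_col} = "
                f"update_{kind}_value({compressed_col}, {index}, {value})")
    # Rule <2>: decompress once, nest one replacement per field, recompress.
    expr = f"decompress({compressed_col})"
    for index, value, _kind in updates:
        expr = f"replace_string_value({expr}, {index}, {value})"
    return f"{compressed_col} = compress({expr})"


single = rewrite_set_clause("c1", [(3, "0.25", "float")])
merged = rewrite_set_clause("c1", [(1, "10", "int"), (2, "'x'", "string")])
print(single)
print(merged)
```

Merging the updates means the blob is decompressed and recompressed only once per row, however many original fields are changed.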
g) DELETE of DML statements
For the DELETE statement, only the WHERE clause and the ORDER BY clause need to be processed, and the processing rule refers to the DQL statement and is not described in detail herein.
According to the performance index data retrieval method provided by the invention, an SQL statement input by a target user is determined; the SQL statement is parsed by a preset structured query language execution engine to obtain a target field to be retrieved; the structured query language execution engine determines the compressed field name and position information corresponding to the target field in a preset metadata table and replaces the target field with a calling function; and decompressed target data is extracted according to the calling function, the compressed field name and the position information and returned to the target user. Because the performance index data that is not used in predicate filtering is compressed into multiple columns, data retrieval can be completed within a single data table, which improves data retrieval efficiency and user experience.
In another embodiment of the present invention, the structured query language execution engine is comprised of user-defined functions and Java drivers,
correspondingly, the extracting decompressed target data according to the calling function, the compressed field name and the position information and returning the decompressed target data to the target user includes:
the Java driver program determines target data corresponding to a target field to be retrieved according to the compressed field name and the position information;
decompressing the target data, extracting the decompressed target data from a performance index data table according to the calling function, and returning the decompressed target data to the target user;
wherein the calling function is determined by the user-defined function.
In this embodiment, the Java driver determines the target data corresponding to the target field to be retrieved from the performance index data table according to the determined compressed field name and position information. The target data is then decompressed by a decompression function, and the decompressed target data is extracted from the performance index data table according to the calling function and returned to the target user. In the present embodiment, in order to hold all the performance index data in a single performance index data table, the performance index data that is not frequently used is compressed in advance; that is, the target data is compressed data.
It should be noted that the target data is decompressed by a predefined user-defined function to obtain the decompressed target data, and the target data is likewise compressed by a predefined user-defined function. The user-defined functions are listed in table 1 below. For example, the decompression function decompress in table 1 is used to decompress the target data, so as to obtain the decompressed target data.
TABLE 1
[Table 1, the list of user-defined functions, is provided only as images in the original publication: Figures BDA0003439756410000111, BDA0003439756410000121 and BDA0003439756410000131.]
It should be noted that, when the target data is compressed and stored by using the user-defined function, the performance index data needs to be stored in the Voltdb database in a manner that the network object and the time point are used as rows and the dimension and the index are used as columns, and a specific storage form is shown in table 2 below.
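TABLE 2
[Table 2, the storage form of the performance index data, is provided only as an image in the original publication: Figure BDA0003439756410000132.]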
According to the performance index data retrieval method provided by the invention, the target data corresponding to the target field to be retrieved is determined by the Java driver according to the compressed field name and the position information, the target data is decompressed, and the decompressed target data is extracted from the performance index data table according to the calling function and returned to the target user, so that the efficiency of performance index data retrieval can be improved, and the influence on query performance is reduced.
In another embodiment of the present invention, the structured query language execution engine is comprised of user-defined functions and Java drivers,
correspondingly, the decompressing the target data, extracting the decompressed target data from the performance index data table according to the call function, and returning the extracted target data to the target user includes:
updating the target data, decompressing the updated target data, and returning the decompressed target data to the Java driver;
and the Java driver returns the decompressed target data to the target user.
In this embodiment, the target data may also be updated, the updated target data is decompressed, the decompressed target data is returned to the Java driver, and the Java driver returns the decompressed target data to the target user. It should be noted that the decompression processing is performed in the Voltdb database, the UDF in the Voltdb database can perform custom processing on a single cell, and the UDF can be used to perform operations such as decompression, extraction, update and the like on target data in the Voltdb database server, so as to avoid transmitting the content of the whole compressed data back to the client for processing, and reduce the pressure of network IO and the CPU/memory pressure of the client.
For example, as shown in table 1, the floating point type data at the specified position in the compressed data is updated by using the update _ float _ value update function, and the new compressed data is returned and stored, and the updated compressed data is decompressed between subsequent applications, and the decompressed second target data is returned to the target user.
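The behavior of such an update function, together with a matching fetch function, can be sketched as below. This is a Python illustration of the semantics only, not the Voltdb UDF itself; the zlib codec, the "|" separator handling and the 1-based index are assumptions, and fetch_as_float is a hypothetical instance of the fetch_as_xxx pattern.

```python
import zlib


def fetch_as_float(blob: bytes, index: int) -> float:
    """Decompress the blob and return the value at the 1-based position."""
    parts = zlib.decompress(blob).decode("utf-8").split("|")
    return float(parts[index - 1])


def update_float_value(blob: bytes, index: int, value: float) -> bytes:
    """Return a new blob in which the value at the 1-based position is replaced."""
    parts = zlib.decompress(blob).decode("utf-8").split("|")
    parts[index - 1] = repr(float(value))
    return zlib.compress("|".join(parts).encode("utf-8"))


blob = zlib.compress(b"1.0|2.0|3.0")
new_blob = update_float_value(blob, 2, 9.5)
print(fetch_as_float(new_blob, 2))
```

Because both functions take and return whole blobs, they can run next to the data, and only the single extracted value needs to travel back to the client.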
In the performance index data retrieval method provided by the invention, the target data is updated, the updated target data is decompressed and returned to the Java driver, and the Java driver then returns the decompressed target data to the target user. This ensures the timeliness of the data and improves the accuracy of retrieval processing.
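By way of illustration only, the update of a floating-point value at a specified position in the compressed data can be sketched in Python as follows. The function name update_float_value comes from the description above; the binary layout (little-endian 8-byte floats compressed with zlib), the zero-based position, and the helper names compress_floats and decompress_floats are assumptions of this sketch, since the disclosure does not specify them. In the described system this logic would run as a UDF inside the VoltDB server rather than on the client.

```python
import struct
import zlib


def compress_floats(values):
    """Pack the values as little-endian 8-byte floats, then compress them."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return zlib.compress(raw)


def decompress_floats(blob):
    """Inverse of compress_floats: decompress, then unpack every 8-byte float."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def update_float_value(blob, position, new_value):
    """Return new compressed data with the float at `position` replaced."""
    values = decompress_floats(blob)
    values[position] = new_value
    return compress_floats(values)


# Store a compressed array, update one cell, and read it back.
stored = compress_floats([1.5, 2.5, 3.5])
stored = update_float_value(stored, 1, 9.0)
print(decompress_floats(stored))  # [1.5, 9.0, 3.5]
```

Only the new compressed value is written back; the caller never needs to receive the whole array to change one element.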
In another embodiment of the present invention, before the determining the SQL statement input by the target user, the method further comprises:
determining a data table to be processed;
and determining a plurality of pieces of target index data according to the frequency of data query in the to-be-processed data table, and compressing and storing the plurality of pieces of target index data according to a preset storage mode.
In this embodiment, a to-be-processed data table is first acquired. A plurality of pieces of target index data to be compressed are determined according to how frequently the data in the to-be-processed data table is queried, and the plurality of pieces of target index data are then compressed and stored according to a preset storage mode.
It should be noted that a single data table in the VoltDB database has an upper limit of 1024 fields, so the above data table storage method is not suitable for storing a large amount of data. Therefore, in this embodiment, the plurality of pieces of target data in the to-be-processed data table are compressed before data retrieval. Frequently accessed fields, such as the network object ID, the sampling time, the network object name, all statistical dimensions, and some of the most commonly used indexes, keep the existing storage mode. The index data that is not used frequently is divided into a plurality of groups according to the index information. When each group of index data is stored, the index information is placed into an array, the index data in the array is compressed, and the compressed binary data is finally stored in one column of the VoltDB database as varbinary field type data. The specific storage form is shown in Table 3 below.
TABLE 3 (storage form of the performance index data table; provided as an image in the original publication)
The target data before compression is shown in the form of a character string, as shown in table 4 below.
TABLE 4 (target data before compression, in character string form; provided as an image in the original publication)
The compressed target data is presented in hexadecimal form. The specific form is shown in table 5 below.
TABLE 5 (compressed target data, in hexadecimal form; provided as an image in the original publication)
In the performance index data retrieval method provided by the invention, the target data in the to-be-processed data table is compressed before the user inputs the retrieval keyword, which improves the processing efficiency of data retrieval and the user experience.
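By way of illustration only, the selection of fields by query frequency and the compressed storage of the remaining index data can be sketched in Python as follows. The query counts, the threshold, the field names (ne_id, sample_time, kpi_a and so on), the column name idx_array_1, and the use of a comma-separated string compressed with zlib are all assumptions of this sketch; the disclosure states only that the data before compression takes the form of a character string and that the compressed data is presented in hexadecimal form.

```python
import zlib


def split_by_query_frequency(query_counts, threshold):
    """Split field names into frequently and infrequently queried ones."""
    hot = [name for name, count in query_counts.items() if count >= threshold]
    cold = [name for name, count in query_counts.items() if count < threshold]
    return hot, cold


def build_stored_row(row, query_counts, threshold, compressed_field="idx_array_1"):
    """Keep hot fields as ordinary columns; compress cold fields into one column."""
    hot, cold = split_by_query_frequency(query_counts, threshold)
    stored = {name: row[name] for name in hot}
    text = ",".join(str(row[name]) for name in cold)  # string form before compression
    stored[compressed_field] = zlib.compress(text.encode("utf-8"))
    return stored, cold


# Hypothetical query statistics and one hypothetical row of the table to be processed.
query_counts = {"ne_id": 950, "sample_time": 990, "kpi_a": 12, "kpi_b": 3, "kpi_c": 7}
row = {"ne_id": 1001, "sample_time": "2021-12-28 00:00:00",
       "kpi_a": 12.5, "kpi_b": 0.98, "kpi_c": 37.0}

stored, cold_fields = build_stored_row(row, query_counts, threshold=100)
print(cold_fields)                  # fields that were moved into the compressed column
print(stored["idx_array_1"].hex())  # compressed binary data shown in hexadecimal form
```

The stored row has three columns instead of five; with hundreds of rarely queried indexes, the same approach keeps the table under the field limit.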
In another embodiment of the present invention, the compressing and storing the plurality of pieces of target index data according to a preset storage method includes:
grouping the target index data according to set grouping information to obtain a plurality of groups;
determining a plurality of index values for the groups, and sequentially compressing target index data in the groups according to the determined index values to obtain a plurality of compressed data;
storing the plurality of compressed data into the performance index data table as a plurality of columns.
In this embodiment, the plurality of pieces of target index data are grouped according to set grouping information to obtain a plurality of groups. The target index data in each group is then compressed in sequence according to the index values determined for the groups, yielding a plurality of pieces of compressed data, which are stored as columns in the performance index data table of the in-memory database. Each piece of compressed data contains at least two columns of target index data. The purpose of compressing the target index data is to keep all data in one data table, which saves data query time.
In the performance index data retrieval method provided by the invention, the plurality of pieces of target index data are grouped according to a preset grouping form, the target index data in each group is compressed according to the set index values, and the resulting compressed data is stored as columns in the performance index data table. This improves data retrieval efficiency and ensures the accuracy of data processing.
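By way of illustration only, the grouping and per-group compression can be sketched in Python as follows. The grouping information (a mapping from index name to group name), the group names idx_array_1 and idx_array_2, and the binary layout (little-endian 8-byte floats compressed with zlib) are assumptions of this sketch and are not specified in the disclosure.

```python
import struct
import zlib
from collections import defaultdict


def group_indexes(index_names, grouping_info):
    """Assign each index to its group, preserving the given order within a group."""
    groups = defaultdict(list)
    for name in index_names:
        groups[grouping_info[name]].append(name)
    return dict(groups)


def compress_groups(row, groups):
    """Compress each group's values into one binary column per group."""
    columns = {}
    for group_name, members in groups.items():
        raw = struct.pack(f"<{len(members)}d", *(row[m] for m in members))
        columns[group_name] = zlib.compress(raw)
    return columns


# Hypothetical grouping information and one hypothetical row of index values.
grouping_info = {"kpi_a": "idx_array_1", "kpi_b": "idx_array_1",
                 "kpi_c": "idx_array_2", "kpi_d": "idx_array_2"}
row = {"kpi_a": 1.0, "kpi_b": 2.0, "kpi_c": 3.0, "kpi_d": 4.0}

groups = group_indexes(list(row), grouping_info)
columns = compress_groups(row, groups)
print(groups)           # which original indexes each compressed column holds
print(sorted(columns))  # names of the compressed columns to be stored
```

Each compressed column here holds two original index columns, matching the requirement that a piece of compressed data contains at least two columns of target index data.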
In another embodiment of the present invention, after the compressing and storing the plurality of pieces of target index data according to a preset storage method, the method includes:
determining compressed field names of the plurality of compressed data;
and determining the metadata structure information of the plurality of compressed field names according to the compressed field names and the set arrangement sequence, and storing the metadata structure information into a metadata table.
In this embodiment, a metadata structure needs to be constructed for the compressed target data. The metadata structure information of the compressed data is determined according to the compressed field names and the set arrangement order. The arrangement order may follow the order in the performance index data table, or a new arrangement order may be set, which determines new position information; in that case the new position information must still correspond to the target data in the performance index data table.
It should be noted that each compressed field name corresponds to a plurality of pieces of target data. The compressed field names are stored in the metadata table, and the corresponding target data is stored in the performance index data table. For example, the compressed field name index array 1 corresponds to target data 1, target data 2, target data 3, and target data 4. The specific correspondence may be set according to the actual needs of the user and is not specifically limited here.
It should be noted that, because part of the index field data is compressed and stored, additional metadata must be managed so that the position of the target data in the index value array can be found accurately when these compressed fields are read and written. That is, the metadata records which original index fields each compressed index field contains and the arrangement order of those index fields. The table structure definition of the metadata table is shown in Table 6 below.
TABLE 6 (table structure definition of the metadata table; provided as an image in the original publication)
In the performance index data retrieval method provided by the invention, the compressed field names of the plurality of compressed data are determined, and the metadata structure information of the plurality of compressed field names is determined according to the compressed field names and the set arrangement order. The constructed metadata table makes it possible to accurately determine the position of the compressed target data, which improves the accuracy of data retrieval.
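By way of illustration only, the metadata that maps each original index field to its compressed field name and position can be sketched in Python as follows. The actual column layout of the metadata table is given in Table 6, which is not reproduced in this text, so the three-part record used here (original field, compressed field name, zero-based position) and all field names are assumptions of this sketch.

```python
def build_metadata(groups):
    """Map each original index field to (compressed field name, position).

    `groups` maps a compressed field name to the ordered list of original
    index fields it contains; the list order is the set arrangement order.
    """
    metadata = {}
    for compressed_field, members in groups.items():
        for position, original_field in enumerate(members):
            metadata[original_field] = (compressed_field, position)
    return metadata


# Hypothetical arrangement: index array 1 holds four original index fields.
groups = {"idx_array_1": ["kpi_a", "kpi_b", "kpi_c", "kpi_d"]}
metadata = build_metadata(groups)

compressed_field, position = metadata["kpi_c"]
print(compressed_field, position)  # idx_array_1 2
```

A read or write of kpi_c then only needs the compressed column idx_array_1 and the array position 2; no other column has to be touched.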
In another embodiment of the present invention, after determining the metadata structure information of the plurality of compressed field names, the method further includes:
under the condition that new target index data are added at the tail ends of a plurality of target index data corresponding to a first compressed field name, adding one to the metadata structure information of the first compressed field name;
under the condition that target index data corresponding to the first compressed field name is deleted, determining the target index data as junk information;
wherein the first compressed field name is any one of the plurality of compressed field names.
In this embodiment, target index data may be added to or deleted from the compressed target index data described by the metadata structure information. When new target index data is added at the end of the plurality of target index data under the first compressed field name, one is added to the metadata structure information of the first compressed field name. When target index data under the first compressed field name is deleted, that target index data is marked as junk information, and the rest of the performance index data remains unchanged.
It should be noted that adjusting the metadata in the metadata table does not affect the history data already in the table. When index data is added, the array of newly written compressed data is longer than the array of the history data. When an index is deleted, the value of that target index data in the history data is not cleared; it becomes junk data that can no longer be accessed, and the value at the corresponding array position in newly written compressed data is null.
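By way of illustration only, the effect of adding and deleting indexes on the metadata and on history data can be sketched in Python as follows. The class CompressedFieldMeta and its methods are hypothetical. The sketch interprets "adding one to the metadata structure information" as incrementing the recorded array length, which is an assumption, and it works on already decompressed arrays to keep the example short.

```python
class CompressedFieldMeta:
    """Metadata for one compressed field: the ordered slots of its array."""

    def __init__(self, name, members):
        self.name = name
        self.slots = list(members)  # position -> index name, or None if deleted
        self.length = len(self.slots)

    def add_index(self, index_name):
        """Append a new index at the end; existing positions are unchanged."""
        self.slots.append(index_name)
        self.length += 1
        return self.length - 1

    def delete_index(self, index_name):
        """Mark the slot as junk; stored arrays are not rewritten."""
        position = self.slots.index(index_name)
        self.slots[position] = None

    def read(self, values, index_name):
        """Read an index from a decompressed array, old or new."""
        if index_name not in self.slots:
            return None  # deleted: the stored value is junk and not accessible
        position = self.slots.index(index_name)
        if position >= len(values):
            return None  # history row written before this index was added
        return values[position]


meta = CompressedFieldMeta("idx_array_1", ["kpi_a", "kpi_b"])
history_row = [1.0, 2.0]              # written before any metadata change

meta.add_index("kpi_c")               # new rows now carry three values
new_row = [4.0, 5.0, 6.0]
print(meta.read(history_row, "kpi_c"), meta.read(new_row, "kpi_c"))

meta.delete_index("kpi_a")            # history_row still physically holds 1.0
print(meta.read(history_row, "kpi_a"), meta.read(history_row, "kpi_b"))
```

Because positions are never reused or shifted, history rows remain readable without being rewritten after either kind of change.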
In another embodiment of the present invention, as shown in Fig. 2, the Java driver receives the SQL statement input by the user and parses it to obtain the target field to be retrieved. According to the parsing result, the compressed field name and position information corresponding to the target field are determined from the metadata table. The target data corresponding to the target field is then determined from the performance index data table according to the compressed field name and the position information, decompressed, extracted according to the calling function, and returned to the target user.
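By way of illustration only, the retrieval flow of this embodiment can be sketched end to end in Python, with in-memory dictionaries standing in for the metadata table and the performance index data table. The table contents, the field names, the helper names parse_target_field, extract_value and retrieve, and the binary layout (little-endian 8-byte floats compressed with zlib) are assumptions of this sketch; the sketch handles only a single-field SELECT and does not represent the actual Java driver.

```python
import re
import struct
import zlib

# Stand-in for the metadata table: original field -> (compressed field, position).
METADATA = {"kpi_b": ("idx_array_1", 1)}

# Stand-in for the performance index data table (one row shown).
PERF_TABLE = [
    {"ne_id": 1001, "idx_array_1": zlib.compress(struct.pack("<3d", 1.5, 2.5, 3.5))},
]


def parse_target_field(sql):
    """Parse the SQL statement and return the single field being selected."""
    match = re.match(r"\s*SELECT\s+(\w+)\s+FROM\s+\w+", sql, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("unsupported SQL statement")
    return match.group(1)


def extract_value(blob, position):
    """Decompress the target data and extract the value at `position`."""
    raw = zlib.decompress(blob)
    return struct.unpack_from("<d", raw, position * 8)[0]


def retrieve(sql):
    """Resolve the field through the metadata, then decompress and extract."""
    field = parse_target_field(sql)
    compressed_field, position = METADATA[field]
    return [extract_value(row[compressed_field], position) for row in PERF_TABLE]


print(retrieve("SELECT kpi_b FROM perf_table"))  # [2.5]
```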
It should be noted that this embodiment uses the VoltDB database, a distributed relational in-memory database, as the underlying data storage. A performance index data table and a metadata table are set up, the data storage mode is designed so that massive data can be stored in a single data table, and an upper query layer is encapsulated to provide transparent SQL access. The binary data type provided by the VoltDB database is used to store the data of multiple fields in compressed form.
It should be noted that, to operate on the target index data more conveniently and to hide the underlying compressed storage, an SQL execution engine layer is added in front of the VoltDB database. It parses the SQL statement to be executed, translates it into an SQL statement that the underlying VoltDB database can recognize correctly, and submits it to the VoltDB database for execution.
It should be noted that the SQL execution engine mainly consists of VoltDB user-defined functions (UDFs) and a customized Java driver. A UDF in the VoltDB database can perform custom processing on a single cell, so UDFs can be used to compress, decompress, extract, and update target index data inside the VoltDB database. This avoids transmitting the entire compressed data back to the client for processing and reduces network I/O pressure and client CPU and memory pressure. The Java driver translates the standard SQL statement input by the target user into the SQL statement that operates on the compressed data, so that the compression is transparent to the user and the development complexity of handling the target index data in the performance index data table through the metadata table and the user-defined functions is reduced.
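By way of illustration only, the translation of a standard SQL statement into a statement that calls a function on the compressed column can be sketched in Python as follows. The function name extract_float_value, the metadata contents, and the table and field names are hypothetical; the sketch uses a simple regular expression that only handles a plain comma-separated SELECT list, whereas a real driver would use a proper SQL parser.

```python
import re

# Hypothetical metadata: original field -> (compressed field name, position).
METADATA = {"kpi_c": ("idx_array_1", 2)}


def rewrite_select(sql, metadata, extract_fn="extract_float_value"):
    """Replace compressed fields in the SELECT list with function calls."""
    match = re.match(r"\s*SELECT\s+(.*?)\s+FROM\s+(.*)", sql,
                     flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return sql  # not a simple SELECT statement; leave it unchanged
    rewritten = []
    for column in (c.strip() for c in match.group(1).split(",")):
        if column in metadata:
            compressed_field, position = metadata[column]
            # Alias the call so the result column keeps the name the user asked for.
            rewritten.append(f"{extract_fn}({compressed_field}, {position}) AS {column}")
        else:
            rewritten.append(column)
    return f"SELECT {', '.join(rewritten)} FROM {match.group(2)}"


user_sql = "SELECT ne_id, kpi_c FROM perf_table WHERE ne_id = 1001"
print(rewrite_select(user_sql, METADATA))
```

The WHERE clause is passed through untouched, which is consistent with compressing only index data that is not used in predicate filtering.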
Fig. 3 shows a performance index data retrieval device provided by the present invention. As shown in Fig. 3, the device includes:
the determining module is used for determining the SQL statement input by the target user;
the analysis module is used for analyzing the SQL statement according to a preset structured query language execution engine to acquire a target field to be retrieved;
a determining and changing module, which determines the compressed field name and the position information corresponding to the target field to be retrieved through the structured query language execution engine in a preset metadata table, and changes the target field to be retrieved into a calling function;
and the extraction module is used for extracting decompressed target data and returning the decompressed target data to the target user according to the calling function, the compressed field name and the position information.
The performance index data retrieval device provided by the invention determines the SQL statement input by the target user; parses the SQL statement with a preset structured query language execution engine to obtain the target field to be retrieved; determines, through the structured query language execution engine, the compressed field name and position information corresponding to the target field to be retrieved in a preset metadata table, and changes the target field to be retrieved into a calling function; and extracts the decompressed target data according to the calling function, the compressed field name, and the position information and returns it to the target user. Because multi-column compression is applied to the performance index data that is not used in predicate filtering, data retrieval can be completed within a single data table, which improves data retrieval efficiency and the user experience.
Since the principle of the device in this embodiment is the same as that of the method in the above embodiments, the details are not repeated here.
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present invention, and as shown in fig. 4, the present invention provides an electronic device, including: a processor (processor)401, a memory (memory)402, and a bus 403;
the processor 401 and the memory 402 complete communication with each other through the bus 403;
the processor 401 is configured to call the program instructions in the memory 402 to execute the methods provided in the above method embodiments, including, for example: determining an SQL statement input by a target user; parsing the SQL statement according to a preset structured query language execution engine to obtain a target field to be retrieved; determining, through the structured query language execution engine, a compressed field name and position information corresponding to the target field to be retrieved in a preset metadata table, and changing the target field to be retrieved into a calling function; and extracting decompressed target data according to the calling function, the compressed field name, and the position information and returning the decompressed target data to the target user.
In addition, the logic instructions in the memory 402 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above method embodiments, the method comprising: determining an SQL statement input by a target user; parsing the SQL statement according to a preset structured query language execution engine to obtain a target field to be retrieved; determining, through the structured query language execution engine, a compressed field name and position information corresponding to the target field to be retrieved in a preset metadata table, and changing the target field to be retrieved into a calling function; and extracting decompressed target data according to the calling function, the compressed field name, and the position information and returning the decompressed target data to the target user.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program that, when executed by a processor, performs the method provided by the above method embodiments, the method comprising: determining an SQL statement input by a target user; parsing the SQL statement according to a preset structured query language execution engine to obtain a target field to be retrieved; determining, through the structured query language execution engine, a compressed field name and position information corresponding to the target field to be retrieved in a preset metadata table, and changing the target field to be retrieved into a calling function; and extracting decompressed target data according to the calling function, the compressed field name, and the position information and returning the decompressed target data to the target user.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A performance index data retrieval method is characterized by comprising the following steps:
determining an SQL statement input by a target user;
analyzing the SQL statement according to a preset structured query language execution engine to acquire a target field to be retrieved;
determining a compressed field name and position information corresponding to the target field to be retrieved through the structured query language execution engine in a preset metadata table, and changing the target field to be retrieved into a calling function;
and extracting decompressed target data and returning the target data to the target user according to the calling function, the compressed field name and the position information.
2. The performance index data retrieval method of claim 1, wherein the structured query language execution engine is comprised of user-defined functions and Java drivers,
correspondingly, the extracting decompressed target data according to the calling function, the compressed field name and the position information and returning the decompressed target data to the target user includes:
the Java driver program determines target data corresponding to a target field to be retrieved according to the compressed field name and the position information;
decompressing the target data according to the calling function, extracting the decompressed target data from a performance index data table and returning the decompressed target data to the target user;
wherein the calling function is determined by the user-defined function.
3. The performance index data retrieval method of claim 2, wherein the structured query language execution engine is comprised of user-defined functions and Java drivers,
correspondingly, the decompressing the target data, extracting the decompressed target data from the performance index data table according to the call function, and returning the extracted target data to the target user includes:
updating the target data according to the calling function, decompressing the updated target data, and returning the decompressed target data to the Java driver;
and the Java driver returns the decompressed target data to the target user.
4. The method of claim 1, wherein prior to said determining the target user entered SQL statement, comprising:
determining a data table to be processed;
and determining a plurality of pieces of target index data according to the frequency of data query in the to-be-processed data table, and compressing and storing the plurality of pieces of target index data according to a preset storage mode.
5. The performance index data retrieval method according to claim 4, wherein the compressing and storing the plurality of pieces of target index data in a preset storage manner includes:
grouping the target index data according to set grouping information to obtain a plurality of groups;
determining a plurality of index values for the groups, and sequentially compressing target index data in the groups according to the determined index values to obtain a plurality of compressed data;
storing the plurality of compressed data into the performance index data table as a plurality of columns.
6. The performance index data retrieval method according to claim 4, wherein after the compressing and storing the plurality of pieces of target index data in a preset storage manner, the method comprises:
determining compressed field names of the plurality of compressed data;
and determining the metadata structure information of the plurality of compressed field names according to the compressed field names and the set arrangement sequence, and storing the metadata structure information into a metadata table.
7. The performance index data retrieval method of claim 6, wherein the determining metadata structure information for the plurality of compressed field names comprises:
under the condition that new target index data are added at the tail ends of a plurality of target index data corresponding to a first compressed field name, adding one to the metadata structure information of the first compressed field name;
under the condition that target index data corresponding to the first compressed field name is deleted, determining the target index data as junk information;
wherein the first compressed field name is any one of the plurality of compressed field names.
8. A performance index data retrieval device, comprising:
the determining module is used for determining the SQL statement input by the target user;
the analysis module is used for analyzing the SQL statement according to a preset structured query language execution engine to acquire a target field to be retrieved;
a determining and changing module, which determines the compressed field name and the position information corresponding to the target field to be retrieved through the structured query language execution engine in a preset metadata table, and changes the target field to be retrieved into a calling function;
and the extraction module is used for extracting decompressed target data and returning the decompressed target data to the target user according to the calling function, the compressed field name and the position information.
9. An electronic device, comprising: a processor, a memory, and a bus, wherein,
the processor and the memory are communicated with each other through the bus;
the memory stores program instructions executable by the processor, the processor invoking the program instructions to perform the steps of the performance indicator data retrieval method of any of claims 1 to 7.
10. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the steps of the performance indicator data retrieval method according to any one of claims 1 to 7.
CN202111681305.XA 2021-12-28 2021-12-28 Performance index data retrieval method and device, electronic equipment and storage medium Pending CN114372077A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111681305.XA CN114372077A (en) 2021-12-28 2021-12-28 Performance index data retrieval method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114372077A true CN114372077A (en) 2022-04-19

Family

ID=81142284

Country Status (1)

Country Link
CN (1) CN114372077A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination