CN116361346B - Data table analysis method, device and equipment based on mask calculation and storage medium - Google Patents

Info

Publication number
CN116361346B
Authority
CN
China
Prior art keywords
data
mask
operator
calculation
data table
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310647218.5A
Other languages
Chinese (zh)
Other versions
CN116361346A (en
Inventor
于帆
李明
赵鑫鑫
姜凯
王雄儒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Inspur Scientific Research Institute Co Ltd
Original Assignee
Shandong Inspur Scientific Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Inspur Scientific Research Institute Co Ltd
Priority to CN202310647218.5A
Publication of CN116361346A
Application granted
Publication of CN116361346B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a data table analysis method, a device, equipment and a storage medium based on mask calculation, which relate to the technical field of computers and comprise the following steps: receiving a data query instruction sent by a client, and analyzing the data query instruction to obtain a target calculation operator; determining related data in a data table participating in operator operation, and performing mask calculation based on the related data to obtain an effective data mask; the operator operation is a data operation based on the target calculation operator; performing data operation based on the effective data mask and the original data in the data table to obtain effective column data; and performing acceleration operation on the effective column data based on the target calculation operator, and outputting target data according to an operation result so as to return the target data to the client. Therefore, the independent analysis process of the data table can be omitted, and the analysis of the data table is performed through mask calculation, so that the analysis efficiency of the data table is effectively improved.

Description

Data table analysis method, device and equipment based on mask calculation and storage medium
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data table analysis method, apparatus, device, and storage medium based on mask calculation.
Background
A database is an integrated system for storing data records. It organizes data using a variety of models and can store it in different forms such as rows, columns and tables. Databases support comprehensive storage, search and analysis of data and are widely used in commerce, industry, smart homes, medical care and other fields. With the arrival of the artificial intelligence era, big data and informatization continue to advance, the installed capacity of databases is growing exponentially, and the demand for faster database queries keeps increasing.
In the prior art, a database mostly handles user data queries on the CPU of the server. However, when the CPU processes computationally intensive tasks, it places a significant burden on processes and memory, which affects task scheduling and the processing speed of other processes on the server. Using FPGAs and GPUs as coprocessors to share the computationally intensive tasks of the database has therefore become an emerging technical direction. Although heterogeneous acceleration can speed up computationally intensive tasks, its first step, the parsing of the database data table, affects the data transmission rate and thus becomes a factor limiting the acceleration efficiency of heterogeneous computing.
Disclosure of Invention
Accordingly, the present invention is directed to a method, apparatus, device, and storage medium for analyzing a data table based on mask calculation, which can analyze a data query instruction into a calculation operator, and perform mask calculation by using related data in a data table participating in the operator operation to obtain an effective data mask, and then perform an acceleration operation on the data table data based on the effective data mask and the calculation operator to output target data, so that an independent analysis process of the data table can be omitted, and the analysis of the data table is performed by using the mask calculation, thereby effectively improving the efficiency of the analysis of the data table. The specific scheme is as follows:
in a first aspect, the present application discloses a data table parsing method based on mask calculation, including:
receiving a data query instruction sent by a client, and analyzing the data query instruction to obtain a target calculation operator;
determining related data in a data table participating in operator operation, and performing mask calculation based on the related data to obtain an effective data mask; the operator operation is a data operation based on the target calculation operator;
performing data operation based on the effective data mask and the original data in the data table to obtain effective column data;
and performing acceleration operation on the effective column data based on the target calculation operator, and outputting target data according to an operation result so as to return the target data to the client.
Optionally, the receiving the data query instruction sent by the client and analyzing the data query instruction to obtain the target computing operator includes:
receiving a data query instruction sent by a client based on a terminal application layer, and processing the data query instruction in an instruction analysis layer to obtain a bottom layer code corresponding to the data query instruction;
analyzing the bottom layer code to obtain a plurality of calculation operators;
and carrying out operator optimization on the plurality of calculation operators based on a preset optimization algorithm to obtain a target calculation operator.
Optionally, the determining related data in the data table participating in the operator operation, and performing mask calculation based on the related data to obtain an effective data mask, includes:
determining a data table participating in operator operation in a task management layer, and determining the row length, the start byte and the duration byte of data participating in operator operation in the data table;
generating a maximum data mask with the same length based on the row length at a query acceleration layer, and performing right shift operation on the maximum data mask based on the start byte to obtain a right shift mask;
performing left shift operation on the maximum data mask based on the start byte and the duration byte at a query acceleration layer to obtain a left shift mask;
and performing an exclusive nor operation on the right shift mask and the left shift mask to obtain an effective data mask.
Optionally, before performing the data operation based on the valid data mask and the original data in the data table to obtain valid column data, the method further includes:
and carrying out data encryption operation and data compression operation on the related data so as to carry out data transmission between the processor and the accelerator card based on the obtained compressed data.
Optionally, the performing data operation based on the valid data mask and the original data in the data table to obtain valid column data includes:
and performing AND operation by using the effective data mask and the original data in the data table to extract effective column data participating in the operator operation in the original data.
Optionally, after performing data operation based on the valid data mask and the original data in the data table to obtain valid column data, the method further includes:
and caching the original data to realize synchronous operation of the valid column data and the original data.
Optionally, the accelerating operation is performed on the valid column data based on the target computing operator, and target data is output according to an operation result, so as to return the target data to the client, including:
determining an operator algorithm corresponding to the target calculation operator at a query acceleration layer, and performing acceleration operation on the effective column data through the operator algorithm to obtain a corresponding operation result;
and outputting target data based on the generation sequence of the operation result, and returning the target data to the client.
In a second aspect, the present application discloses a data table parsing apparatus based on mask calculation, including:
the instruction analysis module is used for receiving a data query instruction sent by the client and analyzing the data query instruction to obtain a target calculation operator;
the mask calculation module is used for determining related data in a data table participating in operator operation and carrying out mask calculation based on the related data so as to obtain an effective data mask; the operator operation is a data operation based on the target calculation operator;
the data operation module is used for carrying out data operation based on the effective data mask and the original data in the data table so as to obtain effective column data;
the data acceleration module is used for carrying out acceleration operation on the effective column data based on the target calculation operator, outputting target data according to an operation result and returning the target data to the client.
In a third aspect, the present application discloses an electronic device comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the data table parsing method based on mask calculation as described above.
In a fourth aspect, the present application discloses a computer readable storage medium storing a computer program which, when executed by a processor, implements a data table parsing method based on mask computation as described above.
In the application, firstly, a data query instruction sent by a client is received, the data query instruction is analyzed to obtain a target calculation operator, then related data in a data table participating in operator operation is determined, and mask calculation is performed based on the related data to obtain an effective data mask; the operator operation is a data operation based on the target calculation operator; and performing data operation based on the effective data mask and the original data in the data table to obtain effective column data, performing acceleration operation on the effective column data based on the target calculation operator, and outputting target data according to an operation result to return the target data to the client. Therefore, according to the data table analysis method based on the mask calculation, after the data query instruction is received, the data query instruction can be analyzed into a calculation operator, related data in a data table participating in operator calculation is determined, so that the related data is subjected to mask calculation to obtain an effective mask, original data in the data table is processed based on the effective mask, effective column data is obtained, and finally the calculation operator is utilized to accelerate the effective column data, and an operation result is returned to the client. Therefore, the independent analysis process of the data table can be omitted, and the analysis of the data table is performed through mask calculation, so that the analysis efficiency of the data table is effectively improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for analyzing a data table based on mask calculation provided in the present application;
FIG. 2 is a schematic diagram of a data table parsing method based on mask calculation provided in the present application;
FIG. 3 is a flowchart of a specific method for analyzing a data table based on mask calculation provided in the present application;
FIG. 4 is a representation of two-dimensional data provided herein;
FIG. 5 is a schematic diagram of a transmission data structure provided in the present application;
FIG. 6 is a schematic diagram of a mask calculation process provided in the present application;
FIG. 7 is a flowchart of another specific method for analyzing a data table based on mask calculation provided in the present application;
FIG. 8 is an interaction timing diagram of a data table analysis method based on mask calculation provided in the present application;
FIG. 9 is a schematic structural diagram of a data table parsing device based on mask calculation provided in the present application;
FIG. 10 is a block diagram of an electronic device provided in the present application.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the prior art, a database mostly handles user data queries on the CPU of the server. However, when the CPU processes computationally intensive tasks, it places a significant burden on processes and memory, which affects task scheduling and the processing speed of other processes on the server. Using FPGAs and GPUs as coprocessors to share the computationally intensive tasks of the database has therefore become an emerging technical direction. Although heterogeneous acceleration can speed up computationally intensive tasks, its first step, the parsing of the database data table, affects the data transmission rate and thus becomes a factor limiting the acceleration efficiency of heterogeneous computing.
In order to overcome the technical problems, the application provides a data table analysis method, a device, equipment and a storage medium based on mask calculation, which can analyze a data query instruction into a calculation operator, perform mask calculation through related data in a data table participating in the operator operation to obtain an effective data mask, and then perform acceleration operation on data table data based on the effective data mask and the calculation operator to output target data.
Referring to fig. 1, the embodiment of the invention discloses a data table analysis method based on mask calculation, which comprises the following steps:
Step S11, receiving a data query instruction sent by a client, and analyzing the data query instruction to obtain a target calculation operator.
In this embodiment, a data query instruction sent by a client is received and parsed to obtain a target calculation operator. FIG. 2 shows the overall architecture of the data table parsing method based on mask calculation in the present application. The terminal application layer provides the user with a graphical or terminal application interface, through which the user can set instructions for database behavior. The user can thus generate a data query instruction as required through the client interface provided by the terminal application layer. The terminal application layer then sends the data query instruction to the instruction parsing layer, which processes it in three steps, namely instruction parsing, instruction decomposition and operator optimization, to obtain the target calculation operator corresponding to the data query instruction.
Step S12, determining related data in a data table participating in operator operation, and carrying out mask calculation based on the related data to obtain an effective data mask; the operator operation is a data operation based on the target calculation operator.
In this embodiment, related data in the data table participating in the operator operation is determined, and mask calculation is performed based on the related data to obtain an effective data mask. That is, after the target calculation operator has been determined, the task management layer determines the data table participating in the operator operation and, within that table, the row length, the start byte and the duration byte of the data participating in the operator operation, which provide the data basis for mask calculation. Once these three values are known, mask calculation can be performed at the query acceleration layer to obtain the effective data mask.
The task management layer comprises five parts: task scheduling, driver management, data encryption, data compression and data transmission. Task scheduling performs overall scheduling management of the processes produced by instruction decomposition, such as operator operation, data encryption, compression and transmission, and driver management, ensuring that operator operations and data processing are executed in an orderly manner. Driver management schedules the drivers of components essential to the accelerator card, such as the PCIE (Peripheral Component Interconnect Express, a high-speed serial computer expansion bus standard) bus and XRT (Xilinx Runtime Library), ensuring normal communication between, and management of, the database host and the accelerator card. The accelerator card is a physics processor (Physics Processing Unit, PPU), a processor specially designed to accelerate the execution of physical simulation algorithms. The simulation algorithms that can be accelerated include rigid body dynamics, collision detection, fluid simulation, and soft-body and object fracture simulation. With such hardware, the processor of the computer system can be freed from the physical simulation and artificial intelligence algorithms it is not good at, leaving the CPU, GPU and PPU each responsible for the part of the game it handles best.
It should further be noted that, before the AND operation is performed on the valid data mask and the original data in the data table to obtain valid column data, the method further includes: performing a data encryption operation and a data compression operation on the related data, so that data is transmitted between the processor and the accelerator card based on the resulting compressed data. As described above, the task management layer comprises task scheduling, driver management, data encryption, data compression and data transmission. Data encryption, data compression and data transmission process and transfer the data participating in the operator operation. Data transmission ensures the transfer of data between the CPU and the accelerator card, while data encryption and data compression are optional functions: the user can have the transmitted data compressed and encrypted as required, to ensure data security and further increase transmission speed.
Step S13, carrying out data operation based on the effective data mask and the original data in the data table to obtain effective column data.
In this embodiment, data operation is performed based on the valid data mask and the original data in the data table to obtain valid column data. That is, after the valid data mask is obtained, the query acceleration layer parses the original data in the data table according to the valid data mask to extract the valid column data, so that the query acceleration module can then accelerate the operation on the valid column data using the target calculation operator obtained by decomposing the instruction.
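As a minimal sketch of this extraction step, the Python fragment below ANDs one row with a valid data mask. The function name `extract_valid_bytes`, the sample row, the hard-coded mask and the choice to treat the first byte of the row as the least significant byte of the mask word are all assumptions made for illustration; the embodiment does not fix these details.

```python
def extract_valid_bytes(row: bytes, valid_mask: int) -> bytes:
    """AND a row with a valid data mask, zeroing every byte outside the mask.

    Assumption: byte 1 of the row is the least significant byte of the word,
    so the row is converted with little-endian byte order.
    """
    word = int.from_bytes(row, "little")
    return (word & valid_mask).to_bytes(len(row), "little")


# Hypothetical 8-byte row in which bytes 3 and 4 hold the column of interest.
row = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88])
valid_mask = 0x00000000FFFF0000  # ones over bytes 3 and 4 only

valid = extract_valid_bytes(row, valid_mask)
print(valid.hex())  # 0000334400000000
```

Because the result keeps the row width, the valid bytes stay at their original positions, which is one plausible way to keep valid data and original data aligned for later stages.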
It should be noted that, after performing the data operation based on the valid data mask and the original data in the data table to obtain valid column data, the method further includes: caching the original data to realize synchronous operation of the valid column data and the original data. That is, after the valid data has been extracted from the original row data of the data table, the original row data may be cached. It should further be noted that the data participating in the operator operation may be row data or column data in the data table. When it is column data, the original data must first be processed with the obtained valid data mask to produce the valid data, so caching the original row data ensures that the valid data and the original row data can be operated on synchronously.
Step S14, performing acceleration operation on the effective column data based on the target calculation operator, and outputting target data according to an operation result so as to return the target data to the client.
In this embodiment, the acceleration operation is performed on the valid column data based on the target calculation operator, and the target data is output according to the operation result and returned to the client. That is, after the valid column data has been extracted through the valid data mask, the query acceleration module at the query acceleration layer performs the operation corresponding to the target calculation operator on the valid column data to obtain the final target data, which is then returned to the client.
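The selection of an operator algorithm and the output of results in generation order might be sketched as follows. The operator names (`eq`, `gt`, `lt`), the comparison-style operators and the function `run_operator` are hypothetical; the text does not enumerate the operators the query acceleration layer supports, and a real accelerator card would run these in hardware rather than in Python.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

# Hypothetical operator algorithms, keyed by operator name.
OPERATOR_ALGORITHMS: Dict[str, Callable[[int, int], bool]] = {
    "eq": lambda value, operand: value == operand,
    "gt": lambda value, operand: value > operand,
    "lt": lambda value, operand: value < operand,
}


def run_operator(
    operator: str, operand: int, column_values: Iterable[int]
) -> Iterator[Tuple[int, int]]:
    """Apply the algorithm for `operator` to each valid column value.

    Matching rows are yielded as (row_index, value) in the order in which
    the results are produced.
    """
    algorithm = OPERATOR_ALGORITHMS[operator]
    for row_index, value in enumerate(column_values):
        if algorithm(value, operand):
            yield row_index, value


# Valid column values as they might arrive after mask extraction.
target_data = list(run_operator("gt", 10, [5, 12, 10, 40]))
```

Using a generator means each result can be forwarded as soon as it is produced, which mirrors the idea of outputting target data in generation order.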
It can be seen that in this embodiment, a data query instruction sent by a client is received first, the data query instruction is parsed to obtain a target calculation operator, then relevant data in a data table participating in the operator operation is determined, and mask calculation is performed based on the relevant data to obtain an effective data mask; the operator operation is a data operation based on the target calculation operator; and performing data operation based on the effective data mask and the original data in the data table to obtain effective column data, performing acceleration operation on the effective column data based on the target calculation operator, and outputting target data according to an operation result to return the target data to the client. Therefore, according to the data table analysis method based on the mask calculation, after the data query instruction is received, the data query instruction can be analyzed into a calculation operator, related data in a data table participating in operator calculation is determined, so that the related data is subjected to mask calculation to obtain an effective mask, original data in the data table is processed based on the effective mask, effective column data is obtained, and finally the calculation operator is utilized to accelerate the effective column data, and an operation result is returned to the client. 
In this way, a separate parsing pass over the data table can be omitted, and the data table is parsed through mask calculation, which effectively improves parsing efficiency. Because data table parsing is the first step of heterogeneous computation, mask calculation allows the parsing process to be combined with the heterogeneous query acceleration process, so that parsing and query acceleration proceed synchronously. This improves the parsing efficiency of the data table and further improves the acceleration efficiency and speed of heterogeneous computation.
As can be seen from the foregoing embodiment, parsing the data table requires a calculation on the data in the data table to obtain the valid data mask, so that the valid data can be extracted through that mask. Referring to FIG. 3, an embodiment of the invention discloses a data table analysis method based on mask calculation, which comprises the following steps:
Step S21, receiving a data query instruction sent by the client, and analyzing the data query instruction to obtain a target calculation operator.
Step S22, determining a data table participating in operator operation in a task management layer, and determining the row length, the starting byte and the duration byte of data participating in operator operation in the data table.
In this embodiment, the data table participating in the operator operation is determined at the task management layer, together with the row length, start byte and duration byte of the data in that table participating in the operator operation. The data table to be parsed is a two-dimensional data table, as shown in FIG. 4, consisting of several rows and several columns. When the database host transfers the two-dimensional data table from host memory to the accelerator card, the table becomes a data sequence concatenated row by row, and the database host adds a control prefix at the front of each row. The structure of the control prefix is shown in FIG. 5. The control prefix is control data added by the host during data transmission. It describes the row (column) attributes of the data table and the attributes of the valid data participating in the operator, and consists of the row length (LENGTH), the start byte of the operator's valid data (START_BYTE) and the duration byte (CONT_BYTE). The row length describes the data length of the transmission sequence that follows, and the start byte and duration byte describe the start position and the length, respectively, of the valid data participating in the operator operation within the current transmission sequence. The row length, start byte and duration byte must be determined at the task management layer so that mask calculation can be performed from them.
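To make the row-by-row transmission sequence concrete, the sketch below packs each row behind a control prefix and then walks the resulting stream. The field widths are not stated in the text, so the layout used here (three 16-bit little-endian unsigned integers) is an assumption, as are the names `PREFIX`, `PrefixedRow`, `pack_rows` and `iter_rows`.

```python
import struct
from typing import Iterable, Iterator, NamedTuple

# Assumed prefix layout: LENGTH, START_BYTE, CONT_BYTE as 16-bit
# little-endian unsigned integers. The real field widths are not given.
PREFIX = struct.Struct("<HHH")


class PrefixedRow(NamedTuple):
    length: int      # LENGTH: number of data bytes that follow the prefix
    start_byte: int  # START_BYTE: position of the first valid byte (assumed 1-based)
    cont_byte: int   # CONT_BYTE: number of valid bytes
    data: bytes      # the row itself


def pack_rows(rows: Iterable[bytes], start_byte: int, cont_byte: int) -> bytes:
    """Host side: concatenate rows, each preceded by its control prefix."""
    return b"".join(
        PREFIX.pack(len(row), start_byte, cont_byte) + row for row in rows
    )


def iter_rows(stream: bytes) -> Iterator[PrefixedRow]:
    """Accelerator side: walk the row-by-row sequence using the prefixes."""
    offset = 0
    while offset < len(stream):
        length, start_byte, cont_byte = PREFIX.unpack_from(stream, offset)
        offset += PREFIX.size
        data = stream[offset:offset + length]
        if len(data) != length:
            raise ValueError("stream ends in the middle of a row")
        offset += length
        yield PrefixedRow(length, start_byte, cont_byte, data)


stream = pack_rows([b"abcdefgh", b"ijklmnop"], start_byte=3, cont_byte=2)
rows = list(iter_rows(stream))
```

Because every row carries its own LENGTH, the receiver can find row boundaries without a separate schema lookup, which is what lets the mask be computed directly from the prefix.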
Step S23, generating a maximum data mask with the same length based on the line length at the query acceleration layer, and performing right shift operation on the maximum data mask based on the start byte to obtain a right shift mask.
In this embodiment, a maximum data mask of the same length is generated at the query acceleration layer based on the row length, and a right shift operation is performed on the maximum data mask based on the start byte to obtain a right shift mask. As shown in FIG. 6, the mask calculation process first generates, from the determined row length, the maximum data mask INIT_MASK of the same length, that is, an all-ones sequence as long as the row: INIT_MASK = {(LENGTH × 8){1}}. The maximum data mask is then shifted right according to the start byte by (LENGTH - START_BYTE + 1) × 8 bits to obtain the right shift mask RIGHT_MASK, that is: RIGHT_MASK = INIT_MASK >> ((LENGTH - START_BYTE + 1) × 8).
Step S24, performing left shift operation on the maximum data mask based on the start byte and the duration byte at the query acceleration layer to obtain a left shift mask.
In this embodiment, the query acceleration layer performs a left shift operation on the maximum data mask based on the start byte and the duration byte to obtain a left shift mask. As shown in FIG. 6, after the right shift mask is obtained, the left shift mask is determined by shifting the maximum data mask left according to the start byte and the duration byte by (START_BYTE + CONT_BYTE - 1) × 8 bits, giving the left shift mask LEFT_MASK, that is: LEFT_MASK = INIT_MASK << ((START_BYTE + CONT_BYTE - 1) × 8).
Step S25, performing an exclusive NOR (XNOR) operation on the right shift mask and the left shift mask to obtain a valid data mask.
In this embodiment, an XNOR operation is performed on the right shift mask and the left shift mask to obtain the valid data mask. That is, once the right shift mask and the left shift mask have been obtained, the valid data mask VALID_MASK is determined as their bitwise XNOR, i.e.: VALID_MASK = LEFT_MASK ⊙ RIGHT_MASK, which is equivalent to VALID_MASK = ~(LEFT_MASK ^ RIGHT_MASK). The result is 1 exactly in the bit positions where both shift masks are 0, which are the CONT_BYTE bytes beginning at START_BYTE.
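Steps S23 to S25 can be sketched in a few lines of Python. This is a minimal illustration rather than the accelerator implementation: the function name valid_data_mask is hypothetical, bytes are assumed to be numbered from 1 at the least significant end of the mask, and the explicit truncation with INIT_MASK stands in for the fixed register width that hardware shifts would have.

```python
def valid_data_mask(length: int, start_byte: int, cont_byte: int) -> int:
    """Return VALID_MASK as an integer that is length * 8 bits wide."""
    if start_byte < 1 or cont_byte < 1 or start_byte + cont_byte - 1 > length:
        raise ValueError("the valid data must lie inside the row")

    # INIT_MASK: an all-ones sequence of LENGTH * 8 bits.
    init_mask = (1 << (length * 8)) - 1

    # RIGHT_MASK: ones remain only below the first valid byte.
    right_mask = init_mask >> ((length - start_byte + 1) * 8)

    # LEFT_MASK: ones remain only above the last valid byte.
    # Python integers are unbounded, so truncate to the mask width by hand.
    left_mask = (init_mask << ((start_byte + cont_byte - 1) * 8)) & init_mask

    # XNOR, restricted to the mask width.
    return ~(left_mask ^ right_mask) & init_mask
```

With LENGTH = 8, START_BYTE = 3 and CONT_BYTE = 2, the right shift mask is 0xFFFF, the left shift mask is 0xFFFFFFFF00000000, and the valid data mask is 0xFFFF0000, which covers bytes 3 and 4.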
Step S26, performing a data operation based on the valid data mask and the original data in the data table to obtain valid column data.
Step S27, performing an acceleration operation on the valid column data based on the target calculation operator, and outputting target data according to the operation result so as to return the target data to the client.
It should be noted that steps S21, S26 and S27 are described in more detail in the foregoing embodiments, and the description is not repeated here.
It can be seen that, in this embodiment, the valid data mask is determined as follows. First, the task management layer determines the data table participating in the operator operation, and the row length, start byte and continuation byte of the data in that table that participates in the operator operation. Next, the query acceleration layer generates a maximum data mask of the same length as the row from the row length, shifts it right based on the start byte to obtain the right shift mask, and shifts it left based on the start byte and the continuation byte to obtain the left shift mask. Finally, an XNOR operation on the right shift mask and the left shift mask yields the valid data mask. In this way, mask calculation speeds up the parsing of the data table and combines it with the heterogeneous query acceleration process, so that data table parsing and query acceleration proceed synchronously, which further improves the efficiency and speed of heterogeneous computing acceleration.
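Why the XNOR of the two shift masks selects exactly the valid bytes can be checked bit by bit. The short derivation below is an illustrative reading of the formulas, not text from the source. It writes L, S and C for LENGTH, START_BYTE and CONT_BYTE, indexes bits from 0 at the least significant end of a mask that is 8L bits wide, and assumes 1 ≤ S, 1 ≤ C and S + C - 1 ≤ L.

```latex
\begin{aligned}
\mathrm{RIGHT\_MASK}[i] = 1 &\iff 0 \le i < 8(S-1),\\
\mathrm{LEFT\_MASK}[i] = 1 &\iff 8(S+C-1) \le i < 8L,\\
\mathrm{VALID\_MASK}[i] = 1 &\iff \mathrm{LEFT\_MASK}[i] = \mathrm{RIGHT\_MASK}[i]\\
&\iff \mathrm{LEFT\_MASK}[i] = \mathrm{RIGHT\_MASK}[i] = 0\\
&\iff 8(S-1) \le i < 8(S+C-1).
\end{aligned}
```

The step from the third line to the fourth holds because the two ranges of ones never overlap (S - 1 < S + C - 1), so the masks can only agree where both are 0. The remaining range is C bytes wide and starts at byte S.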
The foregoing embodiments show that the data table can be parsed by mask calculation. This embodiment describes in detail how that parsing is carried out. As shown in Fig. 7, an embodiment of the present invention discloses a data table parsing method based on mask calculation, which includes:
Step S31, receiving, based on a terminal application layer, a data query instruction sent by a client, and processing the data query instruction at an instruction parsing layer to obtain low-level code corresponding to the data query instruction.
In this embodiment, a data query instruction sent by a client is received based on the terminal application layer, and is processed at the instruction parsing layer to obtain the low-level code corresponding to the data query instruction. That is, as shown in Fig. 8, which is a timing diagram of the data table parsing method based on mask calculation, the terminal application layer receives the data query instruction sent by the client and forwards it to the instruction parsing layer, which translates the instruction, through instruction parsing, into low-level code that the CPU can understand. The terminal application layer comprises the access interface that the database platform provides to users and the graphical interface of the database application. Through it, users can perform operations such as registration, login, access and downloading.
Step S32, parsing the low-level code to obtain a plurality of calculation operators.
In this embodiment, the low-level code is parsed to obtain a plurality of calculation operators. That is, after the instruction parsing layer has obtained the low-level code corresponding to the data query instruction, it decomposes that code, through instruction decomposition, into the calculation operators corresponding to the data query instruction, so that the operators together carry out the function of the instruction.
Step S33, performing operator optimization on the plurality of calculation operators based on a preset optimization algorithm to obtain a target calculation operator.
In this embodiment, operator optimization is performed on the plurality of calculation operators based on a preset optimization algorithm to obtain a target calculation operator. That is, the instruction parsing layer uses the preset optimization algorithm to optimize the operators decomposed from the instruction, so as to maximize the operating efficiency and the acceleration efficiency among the operators. It should be noted that the preset optimization algorithm may be set according to user requirements. In this embodiment it includes, but is not limited to, non-coherent parallelization and algorithm optimization.
Step S34, determining relevant data in a data table participating in an operator operation, and performing mask calculation based on the relevant data to obtain a valid data mask, wherein the operator operation is a data operation based on the target calculation operator.
Step S35, performing an AND operation on the valid data mask and the original row data in the data table to extract the valid column data in the original row data that participates in the operator operation.
In this embodiment, an AND operation is performed on the valid data mask and the original row data in the data table to extract the valid column data in the original row data that participates in the operator operation. That is, the data parsing module ANDs the valid data mask with the original row data in the data table to extract the valid column data that participates in the operator operation, which completes the parsing of the data table.
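The AND step can be illustrated as follows. This is a sketch under stated assumptions: the function names mask_row and extract_valid_column are hypothetical, the first byte of the row is assumed to sit in the lowest 8 bits of the mask (hence the little-endian conversion), and the band of ones is built directly for brevity instead of through the shift and XNOR steps, which produce the same bit pattern.

```python
def _band(start_byte: int, cont_byte: int) -> int:
    """Ones over cont_byte bytes beginning at the 1-based start_byte."""
    return ((1 << (cont_byte * 8)) - 1) << ((start_byte - 1) * 8)


def mask_row(row: bytes, start_byte: int, cont_byte: int) -> bytes:
    """AND the row with the valid data mask, keeping the row width."""
    row_bits = int.from_bytes(row, "little")
    masked = row_bits & _band(start_byte, cont_byte)
    return masked.to_bytes(len(row), "little")


def extract_valid_column(row: bytes, start_byte: int, cont_byte: int) -> bytes:
    """AND the row with the mask and return only the valid column bytes."""
    row_bits = int.from_bytes(row, "little")
    masked = row_bits & _band(start_byte, cont_byte)
    # Shift the surviving bytes down to position 0 and drop the zeroed rest.
    return (masked >> ((start_byte - 1) * 8)).to_bytes(cont_byte, "little")
```

For the row b"abcdefgh" with START_BYTE = 3 and CONT_BYTE = 2, mask_row zeroes every byte except "c" and "d", and extract_valid_column returns b"cd".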
Step S36, determining, at the query acceleration layer, an operator algorithm corresponding to the target calculation operator, and performing an acceleration operation on the valid column data through the operator algorithm to obtain a corresponding operation result.
Step S37, outputting target data in the order in which the operation results are generated, and returning the target data to the client.
It can be seen that, in this embodiment, after the data query instruction is received, it is parsed into calculation operators, and the relevant data in the data table participating in the operator operation is determined. Mask calculation is performed on the relevant data to obtain a valid data mask, and the original data in the data table is processed with that mask to obtain valid column data. Finally, the calculation operator is used to perform an acceleration operation on the valid column data, and the operation result is returned to the client. In this way, a separate data table parsing pass can be omitted and the data table is parsed through mask calculation, which effectively improves the efficiency of data table parsing.
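The flow from mask calculation to operator output can be sketched end to end. The code below is only an illustration under several assumptions: run_operator and valid_mask are hypothetical names, the operator is modelled as an ordinary Python callable standing in for the accelerated operator algorithm, every row uses the same start byte and continuation byte, and the first byte of each row maps to the lowest 8 bits of the mask.

```python
from typing import Callable, Iterable, List


def valid_mask(length: int, start_byte: int, cont_byte: int) -> int:
    """Shift and XNOR mask for a row of `length` bytes."""
    init_mask = (1 << (length * 8)) - 1
    right_mask = init_mask >> ((length - start_byte + 1) * 8)
    left_mask = (init_mask << ((start_byte + cont_byte - 1) * 8)) & init_mask
    return ~(left_mask ^ right_mask) & init_mask


def run_operator(rows: Iterable[bytes], start_byte: int, cont_byte: int,
                 operator: Callable[[int], int]) -> List[int]:
    """Parse each row with the mask and feed the column to the operator."""
    results = []
    for row in rows:
        mask = valid_mask(len(row), start_byte, cont_byte)
        masked = int.from_bytes(row, "little") & mask   # AND with the original row
        column = masked >> ((start_byte - 1) * 8)       # align the column to bit 0
        results.append(operator(column))                # kept in generation order
    return results
```

Calling run_operator with the rows bytes([1, 2, 3, 4]) and bytes([5, 6, 7, 8]), START_BYTE = 2, CONT_BYTE = 2 and an operator that adds 1 reads the little-endian column values 770 and 1798 and returns [771, 1799].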
Referring to Fig. 9, an embodiment of the present invention discloses a data table parsing apparatus based on mask calculation, including:
an instruction parsing module 11, configured to receive a data query instruction sent by a client, and parse the data query instruction to obtain a target calculation operator;
a mask calculation module 12, configured to determine relevant data in a data table participating in an operator operation, and perform mask calculation based on the relevant data to obtain a valid data mask, wherein the operator operation is a data operation based on the target calculation operator;
a data operation module 13, configured to perform a data operation based on the valid data mask and the original data in the data table to obtain valid column data;
a data acceleration module 14, configured to perform an acceleration operation on the valid column data based on the target calculation operator, and output target data according to the operation result so as to return the target data to the client.
It can be seen that, in the present application, a data query instruction sent by a client is first received and parsed to obtain a target calculation operator. The relevant data in the data table participating in the operator operation, i.e. the data operation based on the target calculation operator, is then determined, and mask calculation is performed on that data to obtain a valid data mask. A data operation is performed on the valid data mask and the original data in the data table to obtain valid column data. Finally, an acceleration operation is performed on the valid column data based on the target calculation operator, and target data is output according to the operation result and returned to the client.
In this way, a separate data table parsing pass can be omitted: the data table is parsed through mask calculation, which effectively improves parsing efficiency. Data table parsing becomes the first step of heterogeneous computing, and mask calculation combines it with the heterogeneous query acceleration process so that data table parsing and query acceleration proceed synchronously, which further improves the efficiency and speed of heterogeneous computing acceleration.
In some embodiments, the instruction parsing module 11 may specifically include:
an instruction processing unit, configured to receive, based on the terminal application layer, a data query instruction sent by the client, and process the data query instruction at the instruction parsing layer to obtain low-level code corresponding to the data query instruction;
a code parsing unit, configured to parse the low-level code to obtain a plurality of calculation operators;
an operator optimization unit, configured to perform operator optimization on the plurality of calculation operators based on a preset optimization algorithm to obtain a target calculation operator.
In some embodiments, the mask calculation module 12 may specifically include:
a data determining unit, configured to determine, at the task management layer, the data table participating in the operator operation, and determine the row length, the start byte and the continuation byte of the data participating in the operator operation in the data table;
a first data mask determining unit, configured to generate, at the query acceleration layer, a maximum data mask of the same length as the row based on the row length, and perform a right shift operation on the maximum data mask based on the start byte to obtain a right shift mask;
a second data mask determining unit, configured to perform, at the query acceleration layer, a left shift operation on the maximum data mask based on the start byte and the continuation byte to obtain a left shift mask;
a mask operation unit, configured to perform an XNOR operation on the right shift mask and the left shift mask to obtain a valid data mask.
In some embodiments, the data table parsing apparatus based on mask calculation may further include:
a data processing unit, configured to perform a data encryption operation and a data compression operation on the relevant data, so that data is transmitted between the processor and the accelerator card based on the resulting compressed data.
In some embodiments, the data operation module 13 may specifically include:
a data operation unit, configured to perform an AND operation on the valid data mask and the original data in the data table to extract the valid column data in the original data that participates in the operator operation.
In some embodiments, the data table parsing apparatus based on mask calculation may further include:
a data caching unit, configured to cache the original data so that the valid column data and the original data can be operated on synchronously.
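One possible reading of the caching step is sketched below. The source states only that the original data is cached so that it can be operated on in step with the valid column data, so the two FIFO queues, the filter-style use case, and the names select_rows, row_cache and column_queue are all assumptions made for illustration. As before, the first byte of a row is assumed to map to the lowest 8 bits of the mask.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List


def select_rows(rows: Iterable[bytes], start_byte: int, cont_byte: int,
                predicate: Callable[[int], bool]) -> List[bytes]:
    """Return the original rows whose valid column satisfies the predicate."""
    row_cache: Deque[bytes] = deque()    # hypothetical cache of original rows
    column_queue: Deque[int] = deque()   # valid column data, in the same order

    shift = (start_byte - 1) * 8
    band = ((1 << (cont_byte * 8)) - 1) << shift
    for row in rows:
        row_cache.append(row)
        column_queue.append((int.from_bytes(row, "little") & band) >> shift)

    selected = []
    while column_queue:
        # Both queues are popped together, so each column value stays
        # paired with the row it was extracted from.
        column = column_queue.popleft()
        original = row_cache.popleft()
        if predicate(column):
            selected.append(original)
    return selected
```

With this pairing, an operator that only inspects the valid column (here a predicate) can still emit the complete original row.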
In some embodiments, the data acceleration module 14 may specifically include:
a data acceleration unit, configured to determine, at the query acceleration layer, an operator algorithm corresponding to the target calculation operator, and perform an acceleration operation on the valid column data through the operator algorithm to obtain a corresponding operation result;
a data output unit, configured to output target data in the order in which the operation results are generated, and return the target data to the client.
Further, an embodiment of the present application also discloses an electronic device. Fig. 10 is a block diagram of an electronic device 20 according to an exemplary embodiment, and nothing in the figure should be taken as limiting the scope of use of the present application.
As shown in Fig. 10, the electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 is used for storing a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the data table parsing method based on mask calculation disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may specifically be an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20. The communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol it follows may be any communication protocol applicable to the technical solution of the present application, which is not specifically limited here. The input/output interface 25 is used for acquiring external input data or outputting data to the outside, and its specific interface type may be selected according to the specific application requirements, which is not limited here.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk. The resources stored on it may include an operating system 221, a computer program 222, and the like, and the storage may be temporary or permanent.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that performs the data table parsing method based on mask calculation disclosed in any of the foregoing embodiments and executed by the electronic device 20, the computer program 222 may further include computer programs for performing other specific tasks.
Further, the present application also discloses a computer readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data table parsing method based on mask calculation disclosed above. For the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
In this specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementations should not be considered to go beyond the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in Random Access Memory (RAM), internal memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another, without necessarily requiring or implying any actual such relationship or order between those entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
The present application has been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. Meanwhile, those skilled in the art may, in accordance with the idea of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (9)

1. A method for analyzing a data table based on mask calculation, comprising:
receiving a data query instruction sent by a client, and parsing the data query instruction to obtain a target calculation operator;
determining relevant data in a data table participating in an operator operation, and performing mask calculation based on the relevant data to obtain a valid data mask, wherein the operator operation is a data operation based on the target calculation operator;
performing a data operation based on the valid data mask and the original data in the data table to obtain valid column data;
performing an acceleration operation on the valid column data based on the target calculation operator, and outputting target data according to an operation result so as to return the target data to the client;
wherein the determining relevant data in a data table participating in an operator operation, and performing mask calculation based on the relevant data to obtain a valid data mask, comprises:
determining, at a task management layer, the data table participating in the operator operation, and determining the row length, the start byte and the continuation byte of the data participating in the operator operation in the data table;
generating, at a query acceleration layer, a maximum data mask of the same length as the row based on the row length, and performing a right shift operation on the maximum data mask based on the start byte to obtain a right shift mask;
performing, at the query acceleration layer, a left shift operation on the maximum data mask based on the start byte and the continuation byte to obtain a left shift mask;
and performing an XNOR operation on the right shift mask and the left shift mask to obtain the valid data mask.
2. The method for analyzing a data table based on mask calculation according to claim 1, wherein the receiving a data query instruction sent by a client, and parsing the data query instruction to obtain a target calculation operator, comprises:
receiving, based on a terminal application layer, the data query instruction sent by the client, and processing the data query instruction at an instruction parsing layer to obtain low-level code corresponding to the data query instruction;
parsing the low-level code to obtain a plurality of calculation operators;
and performing operator optimization on the plurality of calculation operators based on a preset optimization algorithm to obtain the target calculation operator.
3. The method for analyzing a data table based on mask calculation according to claim 1, wherein, before the performing a data operation based on the valid data mask and the original data in the data table to obtain valid column data, the method further comprises:
performing a data encryption operation and a data compression operation on the relevant data, so that data is transmitted between the processor and the accelerator card based on the resulting compressed data.
4. The method for analyzing a data table based on mask calculation according to claim 1, wherein the performing a data operation based on the valid data mask and the original data in the data table to obtain valid column data comprises:
performing an AND operation on the valid data mask and the original data in the data table to extract the valid column data in the original data that participates in the operator operation.
5. The method for analyzing a data table based on mask calculation according to claim 1, wherein, after the performing a data operation based on the valid data mask and the original data in the data table to obtain valid column data, the method further comprises:
caching the original data so that the valid column data and the original data can be operated on synchronously.
6. The method for analyzing a data table based on mask calculation according to any one of claims 1 to 5, wherein the performing an acceleration operation on the valid column data based on the target calculation operator, and outputting target data according to an operation result so as to return the target data to the client, comprises:
determining, at a query acceleration layer, an operator algorithm corresponding to the target calculation operator, and performing the acceleration operation on the valid column data through the operator algorithm to obtain a corresponding operation result;
and outputting the target data in the order in which the operation results are generated, and returning the target data to the client.
7. A data table parsing apparatus based on mask calculation, comprising:
an instruction parsing module, configured to receive a data query instruction sent by a client, and parse the data query instruction to obtain a target calculation operator;
a mask calculation module, configured to determine relevant data in a data table participating in an operator operation, and perform mask calculation based on the relevant data to obtain a valid data mask, wherein the operator operation is a data operation based on the target calculation operator;
a data operation module, configured to perform a data operation based on the valid data mask and the original data in the data table to obtain valid column data;
a data acceleration module, configured to perform an acceleration operation on the valid column data based on the target calculation operator, and output target data according to an operation result so as to return the target data to the client;
wherein the mask calculation module comprises:
a data determining unit, configured to determine, at a task management layer, the data table participating in the operator operation, and determine the row length, the start byte and the continuation byte of the data participating in the operator operation in the data table;
a first data mask determining unit, configured to generate, at a query acceleration layer, a maximum data mask of the same length as the row based on the row length, and perform a right shift operation on the maximum data mask based on the start byte to obtain a right shift mask;
a second data mask determining unit, configured to perform, at the query acceleration layer, a left shift operation on the maximum data mask based on the start byte and the continuation byte to obtain a left shift mask;
and a mask operation unit, configured to perform an XNOR operation on the right shift mask and the left shift mask to obtain the valid data mask.
8. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the mask calculation-based data table parsing method as claimed in any one of claims 1 to 6.
9. A computer readable storage medium storing a computer program which when executed by a processor implements a mask calculation based data table parsing method according to any one of claims 1 to 6.
CN202310647218.5A 2023-06-02 2023-06-02 Data table analysis method, device and equipment based on mask calculation and storage medium Active CN116361346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310647218.5A CN116361346B (en) 2023-06-02 2023-06-02 Data table analysis method, device and equipment based on mask calculation and storage medium

Publications (2)

Publication Number Publication Date
CN116361346A CN116361346A (en) 2023-06-30
CN116361346B true CN116361346B (en) 2023-08-08

Family

ID=86905478

Country Status (1)

Country Link
CN (1) CN116361346B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117311988B (en) * 2023-11-27 2024-03-12 沐曦集成电路(南京)有限公司 Protocol operation optimization method, device and equipment with mask and medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105531706A (en) * 2013-07-17 2016-04-27 索特斯波特有限公司 Search engine for information retrieval system
CN109344138A (en) * 2018-10-09 2019-02-15 广东网安科技有限公司 A kind of log analytic method and system
TW202006565A (en) * 2018-07-09 2020-02-01 慧榮科技股份有限公司 Apparatus and method for searching linked lists
CN111090397A (en) * 2019-12-12 2020-05-01 苏州浪潮智能科技有限公司 Data deduplication method, system, equipment and computer readable storage medium
CN111723388A (en) * 2020-06-23 2020-09-29 湖南国科微电子股份有限公司 Password operation protection method, device, equipment and medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9710185B2 (en) * 2014-07-10 2017-07-18 Samsung Electronics Co., Ltd. Computing system with partial data computing and method of operation thereof
US20180039399A1 (en) * 2014-12-29 2018-02-08 Palantir Technologies Inc. Interactive user interface for dynamically updating data and data analysis and query processing
US10817490B2 (en) * 2017-04-28 2020-10-27 Microsoft Technology Licensing, Llc Parser for schema-free data exchange format
CN110443059A (en) * 2018-05-02 2019-11-12 中兴通讯股份有限公司 Data guard method and device
US20200004533A1 (en) * 2018-06-29 2020-01-02 Microsoft Technology Licensing, Llc High performance expression evaluator unit

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Secure design based on converting Boolean XOR masks to arithmetic addition masks; 饶金涛; 李军; 何卫国; 通信技术 (Communications Technology) (03); pp. 696-699 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant