WO2023060878A1 - Data query method, system, heterogeneous acceleration platform, and storage medium - Google Patents

Data query method, system, heterogeneous acceleration platform, and storage medium

Publication number
WO2023060878A1
WO2023060878A1 PCT/CN2022/089912 CN2022089912W WO2023060878A1 WO 2023060878 A1 WO2023060878 A1 WO 2023060878A1 CN 2022089912 W CN2022089912 W CN 2022089912W WO 2023060878 A1 WO2023060878 A1 WO 2023060878A1
Authority
WO
WIPO (PCT)
Prior art keywords
parameter
node
data
operation code
cpu
Application number
PCT/CN2022/089912
Other languages
English (en)
French (fr)
Inventor
刘科
张闯
孙颉
任智新
孙忠祥
Original Assignee
苏州浪潮智能科技有限公司
Application filed by 苏州浪潮智能科技有限公司
Priority to US18/279,346 (US11893011B1)
Publication of WO2023060878A1

Classifications

    • Shared ancestors: G PHYSICS > G06 COMPUTING; CALCULATING OR COUNTING > G06F ELECTRIC DIGITAL DATA PROCESSING > G06F16/00 Information retrieval; Database structures therefor; File system structures therefor > G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data > G06F16/24 Querying
    • G06F16/245 Query processing > G06F16/2453 Query optimisation > G06F16/24534 Query rewriting; Transformation > G06F16/24537 Query rewriting; Transformation of operators
    • G06F16/242 Query formulation > G06F16/2433 Query languages
    • G06F16/242 Query formulation > G06F16/2433 Query languages > G06F16/2445 Data retrieval commands; View definitions
    • G06F16/245 Query processing > G06F16/2453 Query optimisation > G06F16/24532 Query optimisation of parallel queries
    • G06F16/245 Query processing > G06F16/2453 Query optimisation > G06F16/24534 Query rewriting; Transformation > G06F16/24542 Plan optimisation
    • G06F16/245 Query processing > G06F16/2455 Query execution
    • G06F16/248 Presentation of query results

Definitions

  • the present application relates to the technical field of heterogeneous acceleration of database software, in particular to a data query method, system, heterogeneous acceleration platform and storage medium.
  • a heterogeneous acceleration platform is generally used in the field to process database data.
  • the where clause in an SQL query statement mainly filters and screens records, and lets users enter complex query conditions.
  • the statement can include comparison operations (greater than, less than, equal to), logical operations (and, or, not), arithmetic operations (addition, subtraction, multiplication, division), precedence (parentheses), etc.
  • the lexical and grammatical analysis of the where clause input by the user is performed to form a data structure of binary tree + linked list, which is called a constraint condition. For each record in the database, the constraints of the binary tree + linked list are traversed to determine whether the current record meets the filtering conditions.
  • the purpose of this application is to provide a data query method, a data query system, an electronic device, and a storage medium, which enable the heterogeneous acceleration platform to support any type of where clause query and broaden the scope of application of the heterogeneous acceleration platform.
  • the present application provides a data query method, which is applied to a heterogeneous acceleration platform.
  • the heterogeneous acceleration platform includes a CPU and a parallel processor.
  • the data query method includes:
  • using the CPU to convert the where clause in the SQL query statement into a data structure comprising a binary tree and a linked list; wherein each node in the data structure corresponds to an operator in the where clause;
  • controlling the CPU to generate an operation code stream of the data structure according to node information; wherein the node information includes the node position and the corresponding function name of each node in the data structure;
  • using the parallel processor to use the target record as a parameter source to perform a screening operation corresponding to the operation code stream to obtain a Boolean value corresponding to the target record including:
  • if the current operation code is a calculation operation code, read the target parameter corresponding to the current operation code from the parameter source, and perform the calculation of the function corresponding to the current operation code on the target parameter to obtain an operation result;
  • the parameter source includes the target record, a preset constant and a data stack, and the data stack is used to store the operation results of the calculation operation codes and the logical operation codes in the operation code stream;
  • if the current operation code is a logical operation code, read a Boolean operation result from the data stack and perform the logical operation corresponding to the logical operation code on it to obtain an operation result;
  • reading the target parameter corresponding to the current opcode from the parameter source includes:
  • the filtering operation on the target record is stopped, and it is determined that the target record does not conform to the where clause.
  • controlling the CPU to generate an operation code stream of the data structure according to node information includes:
  • controlling the CPU to determine parameter information of a function corresponding to each node in the data structure according to the node information including:
  • the first operation is: determine the parameter source of the node according to the node position of the node in the data structure;
  • the second operation is: determine the data type of the operation object according to the function name corresponding to the node in the data structure, and determine the parameter type of the node according to the data type of the operation object;
  • the third operation is: determine the parameter size according to the parameter type of the node.
  • the present application also provides a data query system, which is applied to a heterogeneous acceleration platform, the heterogeneous acceleration platform includes a CPU and a parallel processor, and the data query system includes:
  • a function realization module configured to determine the operators in the database management system, and implement the function corresponding to each operator in the parallel processor;
  • a conversion module configured to use the CPU to convert the where clause in the SQL query statement into a data structure comprising a binary tree and a linked list if the SQL query statement is received; wherein each node in the data structure corresponds to an operator in the where clause;
  • a code stream generating module configured to control the CPU to generate an operation code stream of the data structure according to node information; wherein the node information includes a node position and a corresponding function name of each node in the data structure;
  • the screening module is configured to use the parallel processor to perform a screening operation corresponding to the operation code stream on the records in the database management system, so as to obtain a query result conforming to the where clause.
  • the present application also provides a storage medium on which a computer program is stored, and when the computer program is executed, the steps performed by the above data query method are realized.
  • the present application also provides a heterogeneous acceleration platform, including a memory, a CPU, and a parallel processor, wherein a computer program is stored in the memory, and when the CPU and the parallel processor call the computer program in the memory, the steps of the above data query method are implemented.
  • the present application provides a data query method, which is applied to a heterogeneous acceleration platform, and the heterogeneous acceleration platform includes a CPU and a parallel processor.
  • the data query method includes: determining the operators in the database management system, and implementing the function corresponding to each operator in the parallel processor; if an SQL query statement is received, using the CPU to convert the where clause in the SQL query statement into a data structure comprising a binary tree and a linked list, wherein each node in the data structure corresponds to an operator in the where clause; controlling the CPU to generate the operation code stream of the data structure according to the node information, wherein the node information includes the node position and the corresponding function name of each node in the data structure; and using the parallel processor to perform the screening operation corresponding to the operation code stream on the records in the database management system to obtain the query results conforming to the where clause.
  • the application realizes the function corresponding to the operator in the database management system in the parallel processor in advance, and converts the where clause in the SQL query statement into a data structure including a binary tree and a linked list when receiving the SQL query statement, and converts the above data structure into a stream of opcodes that the parallel processor can understand.
  • the operation code stream is generated according to the node position of each node in the data structure and the corresponding function name, so the parallel processor can use the operation code stream to perform the screening operation of the where clause and obtain the query results that satisfy the where clause from the database management system.
  • the parallel processor can combine the functions it implements to complete the screening operation corresponding to any kind of where clause, and is no longer limited by a fixed template, so the present application enables the heterogeneous acceleration platform to support any type of where clause query and broadens the applicable scope of the heterogeneous acceleration platform.
  • the present application also provides a data query system, a storage medium, and a heterogeneous acceleration platform, which have the above-mentioned beneficial effects and will not be repeated here.
  • Fig. 1 is a flow chart of a data query method provided by the embodiment of the present application.
  • Fig. 2 is a schematic diagram of a binary tree and linked list data structure provided by the embodiment of the present application;
  • FIG. 3 is a flow chart of a parallel processor querying data in parallel provided by an embodiment of the present application;
  • FIG. 4 is a flow chart of a method for performing a screening operation by a parallel processor provided in an embodiment of the present application;
  • FIG. 5 is a schematic diagram of the overall structure of an operation code stream provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a code stream header structure provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of the code stream structure of an opcode header part provided by an embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of a parameter information code stream provided by an embodiment of the present application.
  • Fig. 9 is a schematic diagram of an FPGA workflow provided by the embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a data query system provided by an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a heterogeneous acceleration platform provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a storage medium provided by an embodiment of the present application.
  • FIG. 1 is a flowchart of a data query method provided by the embodiment of the present application. The specific steps may include:
  • the present embodiment can be applied to a heterogeneous acceleration platform that includes a CPU (Central Processing Unit) and a parallel processor; the parallel processor can be an FPGA (Field Programmable Gate Array), a GPU (Graphics Processing Unit), or another processing chip with parallel processing capabilities.
  • the database management system in this step can be PostgreSQL (an open source client/server relational database management system), MySQL, Oracle, etc., and the specific type of the database management system is not limited here.
  • This step determines the operators in the database management system. Each operator has a corresponding function, and the function corresponding to each operator is a minimum functional unit. This step constructs a functional unit for each operator; the functional unit executes the operation of the function corresponding to that operator.
  • the parallel processor can realize all minimum functional units. Multiple record tuples can be stored in the database management system.
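  • The idea of one minimum functional unit per operator can be sketched as a lookup table. This is only an illustration in Python standing in for the parallel processor; the names `FUNCTIONAL_UNITS` and `call_unit` are hypothetical, and the function names are the ones this document uses for Figure 2.

```python
import operator

# Hypothetical registry: one minimal functional unit per operator function.
# In the described platform these units would be implemented in the FPGA/GPU.
FUNCTIONAL_UNITS = {
    "float_gt": operator.gt,    # floating-point greater-than
    "float_lt": operator.lt,    # floating-point less-than
    "float_add": operator.add,  # floating-point addition
    "float_mul": operator.mul,  # floating-point multiplication
}

def call_unit(name, *args):
    """Look up the functional unit by function name and run it on the arguments."""
    return FUNCTIONAL_UNITS[name](*args)
```

  • Because every where clause is built from these small units, supporting a new clause shape needs no new unit, only a new sequence of calls.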
  • when the user needs to query specific data in the database management system, the user usually first inputs an SQL query statement to the heterogeneous acceleration platform, and the SQL query statement includes a where clause.
  • the CPU may be used to convert the where clause in the SQL query statement into a data structure including a binary tree and a linked list.
  • Fig. 2 is a schematic diagram of the data structure of a binary tree and linked list provided by the embodiment of the present application. If the where clause is: I_discount>0.07 and I_quantity<24 and ((I_discount+I_extendedprice)+(I_discount*I_extendedprice)>0.1 or I_quantity>12), the expression is a where clause with arithmetic operations, size comparisons, logical operations and parenthesized precedence.
  • the data structure obtained by converting the above where clause by the CPU is shown in Figure 2 .
  • in Figure 2, float_gt denotes floating-point greater-than, float_lt denotes floating-point less-than, float_add denotes floating-point addition, float_mul denotes floating-point multiplication, and denotes logical AND, or denotes logical OR, I_discount means discount, I_quantity means quantity, and I_extendedprice means total price.
  • Each node in the above data structure corresponds to an operator in the where clause. Taking FIG. 2 as an example, float_gt, float_lt, float_add, float_mul, and, and or in FIG. 2 each represent a node. Specifically, each circular node in FIG.
  • the and and or nodes represent logical operators, which are supported by the FPGA itself.
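  • A rough Python sketch of such a tree for the example where clause follows. The `Node` class, the `evaluate` helper, and the use of a Python list to stand in for the linked list of arguments are all assumptions made for illustration; the sketch also assumes the garbled comparison on I_quantity is "less than" (float_lt), since float_lt is named for Figure 2.

```python
from dataclasses import dataclass, field
import operator

@dataclass
class Node:
    func: str                                  # function name, e.g. "float_gt" or "and"
    args: list = field(default_factory=list)   # child nodes, column names, or constants

FUNCS = {"float_gt": operator.gt, "float_lt": operator.lt,
         "float_add": operator.add, "float_mul": operator.mul}

def evaluate(node, record):
    """CPU-style recursive traversal that returns whether the record passes the filter."""
    if isinstance(node, str):        # leaf: column reference
        return record[node]
    if not isinstance(node, Node):   # leaf: preset constant
        return node
    if node.func == "and":
        return all(evaluate(a, record) for a in node.args)
    if node.func == "or":
        return any(evaluate(a, record) for a in node.args)
    return FUNCS[node.func](*(evaluate(a, record) for a in node.args))

d, q, p = "I_discount", "I_quantity", "I_extendedprice"
tree = Node("and", [
    Node("float_gt", [d, 0.07]),
    Node("float_lt", [q, 24]),
    Node("or", [
        Node("float_gt", [Node("float_add", [Node("float_add", [d, p]),
                                             Node("float_mul", [d, p])]), 0.1]),
        Node("float_gt", [q, 12]),
    ]),
])
```

  • Calling `evaluate(tree, record)` with a dictionary of column values yields the Boolean that decides whether the record is kept.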
  • S103 Control the CPU to generate an operation code stream of the data structure according to the node information
  • the CPU can be controlled to generate the operation code stream of the above data structure according to the node information.
  • the operation code stream includes the function execution order and, in a form the parallel processor can recognize, the parameter sources, parameter types, and parameter sizes of each function.
  • the CPU may be controlled to determine parameter information of the function corresponding to each node in the data structure according to the node information; wherein the parameter information includes parameter source, parameter type, and parameter size; the parameter information is then used to generate the operation code corresponding to each node, and all the operation codes are collected to obtain the operation code stream.
  • the above node information includes the node position of each node in the data structure and the function name corresponding to each node.
  • the CPU may be controlled to perform a first operation, a second operation, and a third operation according to the node information, so as to obtain parameter information of a function corresponding to each node in the data structure.
  • the above-mentioned first operation is: determine the parameter source of the node according to the node position of the node in the data structure; specifically, the data source of a leaf node is the data recorded in the database management system or a preset constant, and the data source of a non-leaf node includes at least the operation results of other opcodes stored in the data stack.
  • the second operation is: determine the data type of the operation object according to the function name corresponding to the node in the data structure, and determine the parameter type of the node according to the data type of the operation object.
  • the third operation is: determine the parameter size according to the parameter type of the node.
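  • The three operations can be sketched as one small function. Everything named here is hypothetical: the `TYPE_INFO` table, the convention of reading the data type from the function-name prefix, and the mapping of "float" to an 8-byte type are assumptions for illustration, not details taken from this document.

```python
# Hypothetical table: function-name prefix -> (parameter type, size in bytes).
TYPE_INFO = {"float": ("float8", 8), "int": ("int4", 4)}

def param_info(func_name, child_is_leaf, leaf_is_constant=False):
    # First operation: the parameter source follows from the node position.
    if child_is_leaf:
        source = "constant" if leaf_is_constant else "record"
    else:
        source = "stack"   # result of another opcode, kept on the data stack
    # Second operation: the parameter type follows from the function name.
    ptype, size = TYPE_INFO[func_name.split("_")[0]]
    # Third operation: the parameter size follows from the parameter type.
    return {"source": source, "type": ptype, "size": size}
```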
  • S104 Using the parallel processor to perform a filtering operation corresponding to the operation code stream on the records in the database management system, to obtain a query result conforming to the where clause.
  • this embodiment can use the parallel processor to perform the screening operation corresponding to the operation code stream in parallel on multiple records in the database management system, so as to determine the query results in the database management system that conform to the where clause. Since the functions corresponding to the operators in the database management system are implemented in the parallel processor in advance, after receiving the operation code stream the parallel processor can perform the corresponding calculation and logical operations according to the function execution order in the operation code stream and the parameter source of each function, and judge from the operation results whether each record in the database management system meets the requirements of the where clause.
  • the function corresponding to each operator in the database management system is implemented in the parallel processor in advance; when an SQL query statement is received, the where clause in the SQL query statement is converted into a data structure including a binary tree and a linked list, and this data structure is converted into an operation code stream that the parallel processor can understand.
  • the operation code stream is generated according to the node position of each node in the data structure and the corresponding function name, so the parallel processor can use the operation code stream to perform the screening operation of the where clause and obtain the query results that satisfy the where clause from the database management system.
  • because this embodiment implements functions in the parallel processor at the granularity of individual functions, the parallel processor can combine the functions it implements to complete the screening operation corresponding to any kind of where clause and is no longer restricted by a fixed template. Therefore, this embodiment enables the heterogeneous acceleration platform to support any type of where clause query and broadens the applicable scope of the heterogeneous acceleration platform.
  • Fig. 3 is a flow chart of a parallel processor querying data in parallel provided by the embodiment of the present application. This embodiment further describes S104 in the embodiment corresponding to Fig. 1 and can be combined with that embodiment to obtain a further implementation. This embodiment may include the following steps:
  • S302 Use the parallel processor to use the target record as a parameter source to perform a screening operation corresponding to the opcode stream, to obtain a Boolean value corresponding to the target record;
  • S304 Determine whether all the records in the database management system have been read; if all the records have been read, end the process; if not, go to S301.
  • the maximum parallel processing number of the parallel processor can be set according to the parameters of the parallel processor, and then based on the maximum parallel processing number, a corresponding number of records can be read as target records.
  • the Boolean value corresponding to each target record can be obtained. If the Boolean value is true, the target record is taken as a query result that conforms to the where clause. After the target records are screened, it can be judged whether all the records in the database management system have been read, and if not, the operations of S301-S304 can be executed again.
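  • The batch loop can be sketched as below. This is a sequential Python stand-in for what the parallel processor would do concurrently; the function name `query` and the `predicate` callback (which plays the role of the screening operation) are hypothetical.

```python
def query(records, predicate, max_parallel):
    """Read records in batches of at most max_parallel and keep those that pass."""
    results = []
    pos = 0
    while pos < len(records):                    # stop once every record has been read
        batch = records[pos:pos + max_parallel]  # read up to max_parallel target records
        flags = [predicate(r) for r in batch]    # one Boolean per target record
        results.extend(r for r, ok in zip(batch, flags) if ok)
        pos += len(batch)
    return results
```

  • On real hardware the list comprehension would be replaced by the parallel units evaluating the whole batch at once; the loop structure stays the same.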
  • FIG. 4 is a flow chart of a method for performing a screening operation by a parallel processor according to an embodiment of the present application. As shown in Figure 4, the parallel processor can complete the screening operation corresponding to the opcode stream by performing the following steps:
  • the operation codes corresponding to the functions include calculation operation codes and logical operation codes.
  • the operation performed is as follows: read the target parameter corresponding to the current operation code from the parameter source, and perform the calculation of the function corresponding to the current operation code on the target parameter to obtain an operation result;
  • the parameter source includes the target record, a preset constant and a data stack, and the data stack is used to store the operation results of the calculation operation codes and the logical operation codes in the operation code stream;
  • the operation to be performed is as follows: read the Boolean operation result from the data stack, and perform the logical operation corresponding to the logical operation code on the Boolean operation result to obtain the operation result;
  • S405 Determine whether all the operation codes in the operation code stream have been read; if all have been read, proceed to S406; if not, go to S401.
  • this embodiment can read the target parameter corresponding to the current operation code as follows: determine the number of parameters and the parameter offset address according to the current operation code; then read the target parameter corresponding to the current opcode from the parameter source according to the number of parameters and the parameter offset address.
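  • The screening loop amounts to a small stack machine. The sketch below is an assumption-laden illustration: the dictionary layout of an opcode (`kind`, `func`, `args`, `nargs`) and the source labels `"record"`, `"const"`, and `"stack"` are hypothetical, chosen only to mirror the three parameter sources described above.

```python
import operator

CALC = {"float_gt": operator.gt, "float_lt": operator.lt,
        "float_add": operator.add, "float_mul": operator.mul}
LOGIC = {"and": all, "or": any}

def run(opcodes, record):
    """Execute the opcodes in order for one record and return the final Boolean."""
    stack = []
    for op in opcodes:
        if op["kind"] == "calc":
            args = []
            # Walk the arguments right to left so stack operands pop in the right order.
            for src, val in reversed(op["args"]):
                if src == "record":
                    args.append(record[val])
                elif src == "const":
                    args.append(val)
                else:                          # "stack": result of an earlier opcode
                    args.append(stack.pop())
            args.reverse()
            stack.append(CALC[op["func"]](*args))
        else:                                  # logical opcode: operands are Booleans on the stack
            operands = [stack.pop() for _ in range(op["nargs"])]
            stack.append(LOGIC[op["func"]](operands))
    return stack.pop()
```

  • For example, `I_discount>0.07 and I_quantity<24` becomes two calculation opcodes followed by one logical opcode with `nargs` of 2.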
  • CPU is good at management and scheduling;
  • FPGA can be used to design special circuits for high-performance parallel computing.
  • a large body of academic research and simulation data shows that database data processing in a CPU-FPGA heterogeneous environment can greatly improve the overall performance of the database system.
  • to further improve the overall performance of database servers, using a CPU-FPGA heterogeneous platform to query data has become a mainstream trend.
  • how to flexibly support different SQL statements has become the key to moving database CPU-FPGA heterogeneous acceleration from theoretical research to practical application. Handling the where clause in SQL statements is one of the difficulties that need to be overcome.
  • this application provides a solution for processing PostgreSQL where clauses on a CPU-FPGA heterogeneous platform.
  • This solution uses the PostgreSQL database software as the basis for extended development of CPU-FPGA heterogeneous acceleration, and proposes an implementation scheme for parsing the where clause of an SQL statement in the FPGA, which expands the application scenarios and the range of SQL statements that the FPGA can process.
  • the specific implementation of this embodiment is as follows:
  • Step A Implement the functions corresponding to each operator in PostgreSQL in the FPGA.
  • each arithmetic and comparison operation of each data type in PostgreSQL has a corresponding function ID.
  • the functions corresponding to all the function IDs can be implemented in the FPGA in advance.
  • Table 1 shows the function information table of the floating-point data type. PostgreSQL has the function ID correspondence shown in Table 1. In this embodiment, the functions corresponding to these function IDs can be implemented in the FPGA in advance.
  • Table 1 Function information table of floating-point data type
  • Step B Convert the where clause into a binary tree + linked list data structure in the CPU, and generate an operation code stream.
  • this step traverses PostgreSQL's parse result for the user's SQL statement, calculates the source, type and size of each parameter from the node position and function ID during the traversal, and forms the information of each node into an operation code.
  • the operation codes are classified into calculation operation codes and logical operation codes, and PostgreSQL's original parse result for the SQL statement is finally converted into an operation code stream that can be processed by the FPGA.
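  • One way to picture this conversion is a post-order walk that emits one opcode per node. The sketch below assumes post-order traversal (the text only says the parse result is traversed), represents a node as a hypothetical `(func, children)` tuple, and uses made-up source labels; it is not PostgreSQL's actual node format.

```python
def emit(node, out):
    """Post-order walk: children are emitted first, so their results are already on the stack."""
    func, children = node
    args = []
    for child in children:
        if isinstance(child, tuple):      # non-leaf child: emit it, then read its result from the stack
            emit(child, out)
            args.append(("stack", None))
        elif isinstance(child, str):      # leaf: column of the record
            args.append(("record", child))
        else:                             # leaf: preset constant
            args.append(("const", child))
    kind = "logic" if func in ("and", "or", "not") else "calc"
    out.append({"kind": kind, "func": func, "args": args})
    return out
```

  • The order of the emitted list is the function execution order, and each argument entry records where the executing side should fetch that parameter.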
  • PostgreSQL performs lexical and grammatical analysis, processes the operators, constant expressions and parenthesized precedence in the statement, and converts the where clause into a binary tree + linked list data structure in which each node corresponds to an operator entered by the user.
  • when the CPU executes the query, it traverses each node and executes the corresponding operator function, and finally obtains a Boolean value that expresses whether the current record meets the filter conditions of the where clause.
  • FIG. 5 is a schematic diagram of an overall structure of an operation code stream provided by an embodiment of the present application.
  • Fig. 6 is a schematic diagram of a code stream header structure provided by the embodiment of the present application.
  • the code stream header identifies, in a fixed format, how many opcodes there are and the offset and size of each opcode in the code stream.
  • op_count indicates how many opcodes there are in total
  • op_offset indicates the offset of opcode 1 in the code stream
  • op1_len indicates the length of opcode 1
  • opN_offset indicates the offset of the last opcode N in the code stream
  • opN_len indicates the length of the last opcode N.
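  • A possible byte layout for such a header can be sketched with Python's `struct` module. The field widths (32-bit unsigned), the little-endian byte order, and the helper names `pack_header` and `parse_header` are assumptions for illustration; the document does not state them.

```python
import struct

def pack_header(opcodes):
    """opcodes: list of bytes objects. Returns the header followed by the opcode bodies."""
    n = len(opcodes)
    header_size = 4 + 8 * n                 # op_count, then (offset, len) per opcode
    entries, offset = [], header_size
    for body in opcodes:
        entries.append((offset, len(body)))
        offset += len(body)
    header = struct.pack("<I", n) + b"".join(struct.pack("<II", o, l) for o, l in entries)
    return header + b"".join(opcodes)

def parse_header(stream):
    """Return the list of (offset, length) pairs recorded in the header."""
    (n,) = struct.unpack_from("<I", stream, 0)
    return [struct.unpack_from("<II", stream, 4 + 8 * i) for i in range(n)]
```

  • With the offsets in the header, the reader can jump straight to any opcode without scanning the ones before it.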
  • Fig. 7 is a schematic diagram of the code stream structure of an opcode header part provided by the embodiment of the present application.
  • the opcode header part identifies the relevant information of the operation code, including the type of the operation code, the ID number of the operation function used by the operation code, the number of parameters, and the offset and size of each parameter in the opcode.
  • the type of operation code can be classified into two types according to the ID of the operation function: calculation operation code (such as addition, subtraction, multiplication, division, comparison, etc.) and logical operation code (such as and, or, not, etc.).
  • type indicates the type of operation code (calculation operation code or logical operation code)
  • op_func indicates the ID number of the operation code function
  • nargs indicates the number of operation code parameters
  • arg1_offset indicates the offset of the first parameter relative to the start position of the operation code
  • arg1_len indicates the length of the first parameter
  • argN_offset indicates the offset of the last parameter relative to the start position of the operation code
  • argN_len indicates the length of the last parameter.
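  • Reading parameters by count and offset can be sketched as follows. The 32-bit little-endian fields and the helper names `pack_opcode` and `read_args` are assumptions; only the field order (type, op_func, nargs, then offset and length per parameter) follows the description above.

```python
import struct

def pack_opcode(op_type, op_func, args):
    """args: list of bytes. Parameter offsets are relative to the start of the opcode."""
    nargs = len(args)
    head_size = 12 + 8 * nargs              # type, op_func, nargs, then (offset, len) pairs
    table, offset = b"", head_size
    for a in args:
        table += struct.pack("<II", offset, len(a))
        offset += len(a)
    return struct.pack("<III", op_type, op_func, nargs) + table + b"".join(args)

def read_args(opcode):
    """Use nargs and the per-parameter offsets to slice each parameter out of the opcode."""
    op_type, op_func, nargs = struct.unpack_from("<III", opcode, 0)
    out = []
    for i in range(nargs):
        off, length = struct.unpack_from("<II", opcode, 12 + 8 * i)
        out.append(opcode[off:off + length])
    return op_type, op_func, out
```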
  • FIG. 8 is a schematic structural diagram of a parameter information code stream provided by an embodiment of the present application.
  • the parameter information in the opcode identifies the parameter type, parameter size, and parameter source. Specifically, when the CPU traverses the binary tree+linked list data structure, the parameter information of the function is calculated according to the position of the node and the type of the called function.
  • the data type of the operation object can be deduced from the function ID. Since the type of the parameter is known, the parameter size can be obtained through the parameter type. In Fig.
  • arg_tag represents the parameter tag
  • arg_type represents the parameter type
  • arg_size represents the parameter size
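  • A single parameter entry might be laid out as in this sketch. The tag values 104 and 105 are the ones this document quotes for T_Var and T_Const; the type code `ARG_TYPE_FLOAT8`, the 32-bit field widths, the byte order, and the helper names are hypothetical.

```python
import struct

T_VAR, T_CONST = 104, 105     # tag values as quoted in this document
ARG_TYPE_FLOAT8 = 1           # hypothetical type code, for illustration only

def pack_const_float(value):
    """arg_tag, arg_type, arg_size, followed by the constant's 8 data bytes."""
    return struct.pack("<IIId", T_CONST, ARG_TYPE_FLOAT8, 8, value)

def unpack_param(buf):
    """Return (arg_tag, arg_type, arg_size, raw data bytes)."""
    tag, typ, size = struct.unpack_from("<III", buf, 0)
    return tag, typ, size, buf[12:12 + size]
```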
  • Step C The FPGA screens the PostgreSQL records according to the operation code stream.
  • each operation code in the operation code stream is parsed in the FPGA one by one, the parameters are obtained according to their source, type and size, and the functional unit corresponding to the function ID in the operation code is called.
  • the intermediate results are saved on a stack.
  • after the CPU starts the query scan, it sends the operation code stream to the FPGA.
  • the FPGA parses each opcode in the opcode stream and, for each PostgreSQL record (called a tuple in PostgreSQL), traverses each function ID in the opcode stream in turn and calls the corresponding functional unit that has been implemented in the FPGA.
  • the result of executing the last opcode is a Boolean value qual, indicating whether the current record satisfies the filter conditions of the where clause.
  • FIG. 9 is a schematic diagram of an FPGA workflow provided by an embodiment of the present application, which specifically includes the following operations: obtain the number of opcodes op_count from the opcode stream, obtain the opcode addresses op_offset one by one, and determine the opcode type.
  • If the opcode type is the calculation opcode T_OpExpr, obtain the number of parameters nargs and obtain the parameter offsets arg_offset one by one; the arg_tag field of each parameter determines its source, so that the value of the parameter can be fetched from the indicated location:
  • if the value of arg_tag is T_Var(104), the value of the column with index var_no is obtained from the tuple;
  • if the value of arg_tag is T_Const(105), the value is obtained from arg_data (that is, the preset constant);
  • if the value of arg_tag is anything other than T_Var(104) and T_Const(105), the value is obtained from the stack.
  • Each parameter obtained from its source is copied to the parameter buffer, and its offset in the parameter buffer is recorded; this is repeated until the last parameter of the opcode has been obtained.
  • The functional unit corresponding to op_func is then called to perform the calculation, and the operation result is pushed onto the stack.
  • If the opcode type is the logical opcode T_BoolExpr(113), obtain the number of parameters nargs, read nargs Boolean values (that is, Boolean operation results) from the stack, copy them to the parameter buffer, and record the offset of each parameter in the parameter buffer.
  • The functional unit corresponding to op_func is then called to calculate the operation result; the result is stored in the qual variable, updating qual, and is also pushed onto the stack.
  • After all opcodes have been processed, the value of qual determines whether the current record satisfies the filter condition: if it is true, the tuple meets the condition; if it is false, it does not.
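The workflow above amounts to a small stack machine. The following software sketch mimics it under stated assumptions: opcodes are plain dictionaries rather than the binary stream, the functional units are Python stand-ins for the FPGA units, the `"calc"` type label and the `"and"`/`"or"` unit IDs are hypothetical, and the tuple is a dictionary keyed by column number.

```python
import operator

T_VAR, T_CONST = 104, 105             # tag values quoted in the text
T_OP_EXPR, T_BOOL_EXPR = "calc", 113  # "calc" is a placeholder; the text gives no number

# Function IDs from the float table, mapped to stand-ins for the functional units.
CALC_UNITS = {295: operator.lt, 297: operator.gt, 216: operator.mul, 218: operator.add}
BOOL_UNITS = {"and": all, "or": any}  # hypothetical IDs for the logical units

def fetch_args(arg_infos, tup, stack):
    """Resolve each parameter from the tuple, a constant, or the stack."""
    n_stack = sum(1 for tag, _ in arg_infos if tag not in (T_VAR, T_CONST))
    popped = [stack.pop() for _ in range(n_stack)][::-1]  # restore push order
    from_stack = iter(popped)
    values = []
    for tag, payload in arg_infos:
        if tag == T_VAR:
            values.append(tup[payload])      # payload plays the role of var_no
        elif tag == T_CONST:
            values.append(payload)           # payload plays the role of arg_data
        else:
            values.append(next(from_stack))
    return values

def evaluate(opcodes, tup):
    """Run every opcode against one tuple and return the final qual value."""
    stack, qual = [], False
    for op in opcodes:
        if op["type"] == T_OP_EXPR:
            result = CALC_UNITS[op["op_func"]](*fetch_args(op["args"], tup, stack))
        else:
            bools = [stack.pop() for _ in range(op["nargs"])]
            result = BOOL_UNITS[op["op_func"]](bools)
            qual = result
        stack.append(result)
    return qual

# Opcodes for: l_discount > 0.07 and l_quantity < 24 (columns 7 and 5).
STREAM = [
    {"type": T_OP_EXPR, "op_func": 297, "args": [(T_VAR, 7), (T_CONST, 0.07)]},
    {"type": T_OP_EXPR, "op_func": 295, "args": [(T_VAR, 5), (T_CONST, 24)]},
    {"type": T_BOOL_EXPR, "op_func": "and", "nargs": 2},
]
print(evaluate(STREAM, {5: 21, 6: 27076.98, 7: 0.09}))
print(evaluate(STREAM, {5: 41, 6: 64061.68, 7: 0.04}))
```

Because every opcode pushes its result, a later opcode can name "the stack" as a parameter source without knowing which earlier opcode produced the value.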
  • In the above embodiment, the function corresponding to each operator in PostgreSQL is treated as a minimum functional unit, and all minimum functional units are implemented in the FPGA in advance.
  • The parsing result of the PostgreSQL where clause is traversed, with the function ID in PostgreSQL used as the index of the function call in the FPGA; at the same time, the parameter information of each function call is derived, forming an opcode stream that the FPGA can recognize.
  • The FPGA parses the where clause according to the function IDs and parameter information provided in the opcode stream.
  • Compared with the related art, this solution supports where clause parsing of SQL statements dynamically, without hardcoding an SQL statement template in the IP core, and places no limit on the number of constraint conditions.
  • This solution supports the common data types in PostgreSQL (int, float, date, timestamp, etc.) as well as arithmetic operations and expressions with parentheses.
  • The solution of this embodiment can also be applied to other database software such as MySQL and Oracle, where the SQL statement entered by the user can be processed in a similar way: once the handling of function calls and parameters in that software is identified, the corresponding data structure can likewise be converted into an opcode stream that the FPGA can process.
  • The above process is illustrated with the TPC-H 1GB data set: the 15th and 17th records in the data set are selected, and the data in columns 5, 6, and 7 of these two records are shown in Table 2.
  • The data in the fifth column is the quantity l_quantity
  • the data in the sixth column is the total price l_extendedprice
  • the data in the seventh column is the discount l_discount.
  • The FPGA parses the opcodes one by one according to the offset and length of each opcode in the code stream.
  • Each parameter is obtained from the location indicated by the arg_tag in its parameter field, the operation function corresponding to the function ID of the opcode is called, and the result is pushed onto the stack; in this way the FPGA finally determines whether a record satisfies the filter condition.
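As a worked example, the two records of Table 2 can be traced by hand through the where clause used in the FIG. 2 example, `l_discount>0.07 and l_quantity<24 and ((l_discount+l_extendedprice)+(l_discount*l_extendedprice)>0.1 or l_quantity>12)`. The sketch below assumes a post-order execution order and uses an explicit Python list as the stack; the order is an assumption for illustration, not a statement of the order the FPGA actually uses.

```python
# Column values of records 15 and 17 as given in Table 2.
RECORDS = {
    15: {"l_quantity": 21, "l_extendedprice": 27076.98, "l_discount": 0.09},
    17: {"l_quantity": 41, "l_extendedprice": 64061.68, "l_discount": 0.04},
}

def qual(rec):
    """Evaluate the FIG. 2 where clause with an explicit stack, one opcode per step."""
    d, p, q = rec["l_discount"], rec["l_extendedprice"], rec["l_quantity"]
    stack = []
    stack.append(d > 0.07)            # float_gt: tuple column vs constant
    stack.append(q < 24)              # float_lt: tuple column vs constant
    stack.append(d + p)               # float_add: two tuple columns
    stack.append(d * p)               # float_mul: two tuple columns
    rhs, lhs = stack.pop(), stack.pop()
    stack.append(lhs + rhs)           # float_add: both operands from the stack
    stack.append(stack.pop() > 0.1)   # float_gt: stack operand vs constant
    stack.append(q > 12)              # float_gt: tuple column vs constant
    b, a = stack.pop(), stack.pop()
    stack.append(a or b)              # or: two Boolean results from the stack
    c, b, a = stack.pop(), stack.pop(), stack.pop()
    return a and b and c              # and: three Boolean results from the stack

for number, record in RECORDS.items():
    print(number, qual(record))
```

Record 15 passes every condition, while record 17 already fails `l_discount>0.07`, so only record 15 would be returned.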
  • FIG. 10 is a schematic structural diagram of a data query system provided by an embodiment of the present application.
  • A data query system 410 provided in an embodiment of the present application is applied to a heterogeneous acceleration platform that includes a CPU and a parallel processor, and the data query system 410 includes:
  • a function realization module 411, configured to determine the operators in the database management system and implement the functions corresponding to the operators in the parallel processor;
  • a conversion module 412, configured to, if an SQL query statement is received, use the CPU to convert the where clause in the SQL query statement into a data structure comprising a binary tree and a linked list, wherein each node in the data structure corresponds to an operator in the where clause;
  • a code stream generation module 413, configured to control the CPU to generate an operation code stream of the data structure according to node information, wherein the node information includes the node position and the corresponding function name of each node in the data structure;
  • a screening module 414, configured to use the parallel processor to perform the screening operation corresponding to the operation code stream on the records in the database management system, so as to obtain a query result conforming to the where clause.
  • In this embodiment, the functions corresponding to the operators in the database management system are implemented in the parallel processor in advance; when an SQL query statement is received, the where clause in the SQL query statement is converted into a data structure including a binary tree and a linked list, and that data structure is converted into an opcode stream that the parallel processor can recognize.
  • The operation code stream is generated according to the node position and the corresponding function name of each node in the data structure, so the parallel processor can use the operation code stream to perform the screening operation of the where clause and obtain, from the database management system, the query result conforming to the where clause.
  • Because this embodiment implements functions in the parallel processor at the granularity of individual functions, the parallel processor can combine the functions it has implemented to complete the screening operation for any kind of where clause and is no longer restricted by a fixed template. This embodiment therefore enables the heterogeneous acceleration platform to support any type of where clause query and broadens the applicable scope of the heterogeneous acceleration platform.
  • Optionally, the screening module includes:
  • a record reading unit, configured to read a plurality of target records from the database management system;
  • an operation code execution unit, configured to use the parallel processor to perform the screening operation corresponding to the operation code stream with the target records as a parameter source, to obtain the Boolean value corresponding to each target record;
  • a query result generating unit, configured to set the target records whose Boolean value is true as the query result conforming to the where clause;
  • a judging unit, configured to judge whether all the records in the database management system have been read and, if not, to start the workflow of the record reading unit again.
  • Optionally, the operation code execution unit is configured to: read the current operation code from the operation code stream; determine the operation code type of the current operation code; if the current operation code is a calculation operation code, read the target parameters corresponding to the current operation code from the parameter source and perform the calculation of the function corresponding to the current operation code on the target parameters to obtain an operation result, wherein the parameter source includes the target record, preset constants, and a data stack, and the data stack stores the operation results of the calculation and logical operation codes in the operation code stream;
  • if the current operation code is a logical operation code, read the Boolean operation results from the data stack and perform the logical operation corresponding to the logical operation code on them to obtain an operation result; store the operation result in the data stack; judge whether all the operation codes in the operation code stream have been read; if so, take the operation result of the most recent logical operation as the Boolean value corresponding to the target record; if not, read the next current operation code from the operation code stream.
  • Optionally, the process by which the operation code execution unit reads the target parameters corresponding to the current operation code from the parameter source includes: determining the number of parameters and the parameter offset addresses according to the current operation code; and reading the target parameters from the parameter source according to the number of parameters and the parameter offset addresses.
  • Optionally, the system further includes a logical operation result analysis unit, configured to determine, before the logical operation result is stored in the data stack, whether the logical operation result is true; if it is true, to store the logical operation result in the data stack; if it is not true, to stop the screening operation on the target record and determine that the target record does not conform to the where clause.
  • Optionally, the code stream generation module includes:
  • a parameter information determining unit, configured to control the CPU to determine, according to the node information, the parameter information of the function corresponding to each node in the data structure, wherein the parameter information includes a parameter source, a parameter type, and a parameter size;
  • an operation code summary unit, configured to control the CPU to generate an operation code corresponding to each node according to the parameter information and to collect all the operation codes to obtain the operation code stream.
  • Optionally, the operation code summary unit is configured to control the CPU to perform a first operation, a second operation, and a third operation according to the node information, to obtain the parameter information of the function corresponding to each node in the data structure;
  • the first operation is: determine the parameter source of the node according to the node position of the node in the data structure;
  • the second operation is: determine the data type of the operand according to the function name corresponding to the node in the data structure, and determine the parameter type of the node according to the data type of the operand;
  • the third operation is: determine the parameter size according to the parameter type of the node.
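The three operations can be sketched as a post-order walk over the expression tree on the CPU side. This is a minimal illustration limited to calculation nodes: the `Node`, `Column`, and `Const` classes, the `T_STACK` value, and the type-size table are hypothetical names and values introduced here, not part of the source.

```python
from dataclasses import dataclass, field

T_VAR, T_CONST, T_STACK = 104, 105, 0   # T_STACK is a hypothetical "default" tag
TYPE_SIZE = {"float8": 8, "int4": 4}    # assumed sizes in bytes
FUNC_ARG_TYPE = {"float8gt": "float8", "float8lt": "float8",
                 "float8pl": "float8", "float8mul": "float8"}

@dataclass
class Column:
    var_no: int          # column number in the tuple

@dataclass
class Const:
    value: float         # preset constant entered by the user

@dataclass
class Node:
    func: str            # function name of the operator
    children: list = field(default_factory=list)

def generate(node, out):
    """Post-order walk: children are emitted first, so their results are on
    the stack by the time the parent opcode runs."""
    arg_type = FUNC_ARG_TYPE[node.func]            # second operation: type from function name
    args = []
    for child in node.children:
        if isinstance(child, Column):              # first operation: leaf -> tuple column
            source = {"arg_tag": T_VAR, "var_no": child.var_no}
        elif isinstance(child, Const):             # first operation: leaf -> preset constant
            source = {"arg_tag": T_CONST, "arg_data": child.value}
        else:                                      # first operation: inner node -> stack
            generate(child, out)
            source = {"arg_tag": T_STACK}
        # third operation: size follows from the parameter type
        args.append({**source, "arg_type": arg_type, "arg_size": TYPE_SIZE[arg_type]})
    out.append({"op_func": node.func, "nargs": len(args), "args": args})
    return out

# (column 7 + column 6) > 0.1
tree = Node("float8gt", [Node("float8pl", [Column(7), Column(6)]), Const(0.1)])
for opcode in generate(tree, []):
    print(opcode)
```

The walk emits `float8pl` before `float8gt`, and the first parameter of `float8gt` is tagged as coming from the stack because its child is not a leaf.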
  • FIG. 12 is a schematic structural diagram of a storage medium provided by an embodiment of the present application.
  • the present application also provides a storage medium 601 on which a computer program 610 is stored. When the computer program 610 is executed, the steps provided in the above-mentioned embodiments can be realized.
  • The storage medium 601 may include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
  • FIG. 11 is a schematic structural diagram of a heterogeneous acceleration platform provided by an embodiment of the present application.
  • The present application also provides a heterogeneous acceleration platform 501, including a memory 510, a CPU, and a parallel processor 520; the memory 510 stores a computer program 511, and the steps of the above data query method are implemented when the CPU and the parallel processor 520 call the computer program 511 in the memory 510.
  • The heterogeneous acceleration platform 501 may also include components such as network interfaces and a power supply.

Abstract

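A data query method is provided, applied to a heterogeneous acceleration platform. The data query method includes: determining the operators in a database management system, and implementing the functions corresponding to the operators in a parallel processor (S101); if an SQL query statement is received, using a CPU to convert the where clause in the SQL query statement into a data structure including a binary tree and a linked list (S102); controlling the CPU to generate an operation code stream of the data structure according to node information (S103); and using the parallel processor to perform, on the records in the database management system, the screening operation corresponding to the operation code stream, to obtain a query result conforming to the where clause (S104). The method enables the heterogeneous acceleration platform to support any type of where clause query and broadens the applicable scope of the heterogeneous acceleration platform. A data query system, an electronic device, and a storage medium having the above beneficial effects are also provided.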

Description

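Data query method and system, heterogeneous acceleration platform, and storage medium
This application claims priority to Chinese patent application No. 202111190053.0, entitled "Data query method and system, heterogeneous acceleration platform, and storage medium", filed with the China Patent Office on October 13, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of heterogeneous acceleration of database software, and in particular to a data query method and system, a heterogeneous acceleration platform, and a storage medium.
Background
As Moore's Law gradually slows, it has become difficult to substantially improve the overall performance of a database server simply by improving the CPU manufacturing process or increasing the number of CPUs. To further improve the overall performance of database servers, heterogeneous acceleration platforms are commonly used in the art for database data processing.
The where clause in an SQL query statement mainly serves to filter records and allows the user to enter complex query conditions. It may contain comparison operations (greater than, less than, equal to), logical operations (and, or, not), arithmetic operations (addition, subtraction, multiplication, division), precedence (parentheses), and so on. The database management system performs lexical and syntactic parsing on the where clause entered by the user to form a binary tree + linked list data structure, called the constraint condition. For each record in the database, the binary tree + linked list constraint condition is traversed and executed to determine whether the current record satisfies the filter condition.
In the related art, heterogeneous acceleration has generally been explored and practiced only for customized SQL query templates. In engineering practice, however, the SQL queries executed by a database management system vary widely; if heterogeneous acceleration is available only for customized SQL statements, the applicable scope and practical value of the heterogeneous acceleration platform are greatly limited.
Therefore, how to enable a heterogeneous acceleration platform to support any type of where clause query and broaden its applicable scope is a technical problem that those skilled in the art currently need to solve.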
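Summary
The purpose of the present application is to provide a data query method, a data query system, an electronic device, and a storage medium that enable a heterogeneous acceleration platform to support any type of where clause query and broaden the applicable scope of the heterogeneous acceleration platform.
To solve the above technical problem, the present application provides a data query method applied to a heterogeneous acceleration platform, the heterogeneous acceleration platform including a CPU and a parallel processor, the data query method including:
determining the operators in a database management system, and implementing the functions corresponding to the operators in the parallel processor;
if an SQL query statement is received, using the CPU to convert the where clause in the SQL query statement into a data structure including a binary tree and a linked list, wherein each node in the data structure corresponds to one operator in the where clause;
controlling the CPU to generate an operation code stream of the data structure according to node information, wherein the node information includes the node position and the corresponding function name of each node in the data structure;
using the parallel processor to perform, on the records in the database management system, the screening operation corresponding to the operation code stream, to obtain a query result conforming to the where clause.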
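Optionally, using the parallel processor to perform the screening operation corresponding to the operation code stream on the records in the database management system, to obtain the query result conforming to the where clause, includes:
reading a plurality of target records from the database management system;
using the parallel processor to perform the screening operation corresponding to the operation code stream with the target records as a parameter source, to obtain the Boolean value corresponding to each target record;
setting the target records whose Boolean value is true as the query result conforming to the where clause;
judging whether all the records in the database management system have been read;
if not all the records in the database management system have been read, performing again the operation of reading a plurality of target records from the database management system.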
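Optionally, using the parallel processor to perform the screening operation corresponding to the operation code stream with the target record as a parameter source, to obtain the Boolean value corresponding to the target record, includes:
reading the current operation code from the operation code stream;
determining the operation code type of the current operation code;
if the current operation code is a calculation operation code, reading the target parameters corresponding to the current operation code from the parameter source, and performing the calculation of the function corresponding to the current operation code on the target parameters to obtain an operation result, wherein the parameter source includes the target record, preset constants, and a data stack, and the data stack is used to store the operation results of the calculation operation codes and logical operation codes in the operation code stream;
if the current operation code is a logical operation code, reading the Boolean operation results from the data stack, and performing the logical operation corresponding to the logical operation code on the Boolean operation results to obtain an operation result;
storing the operation result in the data stack;
judging whether all the operation codes in the operation code stream have been read;
if all the operation codes in the operation code stream have been read, taking the operation result of the most recent logical operation as the Boolean value corresponding to the target record;
if not all the operation codes in the operation code stream have been read, performing again the operation of reading the current operation code from the operation code stream.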
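Optionally, reading the target parameters corresponding to the current operation code from the parameter source includes:
determining the number of parameters and the parameter offset addresses according to the current operation code;
reading the target parameters corresponding to the current operation code from the parameter source according to the number of parameters and the parameter offset addresses.
Optionally, before the logical operation result is stored in the data stack, the method further includes:
judging whether the logical operation result is true;
if the logical operation result is true, performing the operation of storing the logical operation result in the data stack;
if the logical operation result is not true, stopping the screening operation on the target record, and determining that the target record does not conform to the where clause.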
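Optionally, controlling the CPU to generate the operation code stream of the data structure according to the node information includes:
controlling the CPU to determine, according to the node information, the parameter information of the function corresponding to each node in the data structure, wherein the parameter information includes a parameter source, a parameter type, and a parameter size;
controlling the CPU to generate the operation code corresponding to each node according to the parameter information, and collecting all the operation codes to obtain the operation code stream.
Optionally, controlling the CPU to determine, according to the node information, the parameter information of the function corresponding to each node in the data structure includes:
controlling the CPU to perform a first operation, a second operation, and a third operation according to the node information, to obtain the parameter information of the function corresponding to each node in the data structure;
wherein the first operation is: determining the parameter source of a node according to the node position of the node in the data structure; the second operation is: determining the data type of the operand according to the function name corresponding to the node in the data structure, and determining the parameter type of the node according to the data type of the operand; and the third operation is: determining the parameter size according to the parameter type of the node.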
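The present application also provides a data query system applied to a heterogeneous acceleration platform, the heterogeneous acceleration platform including a CPU and a parallel processor, the data query system including:
a function realization module, configured to determine the operators in a database management system and implement the functions corresponding to the operators in the parallel processor;
a conversion module, configured to, if an SQL query statement is received, use the CPU to convert the where clause in the SQL query statement into a data structure including a binary tree and a linked list, wherein each node in the data structure corresponds to one operator in the where clause;
a code stream generation module, configured to control the CPU to generate an operation code stream of the data structure according to node information, wherein the node information includes the node position and the corresponding function name of each node in the data structure;
a screening module, configured to use the parallel processor to perform, on the records in the database management system, the screening operation corresponding to the operation code stream, to obtain a query result conforming to the where clause.
The present application also provides a storage medium on which a computer program is stored, the computer program implementing the steps of the above data query method when executed.
The present application also provides a heterogeneous acceleration platform, including a memory, a CPU, and a parallel processor, wherein a computer program is stored in the memory, and the steps of the above data query method are implemented when the CPU and the parallel processor call the computer program in the memory.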
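The present application provides a data query method applied to a heterogeneous acceleration platform that includes a CPU and a parallel processor. The data query method includes: determining the operators in a database management system, and implementing the functions corresponding to the operators in the parallel processor; if an SQL query statement is received, using the CPU to convert the where clause in the SQL query statement into a data structure including a binary tree and a linked list, wherein each node in the data structure corresponds to one operator in the where clause; controlling the CPU to generate an operation code stream of the data structure according to node information, wherein the node information includes the node position and the corresponding function name of each node in the data structure; and using the parallel processor to perform, on the records in the database management system, the screening operation corresponding to the operation code stream, to obtain a query result conforming to the where clause.
The present application implements the functions corresponding to the operators of the database management system in the parallel processor in advance; when an SQL query statement is received, the where clause in the SQL query statement is converted into a data structure including a binary tree and a linked list, and that data structure is converted into an operation code stream that the parallel processor can recognize. The operation code stream is generated according to the node position and the corresponding function name of each node in the data structure, so the parallel processor can use the operation code stream to perform the screening operation of the where clause and obtain, from the database management system, the query result conforming to the where clause. Because the present application implements functions in the parallel processor at the granularity of individual functions, the parallel processor can combine the functions it has implemented to complete the screening operation for any kind of where clause and is no longer restricted by a fixed template. The present application therefore enables the heterogeneous acceleration platform to support any type of where clause query and broadens its applicable scope. The present application also provides a data query system, a storage medium, and a heterogeneous acceleration platform, which have the above beneficial effects and are not described again here.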
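Brief Description of the Drawings
To explain the embodiments of the present application more clearly, the drawings used in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a data query method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a binary tree and linked list data structure provided by an embodiment of the present application;
FIG. 3 is a flowchart of parallel data querying by a parallel processor provided by an embodiment of the present application;
FIG. 4 is a flowchart of a method by which a parallel processor performs the screening operation, provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the overall structure of an operation code stream provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a code stream header provided by an embodiment of the present application;
FIG. 7 is a schematic structural diagram of an operation code header provided by an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a parameter information code stream provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of an FPGA workflow provided by an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a data query system provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a heterogeneous acceleration platform provided by an embodiment of the present application;
FIG. 12 is a schematic structural diagram of a storage medium provided by an embodiment of the present application.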
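Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a flowchart of a data query method provided by an embodiment of the present application. The specific steps may include:
S101: determine the operators in the database management system, and implement the functions corresponding to the operators in the parallel processor;
This embodiment can be applied to a heterogeneous acceleration platform that includes a CPU (Central Processing Unit) and a parallel processor. The parallel processor may be a processing chip with parallel processing capability, such as an FPGA (Field Programmable Gate Array) or a GPU (Graphics Processing Unit).
The database management system in this step may be PostgreSQL (an open-source client/server relational database management system), MySQL, Oracle, and so on; the specific type of database management system is not limited here. This step determines the operators in the database management system. Each operator has a corresponding function, and the function corresponding to each operator is a minimum functional unit. In this step, a functional unit corresponding to each operator is built in the parallel processor, where the functional unit performs the operation of the function corresponding to the operator. By implementing the function of each operator in the parallel processor, the present application enables the parallel processor to realize all the minimum functional units. The database management system can store multiple records (tuples).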
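S102: if an SQL query statement is received, use the CPU to convert the where clause in the SQL query statement into a data structure including a binary tree and a linked list;
When a user needs to query specific data in the database management system, the user typically first inputs an SQL query statement to the heterogeneous acceleration platform, and the SQL query statement includes a where clause. In this step, the CPU can be used to convert the where clause in the SQL query statement into a data structure including a binary tree and a linked list.
Referring to FIG. 2, FIG. 2 is a schematic diagram of a binary tree and linked list data structure provided by an embodiment of the present application. Suppose the where clause is: l_discount>0.07 and l_quantity<24 and ((l_discount+l_extendedprice)+(l_discount*l_extendedprice)>0.1 or l_quantity>12). This is a where clause expression with arithmetic operations, comparisons, logical operations, and parenthesized precedence; the data structure the CPU obtains by converting it is shown in FIG. 2. In FIG. 2, float_gt denotes floating-point greater-than, float_lt denotes floating-point less-than, float_add denotes floating-point addition, float_mul denotes floating-point multiplication, and denotes logical AND, or denotes logical OR, l_discount denotes the discount, l_quantity the quantity, and l_extendedprice the total price. Each node in the data structure corresponds to one operator in the where clause; in FIG. 2, float_gt, float_lt, float_add, float_mul, and, and or each represent a node. Specifically, each circular node in FIG. 2 corresponds to an arithmetic or comparison operator whose function has been implemented in advance in the FPGA as a functional unit, and the function can be invoked by its function ID. The and and or nodes represent logical operators, which the FPGA supports natively.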
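S103: control the CPU to generate the operation code stream of the data structure according to node information;
After the where clause has been converted into the binary tree + linked list data structure, the CPU can be controlled to generate the operation code stream of the data structure according to the node information. The operation code stream contains, in a form the parallel processor can recognize, the execution order of the functions and the parameter source, parameter type, and parameter size of each function.
Specifically, the CPU can be controlled to determine, according to the node information, the parameter information of the function corresponding to each node in the data structure, where the parameter information includes the parameter source, parameter type, and parameter size; the CPU is then controlled to generate the operation code corresponding to each node according to the parameter information, and all the operation codes are collected to obtain the operation code stream.
Optionally, the node information includes the node position of each node in the data structure and the function name corresponding to each node. Accordingly, the CPU can be controlled to perform a first operation, a second operation, and a third operation according to the node information, to obtain the parameter information of the function corresponding to each node. The first operation is: determine the parameter source of a node according to its node position in the data structure. Specifically, the data source of a leaf node is the data of a record in the database management system or a preset constant, while the data source of a non-leaf node includes at least the operation results of other operation codes stored in the data stack. The second operation is: determine the data type of the operand according to the function name corresponding to the node, and determine the parameter type of the node according to the data type of the operand. The third operation is: determine the parameter size according to the parameter type of the node.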
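S104: use the parallel processor to perform, on the records in the database management system, the screening operation corresponding to the operation code stream, to obtain the query result conforming to the where clause.
Once the operation code stream has been obtained, this embodiment can use the parallel processor to perform the screening operation corresponding to the operation code stream on multiple records of the database management system in parallel, so as to determine the query result conforming to the where clause. Because the parallel processor has implemented the functions corresponding to the operators of the database management system in advance, after receiving the operation code stream it can perform the corresponding calculation and logical operations according to the function execution order in the stream and the parameter source of each function, and judge from the operation results whether each record in the database management system meets the requirements of the where clause.
This embodiment implements the functions corresponding to the operators of the database management system in the parallel processor in advance; when an SQL query statement is received, the where clause in the SQL query statement is converted into a data structure including a binary tree and a linked list, and that data structure is converted into an operation code stream that the parallel processor can recognize. The operation code stream is generated according to the node position and the corresponding function name of each node in the data structure, so the parallel processor can use it to perform the screening operation of the where clause and obtain the query result conforming to the where clause from the database management system. Because this embodiment implements functions in the parallel processor at the granularity of individual functions, the parallel processor can combine the functions it has implemented to complete the screening operation for any kind of where clause and is no longer restricted by a fixed template. This embodiment therefore enables the heterogeneous acceleration platform to support any type of where clause query and broadens its applicable scope.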
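Referring to FIG. 3, FIG. 3 is a flowchart of parallel data querying by a parallel processor provided by an embodiment of the present application. This embodiment further describes S104 of the embodiment corresponding to FIG. 1 and can be combined with that embodiment to obtain a further implementation. This embodiment may include the following steps:
S301: read a plurality of target records from the database management system;
S302: use the parallel processor to perform the screening operation corresponding to the operation code stream with the target records as a parameter source, to obtain the Boolean value corresponding to each target record;
S303: set the target records whose Boolean value is true as the query result conforming to the where clause;
S304: judge whether all the records in the database management system have been read; if all the records have been read, end the process; if not, go to S301.
The maximum parallelism of the parallel processor can be set according to its parameters, and a corresponding number of records is then read as the target records. After the screening operation corresponding to the operation code stream is performed, a Boolean value is obtained for each target record; a true Boolean value means that the target record is set as a query result conforming to the where clause. After the target records have been screened, it is judged whether all the records in the database management system have been read; if not, the operations S301 to S304 are performed again.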
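The S301 to S304 loop can be sketched as batch processing over a record source. In this minimal illustration the batch is screened sequentially as a stand-in for the parallel processor, and `filter_in_batches`, `predicate`, and `max_parallel` are hypothetical names.

```python
from itertools import islice

def filter_in_batches(records, predicate, max_parallel):
    """Read up to max_parallel records at a time and keep those whose qual is true."""
    source = iter(records)
    results = []
    while True:
        batch = list(islice(source, max_parallel))    # S301: read target records
        if not batch:                                 # S304: all records have been read
            return results
        quals = [predicate(record) for record in batch]   # S302: one Boolean per record
        results.extend(r for r, ok in zip(batch, quals) if ok)  # S303: keep true ones

# Ten integer "records", four per batch, keeping the even ones.
print(filter_in_batches(range(10), lambda x: x % 2 == 0, 4))
```

On real hardware the list comprehension in S302 would be replaced by dispatching the whole batch to the parallel processor, since the records do not depend on one another.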
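Referring to FIG. 4, FIG. 4 is a flowchart of a method by which a parallel processor performs the screening operation, provided by an embodiment of the present application. As shown in FIG. 4, the parallel processor can complete the screening operation corresponding to the operation code stream by performing the following steps:
S401: read the current operation code from the operation code stream;
S402: determine the operation code type of the current operation code;
S403: perform the operation corresponding to the operation code type to obtain an operation result;
Specifically, the operation codes corresponding to functions include calculation operation codes and logical operation codes.
If the current operation code is a calculation operation code, the following is performed: read the target parameters corresponding to the current operation code from the parameter source, and perform the calculation of the function corresponding to the current operation code on the target parameters to obtain an operation result; the parameter source includes the target record, preset constants, and a data stack, and the data stack is used to store the operation results of the calculation operation codes and logical operation codes in the operation code stream;
If the current operation code is a logical operation code, the following is performed: read the Boolean operation results from the data stack, and perform the logical operation corresponding to the logical operation code on them to obtain an operation result;
S404: store the operation result in the data stack;
S405: judge whether all the operation codes in the operation code stream have been read; if so, go to S406; if not, go to S401.
S406: take the operation result of the most recent logical operation as the Boolean value corresponding to the target record.
Specifically, this embodiment can read the target parameters corresponding to the current operation code as follows: determine the number of parameters and the parameter offset addresses according to the current operation code; then read the target parameters from the parameter source according to the number of parameters and the parameter offset addresses.
Specifically, before the logical operation result is stored in the data stack, it can also be judged whether the logical operation result is true; if it is true, the logical operation result is stored in the data stack; if it is not true, the screening operation on the target record is stopped and the target record is determined not to conform to the where clause. For example, suppose the where clause is: A+B>10 and A×B>10. According to the function execution order of the corresponding operation code stream, it is first computed whether A+B>10 is true and then whether A×B>10 is true; if both results are true, the record meets the filter condition of the where clause. When A+B>10 is found to be false, the screening operation can exit immediately without continuing, which improves the efficiency of data screening.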
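The early exit in the A+B>10 and A×B>10 example can be sketched as follows. This illustration assumes the where clause is a plain conjunction of conditions, which is the case where stopping at the first false result is clearly safe; the function name and the returned evaluation count are hypothetical additions for demonstration.

```python
def passes_conjunction(predicates, record):
    """Evaluate the conditions of an AND chain in order, stopping at the first false one.

    Returns the final Boolean and the number of conditions actually evaluated.
    """
    evaluated = 0
    for predicate in predicates:
        evaluated += 1
        if not predicate(record):
            return False, evaluated   # exit early: the record cannot match
    return True, evaluated

# where A+B>10 and A*B>10
CONDITIONS = [
    lambda r: r["A"] + r["B"] > 10,
    lambda r: r["A"] * r["B"] > 10,
]
print(passes_conjunction(CONDITIONS, {"A": 1, "B": 2}))
print(passes_conjunction(CONDITIONS, {"A": 6, "B": 5}))
```

For the first record only one condition is evaluated, which is the saving the text describes; clauses that mix or and not would need a more careful rule before exiting.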
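As a general-purpose processor, the CPU is good at management and scheduling, while an FPGA can be used to design dedicated circuits for high-performance parallel computing. The records processed in database workloads do not depend on one another, which makes them well suited to parallel processing. A large body of theoretical research and simulation data indicates that processing database data in a CPU-FPGA heterogeneous environment can greatly improve the overall performance of a database system. To further improve the overall performance of database servers, using CPU-FPGA heterogeneous platforms to query data has become a mainstream trend. However, flexible support for different SQL statements is the key to moving CPU-FPGA heterogeneous database acceleration from theoretical research to practical application, and handling the where clause of SQL statements is one of the difficulties to be overcome.
In existing CPU-FPGA heterogeneous acceleration development for database software, many restrictions on application scenarios and development modes have been found, along with difficulty matching the database software. Specifically: (1) in the related art, the constraint condition format can only be hardcoded into the IP core, so only SQL statements of a fixed template can be processed and arbitrary input SQL statements cannot be recognized dynamically; (2) the related art can only handle comparison and logical operations, not addition, subtraction, multiplication, division, or complex parenthesized operations; (3) in the related art, the data type of the constraint conditions is fixed to int, and their number is limited.
To overcome the above defects of the related art, the present application provides a solution for processing the PostgreSQL where clause on a CPU-FPGA heterogeneous platform. Based on the PostgreSQL database software, the solution extends it with CPU-FPGA heterogeneous acceleration and proposes a scheme for parsing the where clause of SQL statements in the FPGA, broadening the scenarios and range of SQL statements the FPGA can handle. The specific implementation of this embodiment is as follows: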
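Step A: pre-implement in the FPGA the function corresponding to each operator in PostgreSQL.
In PostgreSQL, every arithmetic and comparison operation on each data type has a corresponding function ID; this embodiment can implement the functions corresponding to all function IDs in the FPGA in advance.
Referring to Table 1, which shows the function information for the floating-point data type: PostgreSQL has the function ID correspondence shown in Table 1, and this embodiment can implement the functions corresponding to the following function IDs in the FPGA in advance.
Table 1 Function information for the floating-point data type
Function name  Function ID  Function                                Operator
float8eq       293          floating-point equal                    =
float8ne       294          floating-point not equal                !=
float8lt       295          floating-point less than                <
float8le       296          floating-point less than or equal       <=
float8gt       297          floating-point greater than             >
float8ge       298          floating-point greater than or equal    >=
float8mul      216          floating-point multiplication           *
float8div      217          floating-point division                 /
float8pl       218          floating-point addition                 +
float8mi       219          floating-point subtraction              -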
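Table 1 is in effect a dispatch table from function ID to functional unit. The sketch below expresses that idea in software, with Python's `operator` functions standing in for the FPGA units; the names `FUNCTION_UNITS` and `call_unit` are hypothetical.

```python
import operator

# Function ID -> (function name, stand-in implementation), following Table 1.
FUNCTION_UNITS = {
    293: ("float8eq", operator.eq),
    294: ("float8ne", operator.ne),
    295: ("float8lt", operator.lt),
    296: ("float8le", operator.le),
    297: ("float8gt", operator.gt),
    298: ("float8ge", operator.ge),
    216: ("float8mul", operator.mul),
    217: ("float8div", operator.truediv),
    218: ("float8pl", operator.add),
    219: ("float8mi", operator.sub),
}

def call_unit(func_id, left, right):
    """Look up the unit by function ID and apply it to two float operands."""
    _name, unit = FUNCTION_UNITS[func_id]
    return unit(float(left), float(right))

print(call_unit(297, 0.09, 0.07))   # float8gt
print(call_unit(218, 1.5, 2.0))     # float8pl
```

Indexing by function ID means the opcode stream only needs to carry an integer per operation, and supporting a new operator is a matter of adding one entry.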
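Step B: convert the where clause into the binary tree + linked list data structure on the CPU, and generate the operation code stream.
This step traverses PostgreSQL's parsing result for the user's SQL statement. During the traversal, the source, type, and size of each parameter are derived from the position of each node and its function ID, and the information of each node is formed into an operation code. Operation codes are classified as calculation operation codes or logical operation codes according to the function ID executed, and PostgreSQL's original parsing result for the SQL statement is finally converted into an operation code stream that the FPGA can process.
For an SQL statement entered by the user, PostgreSQL performs lexical and syntactic parsing, handles the operators, constant expressions, and parenthesized precedence in the statement, and converts the where clause into a binary tree + linked list data structure in which each node corresponds to one operator entered by the user. When executing on the CPU, each node is traversed and the corresponding operator function is executed, finally yielding a Boolean value that expresses whether the current record meets the filter condition of the where clause.
Before the CPU starts the query scan, the binary tree + linked list generated by PostgreSQL is traversed and the function ID of each node is recorded. The source, type, and size of the function parameters are derived from the position of the node and its function ID to form an operation code data structure, and the operation codes of all nodes are finally collected into an operation code stream with a fixed format. Referring to FIG. 5, FIG. 5 is a schematic diagram of the overall structure of an operation code stream provided by an embodiment of the present application.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of a code stream header provided by an embodiment of the present application. The header of the code stream identifies, in a fixed format, how many operation codes there are in total and the offset and size of each operation code in the code stream. In FIG. 6, op_count indicates the total number of operation codes, op1_offset indicates the offset of operation code 1 in the code stream, op1_len indicates the length of operation code 1, opN_offset indicates the offset of the last operation code N in the code stream, and opN_len indicates the length of the last operation code N.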
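The stream header can be sketched as a count followed by one (offset, length) pair per operation code. The source does not specify field widths, byte order, or the origin of the offsets, so this illustration assumes little-endian 32-bit fields and offsets measured from the start of the stream; `build_stream` and `split_stream` are hypothetical helper names.

```python
import struct

def build_stream(opcodes):
    """Prefix already-encoded opcodes (bytes) with op_count and (offset, length) pairs."""
    header_size = 4 + 8 * len(opcodes)
    header = struct.pack("<I", len(opcodes))          # op_count
    offset = header_size
    for op in opcodes:
        header += struct.pack("<II", offset, len(op)) # opK_offset, opK_len
        offset += len(op)
    return header + b"".join(opcodes)

def split_stream(stream):
    """Use the header to slice the stream back into individual opcodes."""
    (op_count,) = struct.unpack_from("<I", stream, 0)
    opcodes = []
    for i in range(op_count):
        offset, length = struct.unpack_from("<II", stream, 4 + 8 * i)
        opcodes.append(stream[offset:offset + length])
    return opcodes

stream = build_stream([b"abc", b"defgh"])
print(split_stream(stream))
```

With the offsets and lengths in the header, the reader can locate any opcode directly even though opcodes with different parameter counts have different sizes.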
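Referring to FIG. 7, FIG. 7 is a schematic structural diagram of an operation code header provided by an embodiment of the present application. The operation code header identifies the information about the operation code, including its type, the ID of the operation function it uses, the number of parameters, and the offset and size of each parameter within the operation code. This embodiment can classify operation codes into two types according to the ID of the operation function: calculation operation codes (addition, subtraction, multiplication, division, comparison, etc.) and logical operation codes (and, or, not, etc.). In FIG. 7, type indicates the operation code type (calculation or logical), op_func indicates the function ID of the operation code, nargs indicates the number of parameters, arg1_offset indicates the offset of the first parameter relative to the start position of the operation code, arg1_len indicates the length of the first parameter, argN_offset indicates the offset of the last parameter relative to the start position of the operation code, and argN_len indicates the length of the last parameter.
Referring to FIG. 8, FIG. 8 is a schematic structural diagram of a parameter information code stream provided by an embodiment of the present application. The parameter information in the operation code identifies the parameter type, parameter size, and parameter source. Specifically, when the CPU traverses the binary tree + linked list data structure, it derives the parameter information of each function from the position of the node and the type of the called function. The parameter tag arg_tag indicates the source of the current parameter, which is one of three: (1) a column of the current tuple, in which case arg_tag=104 and the var_no field gives the column number in the tuple; (2) a constant, in which case arg_tag=105 and the arg_data field holds the constant value; (3) the stack, in which case arg_tag takes its default value. Optionally, the data type of the operand can be deduced from the function ID; since the parameter type is then known, the parameter size follows from the parameter type. In FIG. 8, arg_tag represents the parameter tag, arg_type the parameter type, arg_size the parameter size, var_no the column number in the tuple when arg_tag==104, and arg_data the fixed value entered by the user (that is, the preset constant) when arg_tag==105.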
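Step C: the FPGA screens the PostgreSQL records according to the operation code stream.
In this step, the FPGA parses the operation codes in the operation code stream one by one, obtains each parameter according to its source, type, and size, and calls the functional unit corresponding to the function ID in the operation code. Intermediate results are saved on a stack during the computation.
In heterogeneous database acceleration development, the acceleration libraries provided by FPGA vendors usually offer only general-purpose basic processing logic. When they are applied to CPU-FPGA heterogeneous acceleration of a specific database software, the overall solution has to be planned, adapted, and adjusted according to the design architecture and data structures of that database software.
Specifically, after the CPU starts the query scan, it sends the operation code stream to the FPGA. The FPGA parses each operation code in the stream and, for each PostgreSQL record (called a tuple in PostgreSQL), traverses each function ID in the operation code stream in turn and calls the corresponding functional unit already implemented in the FPGA. The result of executing the last operation code is a Boolean value qual, indicating whether the current record satisfies the filter condition of the where clause.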
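Referring to FIG. 9, FIG. 9 is a schematic diagram of an FPGA workflow provided by an embodiment of the present application, which specifically includes the following operations: obtain the number of operation codes op_count from the operation code stream, obtain the operation code addresses op_offset one by one, and determine the operation code type.
If the operation code type is the calculation operation code T_OpExpr, obtain the number of parameters nargs and obtain the parameter offsets arg_offset one by one; the arg_tag field of each parameter determines its source, so that the value of the parameter can be fetched from the indicated location. Specifically, if the value of arg_tag is T_Var(104), the value of the column with index var_no is obtained from the tuple; if the value of arg_tag is T_Const(105), the value is obtained from arg_data (that is, the preset constant); if the value of arg_tag is anything other than T_Var(104) and T_Const(105), the value is obtained from the stack. The parameter obtained from its source is copied to the parameter buffer, and its offset in the parameter buffer is recorded. This is repeated until the last parameter of the operation code has been obtained. The functional unit corresponding to op_func is then called to perform the calculation, and the operation result is pushed onto the stack.
If the operation code type is the logical operation code T_BoolExpr(113), obtain the number of parameters nargs, read nargs Boolean values (that is, Boolean operation results) from the stack, copy them to the parameter buffer, and record the offset of each parameter in the parameter buffer; then call the functional unit corresponding to op_func to calculate the operation result, store the result in the qual variable (updating qual), and push the result onto the stack.
The above operation code execution is repeated until the last operation code has been processed. After all operation codes have been processed, whether the value of qual is true determines whether the current record satisfies the filter condition: if it is true, the tuple meets the condition; if it is false, the tuple does not.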
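In the above embodiment, the function corresponding to each operator in PostgreSQL is treated as a minimum functional unit, and all minimum functional units are implemented in the FPGA in advance. The parsing result of the PostgreSQL where clause is traversed, with the function ID in PostgreSQL used as the index of the function call in the FPGA; at the same time, the parameter information of each function call is derived, forming an operation code stream that the FPGA can recognize. The FPGA parses the where clause according to the function IDs and parameter information provided in the operation code stream. Compared with the related art, this solution supports where clause parsing of SQL statements dynamically, without hardcoding an SQL statement template in the IP core; it places no limit on the number of constraint conditions; and it supports the common data types in PostgreSQL (int, float, date, timestamp, etc.) as well as arithmetic operations and expressions with parentheses. The solution of this embodiment can also be applied to other database software such as MySQL and Oracle, where the SQL statement entered by the user can be processed in a similar way. Once the handling of function calls and parameters in that software is identified, the corresponding data structure can likewise be converted into an operation code stream that the FPGA can process, extending the SQL statement support of heterogeneous database acceleration.
The above process is illustrated with the TPC-H 1GB data set: the 15th and 17th records in the data set are selected, and the data in columns 5, 6, and 7 of these two records are shown in Table 2. The data in column 5 is the quantity l_quantity, the data in column 6 is the total price l_extendedprice, and the data in column 7 is the discount l_discount.
Table 2 Record comparison table
            l_quantity  l_extendedprice  l_discount
Record 15   21          27076.98         0.09
Record 17   41          64061.68         0.04
The FPGA can parse the operation codes one by one according to the offset and length of each operation code in the code stream. Each parameter is obtained from the location indicated by the arg_tag in its parameter field, the operation function corresponding to the function ID of the operation code is called, and the result is pushed onto the stack; in this way the FPGA finally determines whether a record satisfies the filter condition.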
请参见图10,图10为本申请实施例所提供的一种数据查询系统的结构示意图。本申请实施例所提供的一种数据查询系统410,应用于异构加速平台,所述异构加速平台包括CPU和并行处理器,所述数据查询系统410包括:
功能实现模块411,用于确定数据库管理系统中的操作符,并在所述并行处理器中实现所述操作符对应函数的功能;
转换模块412,用于若接收到SQL查询语句,则利用所述CPU将所述SQL查询语句中的where子句转换为包括二叉树和链表的数据结构;其中,所述数据结构中的每一节点对应所述where子句中的一个操作符;
码流生成模块413,用于控制所述CPU根据节点信息生成所述数据结构的操作码码流;其中,所述节点信息包括所述数据结构中每一节点的节点位置和对应的函数名;
筛选模块414,用于利用所述并行处理器对所述数据库管理系统中的记录执行所述操作码码流对应的筛选操作,得到符合所述where子句的查询结果。
本实施例预先在并行处理器中实现数据库管理系统中操作符对应函数的功能,在接收到SQL查询语句时将SQL查询语句中的where子句转换为包括二叉树和链表的数据结构,并将上述数据结构转化为并行处理器能够识别的操作码码流。操作码码流根据数据结构中每一节点的节点位置和对应的函数名生成,因此并行处理器能够利用操作码码流执行where子句的筛选操作,从数据库管理系统中得到符合所述where子句的查询结果。由于本实施例以函数的功能为粒度在并行处理器中实现函数功能,并行处理器可以通过自身已实现的函数功能进行组合以完成任意种where子句对应的筛选操作,不再受到固定模板的限制,因此本实施例能够使异构加速平台支持任意类型的where子句查询,提高异构加速平台的适用范围。
可选的,筛选模块包括:
记录读取单元,用于从所述数据库管理系统中读取多条目标记录;
操作码执行单元,用于利用所述并行处理器将所述目标记录作为参数来源执行所述操作码码流对应的筛选操作,得到所述目标记录对应的布尔值;
查询结果生成单元,用于将所述布尔值为真的目标记录设置为符合所述where子句的查询结果;
判断单元,用于判断所述数据库管理系统中的记录是否均读取完毕;若所述数据库管理系统中的记录未均读取完毕,则启动记录读取单元对应的工作流程。
可选的,操作码执行单元用于从所述操作码码流中读取当前操作码;还用于确定所述当前操作码的操作码类型;还用于若所述当前操作码为计算操作码,则从所述参数来源中读取所述当前操作码对应的目标参数,并对所述目标参数执行所述当前操作码对应函数的计算操作,得到操作结果;其中,所述参数来源包括所述目标记录、预设常量和数据栈,所述数据栈用于存储所述操作码码流中计算操作码和逻辑操作码的操作结果;还用于若所述当前操作码为所述逻辑操作码,则从所述数据栈中读取布尔类型的操作结果,并对所述布尔类型的操作结果执行所述逻辑操作码对应的逻辑操作得到操作结果;还用于将所述操作结果存储至所述数据栈;还用于判断所述操作码码流中的操作码是否均读取完毕;若所述操作码码流中的操作码均读取完毕,则将最近一次逻辑操作得到的操作结果作为所述目标记录对应的布尔值;若所述操作码码流中的操作码未均读取完毕,则执行从所述操作码码流中读取当前操作码的操作。
可选的,操作码执行单元从所述参数来源中读取所述当前操作码对应的目标参数的过程包括:根据所述当前操作码确定参数数量和参数偏移地址;根据所述参数数量和所述参数偏移地址从所述参数来源中读取所述当前操作码对应的目标参数。
可选的,还包括:
逻辑操作结果分析单元,用于在将所述逻辑操作结果存储至所述数据栈之前,判断所述逻辑操作结果是否为真;若所述逻辑操作结果为真,则执行将所述逻辑操作结果存储至所述数据栈的操作;若所述逻辑操作结果不为真,则停止对所述目标记录的筛选操作,并判定所述目标记录不符合所述where子句。
可选的,码流生成模块包括:
参数信息确定单元,用于控制所述CPU根据所述节点信息确定所述数据结构中每一节点对应函数的参数信息;其中,所述参数信息包括参数来源、参数类型和参数大小;
操作码汇总单元,用于控制所述CPU根据所述参数信息生成每一节点对应的操作码,并汇总所有所述操作码得到所述操作码码流。
可选的,操作码汇总单元,用于控制所述CPU根据所述节点信息执行第一操作、第二操作和第三操作,得到所述数据结构中每一节点对应函数的参数信息;
其中,所述第一操作为:根据节点在所述数据结构中的节点位置确定节点的参数来源;所述第二操作为:根据所述数据结构中节点对应的函数名称确定操作对象的数据类型,并根据所述操作对象的数据类型确定节点的参数类型;所述第三操作为:根据节点的参数类型确定参数大小。
由于系统部分的实施例与方法部分的实施例相互对应,因此系统部分的实施例请参见方法部分的实施例的描述,这里暂不赘述。
请参见图12,图12为本申请实施例所提供的一种存储介质的结构示意图。本申请还提供了一种存储介质601,其上存有计算机程序610,该计算机程序610被执行时可以实现上述实施例所提供的步骤。该存储介质601可以包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
请参见图11,图11为本申请实施例所提供的一种异构加速平台的结构示意图。本申请还提供了一种异构加速平台501,包括存储器510、CPU和并行处理器520,所述存储器510中存储有计算机程序511,所述CPU和所述并行处理器520调用所述存储器510中的计算机程序511时实现上述数据查询方法的步骤。当然所述异构加速平台501还可以包括各种网络接口,电源等组件。
说明书中各个实施例采用递进的方式描述,每个实施例重点说明的都是与其他实施例的不同之处,各个实施例之间相同相似部分互相参见即可。对于实施例公开的系统而言,由于其与实施例公开的方法相对应,所以描述的比较简单,相关之处参见方法部分说明即可。应当指出,对于本技术领域的普通技术人员来说,在不脱离本申请原理的前提下,还可以对本申请进行若干改进和修饰,这些改进和修饰也落入本申请权利要求的保护范围内。
还需要说明的是,在本说明书中,诸如第一和第二等之类的关系术语仅仅用来将一个实体或者操作与另一个实体或操作区分开来,而不一定要求或者暗示这些实体或操作之间存在任何这种实际的关系或者顺序。而且,术语“包括”、“包含”或者其任何其他变体意在涵盖非排他性的包含,从而使得包括一系列要素的过程、方法、物品或者设备不仅包括那些要素,而且还包括没有明确列出的其他要素,或者是还包括为这种过程、方法、物品或者设备所固有的要素。在没有更多限制的状况下,由语句“包括一个……”限定的要素,并不排除在包括所述要素的过程、方法、物品或者设备中还存在另外的相同要素。

Claims (10)

  1. 一种数据查询方法,其特征在于,应用于异构加速平台,所述异构加速平台包括CPU和并行处理器,所述数据查询方法包括:
    确定数据库管理系统中的操作符,并在所述并行处理器中实现所述操作符对应函数的功能;
    若接收到SQL查询语句,则利用所述CPU将所述SQL查询语句中的where子句转换为包括二叉树和链表的数据结构;其中,所述数据结构中的每一节点对应所述where子句中的一个操作符;
    控制所述CPU根据节点信息生成所述数据结构的操作码码流;其中,所述节点信息包括所述数据结构中每一节点的节点位置和对应的函数名;
    利用所述并行处理器对所述数据库管理系统中的记录执行所述操作码码流对应的筛选操作,得到符合所述where子句的查询结果。
  2. 根据权利要求1所述数据查询方法,其特征在于,利用所述并行处理器对所述数据库管理系统中的记录执行所述操作码码流对应的筛选操作,得到符合所述where子句的查询结果,包括:
    从所述数据库管理系统中读取多条目标记录;
    利用所述并行处理器将所述目标记录作为参数来源执行所述操作码码流对应的筛选操作,得到所述目标记录对应的布尔值;
    将所述布尔值为真的目标记录设置为符合所述where子句的查询结果;
    判断所述数据库管理系统中的记录是否均读取完毕;
    若所述数据库管理系统中的记录未均读取完毕,则执行从所述数据库管理系统中读取多条目标记录的操作。
  3. 根据权利要求2所述数据查询方法,其特征在于,利用所述并行处理器将所述目标记录作为参数来源执行所述操作码码流对应的筛选操作,得到所述目标记录对应的布尔值,包括:
    从所述操作码码流中读取当前操作码;
    确定所述当前操作码的操作码类型;
    若所述当前操作码为计算操作码,则从所述参数来源中读取所述当前操作码对应的目标参数,并对所述目标参数执行所述当前操作码对应函数的计算操作,得到操作结果;其中,所述参数来源包括所述目标记录、预设常量和数据栈,所述数据栈用于存储所述操作码码流中计算操作码和逻辑操作码的操作结果;
    若所述当前操作码为所述逻辑操作码,则从所述数据栈中读取布尔类型的操作结果,并对所述布尔类型的操作结果执行所述逻辑操作码对应的逻辑操作得到操作结果;
    将所述操作结果存储至所述数据栈;
    判断所述操作码码流中的操作码是否均读取完毕;
    若所述操作码码流中的操作码均读取完毕,则将最近一次逻辑操作得到的操作结果作为所述目标记录对应的布尔值;
    若所述操作码码流中的操作码未均读取完毕,则执行从所述操作码码流中读取当前操作码的操作。
  4. 根据权利要求3所述数据查询方法,其特征在于,从所述参数来源中读取所述当前操作码对应的目标参数,包括:
    根据所述当前操作码确定参数数量和参数偏移地址;
    根据所述参数数量和所述参数偏移地址从所述参数来源中读取所述当前操作码对应的目标参数。
  5. 根据权利要求3所述数据查询方法,其特征在于,在将所述逻辑操作结果存储至所述数据栈之前,还包括:
    判断所述逻辑操作结果是否为真;
    若所述逻辑操作结果为真,则执行将所述逻辑操作结果存储至所述数据栈的操作;
    若所述逻辑操作结果不为真,则停止对所述目标记录的筛选操作,并判定所述目标记录不符合所述where子句。
  6. 根据权利要求1至5任一项所述数据查询方法,其特征在于,控制所述CPU根据节点信息生成所述数据结构的操作码码流,包括:
    控制所述CPU根据所述节点信息确定所述数据结构中每一节点对应函数的参数信息;其中,所述参数信息包括参数来源、参数类型和参数大小;
    控制所述CPU根据所述参数信息生成每一节点对应的操作码,并汇总所有所述操作码得到所述操作码码流。
  7. 根据权利要求6所述数据查询方法,其特征在于,控制所述CPU根据所述节点信息确定所述数据结构中每一节点对应函数的参数信息,包括:
    控制所述CPU根据所述节点信息执行第一操作、第二操作和第三操作,得到所述数据结构中每一节点对应函数的参数信息;
    其中,所述第一操作为:根据节点在所述数据结构中的节点位置确定节点的参数来源;所述第二操作为:根据所述数据结构中节点对应的函数名称确定操作对象的数据类型,并根据所述操作对象的数据类型确定节点的参数类型;所述第三操作为:根据节点的参数类型确定参数大小。
  8. 一种数据查询系统,其特征在于,应用于异构加速平台,所述异构加速平台包括CPU和并行处理器,所述数据查询系统包括:
    功能实现模块,用于确定数据库管理系统中的操作符,并在所述并行处理器中实现所述操作符对应函数的功能;
    转换模块,用于若接收到SQL查询语句,则利用所述CPU将所述SQL查询语句中的where子句转换为包括二叉树和链表的数据结构;其中,所述数据结构中的每一节点对应所述where子句中的一个操作符;
    码流生成模块,用于控制所述CPU根据节点信息生成所述数据结构的操作码码流;其中,所述节点信息包括所述数据结构中每一节点的节点位置和对应的函数名;
    筛选模块,用于利用所述并行处理器对所述数据库管理系统中的记录执行所述操作码码流对应的筛选操作,得到符合所述where子句的查询结果。
  9. 一种异构加速平台,其特征在于,包括存储器、CPU和并行处理器,所述存储器中存储有计算机程序,所述CPU和所述并行处理器调用所述存储器中的计算机程序时实现如权利要求1至7任一项所述数据查询方法的步骤。
  10. 一种存储介质,其特征在于,所述存储介质中存储有计算机可执行指令,所述计算机可执行指令被处理器加载并执行时,实现如权利要求1至7任一项所述数据查询方法的步骤。
PCT/CN2022/089912 2021-10-13 2022-04-28 一种数据查询方法、系统、异构加速平台及存储介质 WO2023060878A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/279,346 US11893011B1 (en) 2021-10-13 2022-04-28 Data query method and system, heterogeneous acceleration platform, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111190053.0 2021-10-13
CN202111190053.0A CN113641701B (zh) 2021-10-13 2021-10-13 一种数据查询方法、系统、异构加速平台及存储介质

Publications (1)

Publication Number Publication Date
WO2023060878A1 true WO2023060878A1 (zh) 2023-04-20

Family

ID=78426562

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/089912 WO2023060878A1 (zh) 2021-10-13 2022-04-28 一种数据查询方法、系统、异构加速平台及存储介质

Country Status (3)

Country Link
US (1) US11893011B1 (zh)
CN (1) CN113641701B (zh)
WO (1) WO2023060878A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116627892A (zh) * 2023-05-31 2023-08-22 中国人民解放军国防科技大学 一种数据近存储计算方法、装置和存储介质

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
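CN113641701B (zh) * 2021-10-13 2022-02-18 苏州浪潮智能科技有限公司 Data query method and system, heterogeneous acceleration platform, and storage medium
CN114328595B (zh) * 2021-11-30 2024-01-09 苏州浪潮智能科技有限公司 Data query method and apparatus, electronic device, and storage medium
CN114647635B (zh) * 2022-03-31 2024-01-23 苏州浪潮智能科技有限公司 Data processing system
CN116432185B (zh) * 2022-12-30 2024-03-26 支付宝(杭州)信息技术有限公司 Anomaly detection method and apparatus, readable storage medium, and electronic device
CN116644090B (zh) * 2023-07-27 2023-11-10 天津神舟通用数据技术有限公司 Data query method, apparatus, device, and medium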

Citations (7)

Publication number Priority date Publication date Assignee Title
CN103678621A (zh) * 2013-12-18 2014-03-26 上海达梦数据库有限公司 SQL statement optimization method based on constant substitution
US20170068820A1 (en) * 2014-01-20 2017-03-09 Prevoty, Inc. Systems and methods for sql value evaluation to detect evaluation flaws
CN110515973A (zh) * 2019-08-30 2019-11-29 上海达梦数据库有限公司 Data query optimization method, apparatus, device, and storage medium
CN110990423A (zh) * 2019-12-12 2020-04-10 上海达梦数据库有限公司 SQL statement execution method, apparatus, device, and storage medium
CN111241130A (zh) * 2019-12-29 2020-06-05 航天信息股份有限公司 Method and system for extending syntax statements based on the SQL language
CN113467905A (zh) * 2021-06-10 2021-10-01 浪潮(北京)电子信息产业有限公司 Task processing method and system
CN113641701A (zh) * 2021-10-13 2021-11-12 苏州浪潮智能科技有限公司 Data query method and system, heterogeneous acceleration platform, and storage medium

Family Cites Families (20)

Publication number Priority date Publication date Assignee Title
JPH03268058A (ja) * 1990-03-19 1991-11-28 Hitachi Ltd Document creation method and apparatus therefor
US6763352B2 (en) * 1999-05-21 2004-07-13 International Business Machines Corporation Incremental maintenance of summary tables with complex grouping expressions
US20020123984A1 (en) * 1999-08-23 2002-09-05 Naveen Prakash Dynamic query of server applications
US6493701B2 (en) * 2000-11-22 2002-12-10 Sybase, Inc. Database system with methodogy providing faster N-ary nested loop joins
CA2327167C (en) * 2000-11-30 2007-10-16 Ibm Canada Limited-Ibm Canada Limitee Method and system for composing a query for a database and traversing the database
US7039702B1 (en) * 2002-04-26 2006-05-02 Mcafee, Inc. Network analyzer engine system and method
US20070290901A1 (en) * 2004-06-02 2007-12-20 Koninklijke Philips Electronics, N.V. Encoding and Decoding Apparatus and Corresponding Methods
US8126870B2 (en) * 2005-03-28 2012-02-28 Sybase, Inc. System and methodology for parallel query optimization using semantic-based partitioning
US20060236254A1 (en) * 2005-04-18 2006-10-19 Daniel Mateescu System and method for automated building of component based applications for visualizing complex data structures
JP2011172005A (ja) * 2010-02-18 2011-09-01 Toshiba Corp Disclaimer screen display device and disclaimer screen display method
US9031975B2 (en) * 2012-11-06 2015-05-12 Rockwell Automation Technologies, Inc. Content management
US11099841B2 (en) * 2014-06-26 2021-08-24 Sap Se Annotations for parallelization of user-defined functions with flexible partitioning
US9454571B2 (en) * 2014-06-26 2016-09-27 Sap Se Optimization of parallelization of user-defined functions with flexible partitioning
US10545974B2 (en) * 2016-07-18 2020-01-28 Sap Se Hierarchical window database query execution
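WO2018209558A1 (zh) * 2017-05-16 2018-11-22 深圳市创客工场科技有限公司 Method and apparatus for converting building-block programming into program code
KR101919771B1 (ko) * 2017-06-12 2019-02-11 주식회사 티맥스데이터 Optimization technique for database applications
CN108804554B (zh) * 2018-05-22 2021-03-05 上海达梦数据库有限公司 Database query method and apparatus, server, and storage medium
CN110858202A (zh) * 2018-08-21 2020-03-03 北京京东尚科信息技术有限公司 Method and apparatus for generating a where clause in a database query statement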
US10915418B1 (en) * 2019-08-29 2021-02-09 Snowflake Inc. Automated query retry in a database environment
US11580102B2 (en) * 2020-04-02 2023-02-14 Ocient Holdings LLC Implementing linear algebra functions via decentralized execution of query operator flows

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN116627892A (zh) * 2023-05-31 2023-08-22 中国人民解放军国防科技大学 Data near-storage computing method, apparatus, and storage medium
CN116627892B (zh) * 2023-05-31 2024-05-07 中国人民解放军国防科技大学 Data near-storage computing method, apparatus, and storage medium

Also Published As

Publication number Publication date
CN113641701A (zh) 2021-11-12
CN113641701B (zh) 2022-02-18
US11893011B1 (en) 2024-02-06
US20240045860A1 (en) 2024-02-08

Similar Documents

Publication Publication Date Title
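WO2023060878A1 (zh) Data query method and system, heterogeneous acceleration platform, and storage medium
US8326821B2 (en) Transforming relational queries into stream processing
CN109491989B (zh) Data processing method and apparatus, electronic device, and storage medium
US8417690B2 (en) Automatically avoiding unconstrained cartesian product joins
US9930113B2 (en) Data retrieval via a telecommunication network
CN111125440B (zh) Monad-based persistence-layer compound-condition query method and storage medium
AU2017254893A1 (en) Adapting database queries for data virtualization over combined database stores
US20230101890A1 (en) Object Scriptability
US11132363B2 (en) Distributed computing framework and distributed computing method
CN114036183A (zh) Data ETL processing method, apparatus, device, and medium
CN113010502B (zh) Data quality audit method, apparatus, device, and storage medium
CN112667598B (zh) System for rapidly building data models based on changing business requirements
CN113111239A (zh) General-purpose database operation method and apparatus, and storage medium therefor
CN111125216A (zh) Method and apparatus for importing data into Phoenix
WO2024016594A1 (zh) Pseudo-column implementation method and apparatus, electronic device, and storage medium
CN114281842A (zh) Method and device for querying sharded database tables
Haritsa Robust query processing: Mission possible
CN115857918A (zh) Data processing method and apparatus, electronic device, and storage medium
CN114385145A (zh) Web system back-end architecture design method and computer device
CN114547083A (zh) Data processing method and apparatus, and electronic device
EP2990960A1 (en) Data retrieval via a telecommunication network
CN111143398A (zh) Method and apparatus for querying very large sets based on extended SQL functions
US20230074230A1 (en) Automatic generation of exporter configuration rules
WO2020238597A1 (zh) Hadoop-based data update method, apparatus, system, and medium
CN111897772B (zh) Large-file data import method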

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 22879828; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase
Ref document number: 18279346; Country of ref document: US