WO2022052639A1 - Data query method and apparatus - Google Patents

Data query method and apparatus

Info

Publication number
WO2022052639A1
Authority
WO
WIPO (PCT)
Prior art keywords
sql statement
chinese
statement
sql
query
Application number
PCT/CN2021/107464
Other languages
English (en)
French (fr)
Inventor
焦阳
王宁
Original Assignee
北京达佳互联信息技术有限公司
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2022052639A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24534 Query rewriting; Transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation

Definitions

  • the present disclosure relates to the field of computer technology, and in particular, to a data query method, a data query device and a storage medium.
  • the first approach is to introduce the Elasticsearch (ES) search engine as the core of the system's search capability and to build the multiple business models stored in the database into a system-wide object in the search engine, so that the database focuses on data storage while the search engine focuses on data search.
  • Apache Calcite supports standard Structured Query Language (SQL), provides multiple query optimizations, and can connect to various data sources.
  • the present disclosure provides a data query method, a data query device and a storage medium.
  • the technical solutions provided by the embodiments of the present disclosure are as follows:
  • a data query method, comprising: acquiring a Structured Query Language (SQL) statement based on a data query request; acquiring an abstract syntax tree based on the SQL statement, where the abstract syntax tree includes rule nodes, and the rule nodes are used to characterize the grammar rules followed by the SQL statement; obtaining an Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and sending a data query result, where the data query result is the result obtained by querying the ES search engine based on the ES query statement.
  • the method further includes: obtaining search keywords; obtaining a keyword value set based on the search keywords, where the keyword value set is the result obtained by querying the ES search engine based on the search keywords; sending the keyword value set; and receiving the data query request returned based on the keyword value set.
  • acquiring search keywords includes: based on a visual query request, sending query recommendation information, where the query recommendation information at least includes a set of candidate search keywords; and receiving search keywords returned based on the query recommendation information.
  • acquiring the SQL statement based on the data query request includes: acquiring the Chinese SQL statement carried in the data query request; and mapping the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping relationship between Chinese SQL statements and SQL statements.
  • the method further includes: performing syntax verification on the Chinese SQL statement based on SQL grammar rules; performing value verification on the Chinese SQL statement based on value verification rules; and, in the case that the Chinese SQL statement passes the syntax verification and the value verification, performing the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
  • mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement includes three cases. In the case that the Chinese SQL statement does not contain a first type of keyword, the SQL statement containing the first type of keyword is generated based on the digital dictionary, where the first type of keyword is used to represent data sorting conditions. In the case that the Chinese SQL statement contains a second type of keyword, the SQL statement containing the corresponding plurality of non-Chinese keywords is generated based on the digital dictionary, where the second type of keyword corresponds to a plurality of non-Chinese keywords in the digital dictionary. In the case that the Chinese SQL statement contains a third type of keyword, the corresponding search conditions are determined based on the reference relationship, and the SQL statement containing the search conditions is generated based on the digital dictionary, where the third type of keyword indicates that a reference relationship exists in the Chinese SQL statement.
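  • A minimal sketch of the first two mapping cases, under stated assumptions: the Chinese keywords, the dictionary entries, the table and column names, and the default sort clause are all hypothetical illustrations and do not come from the disclosure. Tokens are assumed to be separated by whitespace, and the third case (reference relationships) is not covered.

```python
# Hypothetical digital dictionary. Values are lists because one Chinese
# keyword may correspond to several non-Chinese keywords (second type).
DIGITAL_DICTIONARY = {
    "查询": ["SELECT"],
    "从": ["FROM"],
    "条件": ["WHERE"],
    "分组": ["GROUP", "BY"],   # second-type example: one-to-many mapping
    "排序": ["ORDER", "BY"],   # sort keyword (first type)
    "任务编号": ["taskId"],
    "任务表": ["task"],
}

# Assumed default sort condition, appended when no sort keyword is present.
DEFAULT_SORT = ["ORDER", "BY", "taskId", "ASC"]


def map_chinese_sql(chinese_sql: str) -> str:
    """Map a whitespace-separated Chinese SQL statement to plain SQL."""
    tokens = chinese_sql.split()
    sql_tokens = []
    for token in tokens:
        # Tokens missing from the dictionary (operators, literals) pass through.
        sql_tokens.extend(DIGITAL_DICTIONARY.get(token, [token]))
    if "排序" not in tokens:
        # Case 1: no sort keyword in the input, so add the default one.
        sql_tokens.extend(DEFAULT_SORT)
    return " ".join(sql_tokens)


print(map_chinese_sql("查询 任务编号 从 任务表 条件 任务编号 = 'T1'"))
```

  • Keeping the dictionary values as lists lets the one-to-one and one-to-many cases share a single lookup path.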
  • the obtaining an abstract syntax tree based on the SQL statement includes: using a first parser to generate the abstract syntax tree based on the SQL statement.
  • using a first parser to generate the abstract syntax tree based on the SQL statement includes: using a lexical analyzer to determine, based on the SQL statement, the lexical units included in the SQL statement; using a syntax analyzer to determine, based on the lexical units, the rule nodes contained in the SQL statement; and generating the corresponding abstract syntax tree based on the rule nodes; where the lexical analyzer and the syntax analyzer are generated by using the first parser based on the SQL grammar rules.
  • the method further includes: reading the rule nodes included in the abstract syntax tree and the leaf nodes corresponding to the rule nodes, where the leaf nodes are used to represent lexical units in the SQL statement; based on the rule nodes, generating an initial parsing result according to the correspondence between the SQL grammar rules and the grammar rules of the ES domain-specific language; based on the leaf nodes, generating an intermediate parsing result; and, based on the context of the SQL statement, combining the initial parsing result and the intermediate parsing result to generate the parsing result corresponding to the rule node.
  • acquiring the Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes includes: combining the parsing results based on the context of the SQL statement to generate the corresponding ES query statement.
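  • A minimal sketch of how rule nodes and leaf nodes might be combined into an ES query, under stated assumptions: the node classes, the rule names "equals" and "and", the field "status", and the choice of ES `term` and `bool`/`must` queries are illustrative and are not specified by the disclosure; only the rule name `whereClause` appears in the text. The tree is built by hand here rather than by a generated parser.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    """Leaf node: one lexical unit of the SQL statement."""
    text: str


@dataclass
class Rule:
    """Rule node: one grammar rule, with child rule or leaf nodes."""
    name: str
    children: List[Union["Rule", Leaf]] = field(default_factory=list)


def translate(node):
    """Walk the AST bottom-up and build an ES DSL fragment (a dict)."""
    if isinstance(node, Leaf):
        # Intermediate result: the bare value of the lexical unit.
        return node.text.strip("'")
    parts = [translate(child) for child in node.children]
    if node.name == "equals":          # field = value  -> term query
        field_name, value = parts
        return {"term": {field_name: value}}
    if node.name == "and":             # a AND b        -> bool/must
        return {"bool": {"must": parts}}
    if node.name == "whereClause":     # WHERE expr     -> top-level query
        return {"query": parts[0]}
    raise ValueError(f"no ES mapping for rule {node.name!r}")


# Hand-built AST for: WHERE taskId = 'T1' AND status = 'open'
ast = Rule("whereClause", [
    Rule("and", [
        Rule("equals", [Leaf("taskId"), Leaf("'T1'")]),
        Rule("equals", [Leaf("status"), Leaf("'open'")]),
    ])
])
es_query = translate(ast)
print(es_query)
```

  • Each rule name maps to one ES DSL shape, so supporting a new grammar rule in this sketch means adding one branch to `translate`.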
  • a data query apparatus, comprising: a memory for storing executable instructions; and a processor for reading and executing the executable instructions stored in the memory, so as to realize the following operations: obtaining a Structured Query Language (SQL) statement based on a data query request; obtaining an abstract syntax tree based on the SQL statement, where the abstract syntax tree includes rule nodes, and the rule nodes are used to represent the grammar rules followed by the SQL statement; obtaining an Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and sending a data query result, where the data query result is the result obtained from the ES search engine based on the ES query statement.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operations: obtaining a search keyword; obtaining a keyword value set based on the search keyword, where the keyword value set is the result obtained from the ES search engine based on the search keyword; sending the keyword value set; and receiving the data query request returned based on the keyword value set.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operations: sending query recommendation information based on a visual query request, where the query recommendation information includes at least a set of candidate search keywords; and receiving the search keywords returned based on the query recommendation information.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operations: obtaining the Chinese SQL statement carried in the data query request; and mapping the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping relationship between Chinese SQL statements and SQL statements.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operations: performing syntax verification on the Chinese SQL statement based on SQL grammar rules; performing value verification on the Chinese SQL statement based on value verification rules; and, when the Chinese SQL statement passes the syntax verification and the value verification, performing the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operations: in the case that the Chinese SQL statement does not contain a first type of keyword, generating, based on the digital dictionary, the SQL statement containing the first type of keyword, where the first type of keyword is used to represent data sorting conditions; in the case that the Chinese SQL statement contains a second type of keyword, generating, based on the digital dictionary, the SQL statement containing the corresponding plurality of non-Chinese keywords, where the second type of keyword corresponds to a plurality of non-Chinese keywords in the digital dictionary; and, in the case that the Chinese SQL statement contains a third type of keyword, determining the corresponding search conditions based on the reference relationship and generating, based on the digital dictionary, the SQL statement containing the search conditions, where the third type of keyword indicates that a reference relationship exists in the Chinese SQL statement.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operation: based on the SQL statement, using a first parser to generate the abstract syntax tree.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operations: using a lexical analyzer to determine, based on the SQL statement, the lexical units included in the SQL statement; using a syntax analyzer to determine, based on the lexical units, the rule nodes contained in the SQL statement; and generating the corresponding abstract syntax tree based on the rule nodes; where the lexical analyzer and the syntax analyzer are generated by using the first parser based on SQL grammar rules.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operations: reading the rule nodes included in the abstract syntax tree and the leaf nodes corresponding to the rule nodes, where the leaf nodes are used to represent lexical units in the SQL statement; based on the rule nodes, generating an initial parsing result according to the correspondence between the SQL grammar rules and the grammar rules of the ES domain-specific language; based on the leaf nodes, generating an intermediate parsing result; and, based on the context of the SQL statement, combining the initial parsing result and the intermediate parsing result to generate the parsing result corresponding to the rule node.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operation: combining the parsing results based on the context of the SQL statement to generate the corresponding ES query statement.
  • a storage medium that stores instructions, and when the instructions in the storage medium are executed by a processor, the processor is caused to perform the following operations: based on a data query request, obtain a structured query language SQL statement; Based on the SQL statement, obtain an abstract syntax tree, where the abstract syntax tree includes rule nodes, and the rule nodes are used to represent the grammar rules followed by the SQL statement; based on the parsing results corresponding to the rule nodes, obtain elastic search ES query statement; send a data query result, where the data query result is a result obtained from an ES search engine based on the ES query statement.
  • a computer program product, comprising one or more instructions executable by one or more processors of a computer device to cause the computer device to perform the data query method of any of the above aspects.
  • FIG. 1 is a schematic diagram of the Apache Calcite framework provided in the related art
  • FIG. 2A is a schematic diagram of interaction among a dual parser, a business platform, and an ES search engine provided in an embodiment of the present disclosure
  • FIG. 2B is a schematic flowchart of a data query method based on a domain-specific language provided in an embodiment of the present disclosure
  • FIG. 3 is a schematic flowchart of another domain-specific language-based data query method provided in an embodiment of the present disclosure
  • FIG. 4A is a schematic diagram of an AST provided in an embodiment of the present disclosure.
  • FIG. 4B is a schematic flowchart of converting a non-Chinese SQL language into ES DSL provided in an embodiment of the present disclosure
  • FIG. 5 is a schematic flowchart of another domain-specific language-based data query method provided in an embodiment of the present disclosure
  • FIGS. 6A-6E are schematic diagrams of a set of interactive interfaces provided in an embodiment of the present disclosure.
  • FIGS. 7A-7C are respectively schematic diagrams of three situations in which the Chinese SQL provided in the embodiment of the present disclosure is converted into a non-Chinese SQL statement;
  • FIG. 8 is a schematic diagram of the logical structure of a data query apparatus based on a domain-specific language provided in an embodiment of the present disclosure
  • FIG. 9 is a schematic diagram of the entity structure of a data query apparatus based on a domain-specific language provided in an embodiment of the present disclosure.
  • Domain-Specific Language (DSL): a DSL is a language designed for use in a specific context within a specific domain, where the domain may be a business context (such as banking or insurance) or an application context (such as a web application or a database).
  • compared with a General-Purpose Language (GPL), a DSL is not very universal: it is designed only for a certain applicable domain, but it is sufficient for expressing problems in that domain and building the corresponding solutions.
  • HTML: Hyper Text Markup Language, used for Web pages.
  • Java: Java can run on personal computers or mobile terminals, and can be embedded in applications in various industries such as banking, finance, insurance, and manufacturing.
  • Structured Query Language (SQL): SQL is a special-purpose programming language for database query and programming, used to access data and to query, update, and manage relational database systems.
  • SQL is a high-level non-procedural programming language that allows users to work on high-level data structures. SQL does not require the user to specify the data storage method, nor does it require the user to know the specific data storage method, so different database systems with completely different underlying structures can use the same SQL as the interface for data input and management. SQL statements can be nested, which makes SQL extremely flexible and powerful.
  • Abstract Syntax Tree (AST): also called a syntax tree. In computer science, an AST is an abstract representation of the syntactic structure of source code. The AST expresses the syntactic structure of the programming language as a tree, and each node of the tree represents a construct in the source code.
  • Elasticsearch (ES): ES is an open-source, highly scalable, distributed full-text search engine.
  • ES is developed in Java and uses Lucene (a search engine library) as its core to implement all indexing and search functions; its purpose is to hide the complexity of Lucene behind a simple RESTful API (Application Programming Interface), so as to make full-text search simple.
  • RESTful is a design style and development approach for web applications, suitable for scenarios in which mobile Internet vendors expose business interfaces.
  • a solution for data query based on a domain-specific language includes: in response to a data query request sent by a business platform, acquiring an SQL statement based on the data query request; parsing the SQL statement with a set basic parser, that is, a first parser, and generating an AST containing each rule node; then, based on the AST, obtaining the parsing result corresponding to each rule node and generating the corresponding ES query statement based on the parsing results; and then receiving the data query result returned by the ES search engine and sending the data query result to the business platform.
  • a dual parser is provided, and the dual parser includes at least an outer parser and an inner parser.
  • the external parser includes at least a Chinese parser and a validator. The Chinese parser is used to convert Chinese SQL statements into non-Chinese SQL statements, and the validator is used to perform value verification on the keyword values sent by the business platform. Chinese SQL statements are SQL pseudocode written in Chinese natural language combined with SQL grammar rules, whereas non-Chinese SQL statements are conventional SQL statements (rather than pseudocode).
  • the internal parser includes at least a basic parser, a chained parser, and a local parser.
  • the basic parser is used to generate the corresponding AST based on non-Chinese SQL statements, that is, to obtain the AST based on SQL statements; the chained parser is used to generate the corresponding ES query statement (which belongs to a DSL) based on the AST; and the local parser is used to return the syntax verification result and syntax parsing information of the Chinese SQL statement.
  • the external parser and the internal parser may be used independently, or may be integrated into one device, which is not limited in the present disclosure.
  • the internal parser follows the SQL grammar and is implemented based on ANTLR (Another Tool for Language Recognition), an open-source parser generator that can automatically generate a syntax tree from the input and display it visually.
  • the external parser implements the conversion from Chinese SQL statements to non-Chinese SQL statements based on Java. That is, the internal parser converts SQL statements into ES query statements, and the external parser converts Chinese SQL statements into SQL statements, providing a near-natural-language query method so that grammar rules can be flexibly extended according to actual business needs.
  • the process of data query based on a domain-specific language is as follows:
  • S201: the server receives the data query statement sent by the business platform.
  • the data query statement includes a Chinese SQL statement or an SQL statement.
  • if the data query statement is an SQL statement, the SQL statement is itself a non-Chinese structured query language statement, so S202 is executed directly; if the data query statement is a Chinese SQL statement, the server executes S202 after converting the Chinese SQL statement into the corresponding SQL statement.
  • the above S201 is an implementation in which the server obtains the SQL statement based on the data query request.
  • the server may be a cluster device that provides background services to the business platform.
  • the user logs in to the business platform on a terminal and triggers a data query request; the terminal sends the data query request to the server; the server receives the data query request and obtains the SQL statement based on it.
  • in some cases, the data query request sent by the terminal to the server carries an SQL statement (rather than a Chinese SQL statement), and the server only needs to parse the data query request to get the SQL statement.
  • in other cases, the data query request sent by the terminal to the server carries a Chinese SQL statement, that is, SQL pseudocode written in Chinese natural language combined with SQL grammar rules; after parsing the data query request to obtain the Chinese SQL statement, the server also needs to convert the Chinese SQL statement into an SQL statement.
  • S202: based on the data query statement, the server uses the set basic parser to generate an abstract syntax tree including each rule node, where the rule nodes are used to represent the grammar rules contained in the data query statement.
  • the non-Chinese structured query statement refers to an SQL statement.
  • the above S202 is an implementation in which the server acquires an abstract syntax tree based on the SQL statement, where the abstract syntax tree includes rule nodes, and the rule nodes are used to represent the grammar rules followed by the SQL statement.
  • S203: the server obtains the parsing result corresponding to each rule node based on the abstract syntax tree, generates the corresponding Elasticsearch domain-specific language based on the obtained parsing results, and sends the Elasticsearch domain-specific language to the Elasticsearch search engine.
  • the Elasticsearch (ES) domain-specific language refers to an ES query statement.
  • the server receives the data query result returned by the Elasticsearch search engine based on the Elasticsearch domain-specific language, and sends the data query result to the business platform.
  • the Elasticsearch search engine refers to an ES search engine.
  • the data query result is a result obtained from the ES search engine based on the ES query statement.
  • S301: the server receives, through the external parser, the data query statement sent by the business platform.
  • the data query statement includes a Chinese SQL statement or an SQL statement.
  • if the data query statement is an SQL statement, the SQL statement is itself a non-Chinese SQL statement, so S302 is executed directly; if the data query statement is a Chinese SQL statement, the server executes S302 after converting the Chinese SQL statement into the corresponding SQL statement.
  • the server receives the data query request sent by the business platform, and obtains the SQL statement based on the data query request through an external parser.
  • the data query request carries a Chinese SQL statement, and the server needs to call an external parser to convert the Chinese SQL statement into the SQL statement.
  • the data query request directly carries the SQL statement, and the server needs to call an external parser to read the SQL statement.
  • the process of converting the Chinese SQL statement into the SQL statement includes: the server acquiring the Chinese SQL statement carried in the data query request, and mapping the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping relationship between Chinese SQL statements and SQL statements.
  • the non-Chinese SQL statement refers to an SQL statement.
  • the server performs value verification on the SQL statement through the external parser. In the embodiments of the present disclosure, the case in which the server directly performs value verification on the SQL statement is used only as an example. In some embodiments, after the server receives the data query request in the above S301, if the data query request carries a Chinese SQL statement, the server parses the data query request, obtains the Chinese SQL statement, and then performs the following operations directly on the Chinese SQL statement: performing syntax verification on the Chinese SQL statement based on the SQL grammar rules; performing value verification on the Chinese SQL statement based on the value verification rules; and, in the case that the Chinese SQL statement passes the syntax verification and the value verification, performing the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
  • the data query statement is composed of various search keywords, and corresponding value verification rules are preset based on the set search keyword types.
  • the set search keyword type may include, but is not limited to, any one or a combination of a string type, an integer type, a floating point type, a time type, a Boolean type, an array type, and an object type.
  • the validator in the external parser performs value validation on each keyword value corresponding to each search keyword included in the data query statement based on the set value validation rule.
  • the validator in the external parser performs value verification on the keyword value "T1" corresponding to the search keyword "taskId" contained in data query statement 1.
  • the keyword value "T1" satisfies the string length range of 1-10 characters, so the keyword value "T1" corresponding to the search keyword "taskId" passes value verification.
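  • A minimal sketch of rule-based value verification for the example above. The rule table layout and the function name are assumptions made for illustration; only the "taskId" string rule with a length range of 1-10 characters is taken from the text, and rejecting keywords that have no rule is likewise an assumed policy.

```python
# Hypothetical rule table: one value verification rule per search keyword.
VALUE_RULES = {
    "taskId": {"type": str, "min_len": 1, "max_len": 10},
}


def passes_value_verification(keyword: str, value) -> bool:
    """Return True if the keyword value satisfies the keyword's rule."""
    rule = VALUE_RULES.get(keyword)
    if rule is None:
        # Assumed policy: keywords without a rule are rejected.
        return False
    if not isinstance(value, rule["type"]):
        return False
    return rule["min_len"] <= len(value) <= rule["max_len"]


print(passes_value_verification("taskId", "T1"))
```

  • Other keyword types named in the text (integer, time, Boolean, and so on) would need their own rule fields; this sketch covers the string case only.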
  • the server sends the SQL statement from the external parser to the internal parser when the SQL statement passes the value verification.
  • the external parser sends the data query statement to the internal parser.
  • the external parser sends the data query statement 1 to the internal parser.
  • S304: through the internal parser, the server uses the set basic parser to generate, based on the data query statement, an AST including each rule node.
  • the basic parser refers to the first parser.
  • the above S304 is an implementation in which the server receives the SQL statement through the internal parser and, based on the SQL statement, uses the first parser to generate an abstract syntax tree (AST), where the AST includes rule nodes, and the rule nodes are used to represent the grammar rules followed by the SQL statement.
  • the internal parser can use the ANTLR metalanguage to describe the SQL grammar rules, and based on the written SQL grammar rules, use the ANTLR grammar generation tool to generate the basic parser.
  • corresponding lexical analyzers and grammar analyzers are pre-generated.
  • the SQL grammar rules include at least lexical rules, grammar rules and syntactic rules. The lexical rules represent the structure of lexical units, the grammar rules represent the structure of the words composed of lexical units, and the syntactic rules represent the relationships between the words in a data query statement.
  • the ANTLR metalanguage can be used to describe the SELECT syntax rules in the SQL syntax rules:
  • ANTLR metalanguage can be used to describe grammar rules such as SELECT grammar, WHERE grammar, ORDER grammar, LIMIT grammar, and GROUP BY grammar.
  • the metalanguage description of the WHERE syntax is: whereClause: WHERE expression, where expression represents an expression; the WHERE syntax is used to query data based on search conditions.
  • the metalanguage description of the GROUP BY syntax is: groupbyClause: GROUP BY ID(COMMA ID)*; the GROUP BY syntax is used to group the data query results by the specified columns.
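  • A hand-written sketch of a recognizer for the rule `groupbyClause: GROUP BY ID(COMMA ID)*`, to show what the rule accepts. This is not ANTLR-generated code; the identifier pattern (letters, digits, underscore, not starting with a digit) and the function name are assumptions, and reserved words are not excluded from identifiers here.

```python
import re

# Assumed shape of the ID token.
ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z_0-9]*\Z")


def parse_groupby_clause(text: str) -> list:
    """Return the grouped column names, or raise ValueError on a mismatch."""
    # Split on whitespace and make every comma its own token.
    tokens = text.replace(",", " , ").split()
    pos = 0

    def expect_keyword(word: str) -> None:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos].upper() != word:
            raise ValueError(f"expected {word} at token {pos}")
        pos += 1

    def expect_id() -> str:
        nonlocal pos
        if pos >= len(tokens) or not ID_PATTERN.match(tokens[pos]):
            raise ValueError(f"expected identifier at token {pos}")
        pos += 1
        return tokens[pos - 1]

    expect_keyword("GROUP")
    expect_keyword("BY")
    columns = [expect_id()]                       # ID
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1                                  # COMMA
        columns.append(expect_id())               # ID
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return columns


print(parse_groupby_clause("GROUP BY city, status"))
```

  • The `while` loop corresponds to the `(COMMA ID)*` repetition, so a trailing comma or a missing column name is rejected.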
  • in the ORDER syntax, ASC means ascending order and DESC means descending order; the ORDER syntax is used to order the data query results by the specified field.
  • the expression in the WHERE grammar may include, but is not limited to, any one or combination of the following rules:
  • LPAREN expression RPAREN used to represent expressions in parentheses
  • USH)rightExpr expression; used to indicate left-shift, right-shift, and unsigned right-shift operations;
  • GTE)rightExpr expression; used to indicate less than, less than or equal to, greater than, greater than or equal to operation;
  • N MPPEQ)rightExpr expression; used to indicate equal, not equal, fuzzy query, reverse fuzzy query, phrase matching, reverse phrase matching , prefix phrase matching, reverse prefix phrase matching operations;
  • inClause used to represent querying multiple fields.
  • grammar rules such as SELECT syntax, WHERE syntax, ORDER syntax, LIMIT syntax, and GROUP BY syntax are used as examples. In practical applications, syntax rules can be set according to actual requirements.
  • the server, through the internal parser, uses the set lexical analyzer to determine each lexical unit included in the data query statement.
  • that is, the server, through the internal parser, uses a lexical analyzer to determine the lexical units contained in the SQL statement.
  • the lexical analyzer is generated by using the first parser based on SQL grammar rules.
  • the lexical analyzer sequentially reads each character contained in the data query statement, and determines each lexical unit contained in the data query statement based on each set lexical rule.
  • the lexical unit types may include, but are not limited to, keywords, identifiers, literals, operators, and delimiters. Keywords represent reserved words that cannot be used as identifiers, such as "SELECT"; identifiers represent search keywords such as table names and column names included in the SQL syntax; and literals represent strings and numeric values.
  • the internal parser uses a lexer pre-generated by ANTLR to determine that data query statement 1 contains the following lexical units:
  • Keywords SELECT, FROM, WHERE;
  • a corresponding priority may be preset for each lexical unit type; further, if a lexical unit satisfies at least one lexical rule, the lexical analyzer determines the lexical unit type of that lexical unit based on the preset priority.
  • For example, suppose that the priority of keywords is set higher than that of identifiers. "SELECT" satisfies both the lexical rules of keywords and the lexical rules of identifiers.
  • in this case, the lexical analyzer determines, based on the preset priority, that the lexical unit type of "SELECT" is keyword.
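  • The priority rule described above can be sketched as follows. This is a minimal hand-written tokenizer in Python, not the ANTLR-generated lexer of the disclosure; the rule table, the regular expressions, and the sample statement are assumptions. When several lexical rules match the same text, the longest match wins and ties are broken by the position of the rule in the table (earlier means higher priority).

```python
import re

# Lexical rules listed from highest to lowest priority (assumed ordering).
LEXICAL_RULES = [
    ("KEYWORD", re.compile(r"(?:SELECT|FROM|WHERE)\b", re.IGNORECASE)),
    ("IDENTIFIER", re.compile(r"[A-Za-z_]\w*")),
    ("LITERAL", re.compile(r"'[^']*'|\d+")),
    ("OPERATOR", re.compile(r"<=|>=|!=|=|<|>")),
    ("DELIMITER", re.compile(r"[(),]")),
]


def tokenize(statement):
    """Return (token_type, text, position) triples for the statement."""
    tokens = []
    pos = 0
    while pos < len(statement):
        if statement[pos].isspace():
            pos += 1
            continue
        candidates = []
        for priority, (token_type, pattern) in enumerate(LEXICAL_RULES):
            match = pattern.match(statement, pos)
            if match:
                # Sort key: longest match first, then highest priority.
                candidates.append(
                    (-len(match.group()), priority, token_type, match.group())
                )
        if not candidates:
            raise ValueError(f"unexpected character at position {pos}")
        _, _, token_type, text = min(candidates)
        tokens.append((token_type, text, pos))
        pos += len(text)
    return tokens
```

  • With this table, "SELECT" matches both the keyword rule and the identifier rule with the same length, so it is typed as KEYWORD, whereas "SELECTED" only matches the identifier rule. The recorded positions correspond to the kind of position information later used for cursor-based prompting.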
  • the server, through the internal parser, uses the set syntax analyzer to determine each rule node included in the data query statement based on each lexical unit, and generates a corresponding AST based on each rule node.
  • that is, the server, through the internal parser, uses a syntax analyzer to determine the rule nodes contained in the SQL statement based on the lexical units, and generates the corresponding AST based on the rule nodes.
  • the syntax analyzer is also generated by the first parser based on the SQL grammar rules.
  • each rule node is used to represent each grammar rule included in the data query statement.
  • the syntax analyzer adopts the set state transition table, determines the grammar rule corresponding to each lexical unit based on the set grammar rules, determines the corresponding rule nodes based on the obtained grammar rules, and generates a corresponding AST based on each rule node and each leaf node, wherein a leaf node is used to represent a lexical unit included in the data query statement.
  • S305 The server obtains each parsing result corresponding to each rule node based on the AST through the internal parser, and generates an ES DSL based on each parsing result.
  • the ES DSL refers to the ES query statement.
  • the server obtains each parsing result corresponding to each rule node based on the AST through the internal parser, and generates an ES query statement based on each parsing result.
  • the chain parser in the internal parser adopts the set depth-first algorithm, and sequentially reads each rule node and each leaf node contained in the AST;
  • the chain parser generates corresponding initial parsing results based on each rule node and the corresponding relationship between the preset SQL grammar rules and ES DSL grammar rules, and generates corresponding intermediate parsing results based on each leaf node;
  • the chain parser combines each initial parsing result and each intermediate parsing result based on the context of the data query statement to generate corresponding parsing results;
  • the chain parser combines the parsing results based on the context of the data query statement to generate the corresponding ES DSL.
  • the above process of obtaining the parsing result is as follows: the server, through the chain parser, reads the rule nodes contained in the abstract syntax tree and the leaf nodes corresponding to the rule nodes, where a leaf node represents a lexical unit in the SQL statement; based on a rule node, generates an initial parsing result according to the correspondence between the SQL grammar rules and the grammar rules of the ES domain-specific language; based on a leaf node, generates an intermediate parsing result; based on the context of the SQL statement, combines the initial parsing result with the intermediate parsing result to generate the parsing result corresponding to the rule node; and, based on the context of the SQL statement, combines the parsing results to generate the corresponding ES query statement.
  • the internal parser, using the set depth-first algorithm, first reads the rule node selectOperation contained in the AST, enters the SELECT grammar rule, and generates the corresponding initial parsing result 1 according to the correspondence between the SELECT grammar rule in the preset SQL grammar rules and the ES DSL grammar rules.
  • the ES DSL is used to retrieve, from all data, the data whose keyword value corresponding to "taskId" is "T1"; the ES DSL is shown in the following code:
  • each rule node can be further entered or exited through a preset function.
  • For example, as shown in FIG. 4A, the rule node is entered through the preset entry rule function enterRule(), and exited through the preset exit rule function exitRule().
  • the process of depth-first traversal of the AST is in fact the process of chain parsing each grammar rule. That is, when the AST is traversed depth-first, the parsing class of the SELECT grammar rule, QuerySelectFieldsParser, is entered first for parsing, and the parsing result is saved in the context; then the FROM grammar rule is matched, QueryFromParser is entered for parsing, and the parsing result is saved in the context; then the WHERE, GROUP BY, ORDER, and LIMIT grammar rules are parsed in turn, and the corresponding parsing results are saved in the context. Finally, the parsing results in the context are merged to generate the ES DSL for ES search.
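  • The chain parsing described above might be sketched as follows. This is an illustrative Python outline under stated assumptions, not the disclosed QuerySelectFieldsParser or QueryFromParser classes: the Node type, the rule names, the handler functions, and the choice to keep only meaningful tokens as leaves are all hypothetical. The output uses the `_source`, `query` (with a `term` query), and `size` fields of an Elasticsearch search request body; only equality conditions are handled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    rule: str                      # grammar rule name, or "TOKEN" for a leaf
    text: str = ""                 # token text (leaf nodes only)
    children: list = field(default_factory=list)


def leaves(node):
    """Depth-first, left-to-right list of the token texts under a node."""
    if not node.children:
        return [node.text]
    out = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def parse_select(node, ctx):
    ctx["_source"] = leaves(node)          # columns to return


def parse_from(node, ctx):
    ctx["index"] = leaves(node)[0]         # target index name


def parse_where(node, ctx):
    field_name, operator, value = leaves(node)
    if operator != "=":
        raise NotImplementedError("sketch handles equality only")
    ctx["query"] = {"term": {field_name: value}}


def parse_limit(node, ctx):
    ctx["size"] = int(leaves(node)[0])


# One handler per grammar rule; each saves its result in the shared context.
CHAIN = {
    "selectClause": parse_select,
    "fromClause": parse_from,
    "whereClause": parse_where,
    "limitClause": parse_limit,
}


def to_es_dsl(ast):
    """Walk the AST depth-first and merge the per-rule results."""
    ctx = {}
    stack = [ast]
    while stack:
        node = stack.pop()
        handler = CHAIN.get(node.rule)
        if handler:
            handler(node, ctx)
        else:
            stack.extend(reversed(node.children))
    index = ctx.pop("index")
    return index, ctx
```

  • For a tree representing a query on index "task" with the condition taskId = T1, to_es_dsl returns the index name and a body containing {"term": {"taskId": "T1"}} under "query". The index name is returned separately because it is normally part of the request path rather than the request body.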
  • the server sends the ES DSL to the ES search engine through the internal parser.
  • the ES DSL refers to the ES query statement.
  • the server sends the ES query statement to the ES search engine through the internal parser.
  • the ES search engine performs data query based on ES DSL, and obtains the data query result.
  • the server performs a data query based on the ES query statement through the ES search engine, and obtains a data query result.
  • the ES search engine is a distributed, highly scalable, and highly real-time search and data analysis engine.
  • the ES search engine adopts a cluster architecture.
  • a cluster is a collection of one or more nodes that collectively store the entire data and provide joint indexing and search functions across all nodes.
  • a node acts as a single server that is part of the cluster, storing data and participating in the indexing and search functions of the cluster, wherein each index can be divided into shards, and each shard can have 0 or more replicas.
  • the replicas are called replica shards, and the primary shard and the replica shards of each shard are not on the same node.
  • for example, an ES cluster consists of node 1 and node 2, where index 1 is divided into shard 0 and shard 1; node 1 contains the primary shard of shard 0 and the replica shard of shard 1, and node 2 contains the replica shard of shard 0 and the primary shard of shard 1.
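  • The placement constraint above (a primary shard and its replica shards never share a node) can be illustrated with a simple round-robin layout. This is only a sketch: the function place_shards is hypothetical, and the actual shard allocation of an ES cluster is considerably more elaborate.

```python
def place_shards(nodes, shard_count, replicas=1):
    """Assign each shard a primary node and distinct replica nodes."""
    if replicas >= len(nodes):
        raise ValueError("need more nodes than replicas to keep copies apart")
    layout = {}
    for shard in range(shard_count):
        primary = nodes[shard % len(nodes)]
        # Offsets 1..replicas are all smaller than len(nodes), so no copy
        # lands on the primary's node or on another copy's node.
        copies = [nodes[(shard + r) % len(nodes)] for r in range(1, replicas + 1)]
        layout[shard] = {"primary": primary, "replicas": copies}
    return layout
```

  • With two nodes and two shards, this reproduces the example layout: shard 0 has its primary on the first node and its replica on the second, and shard 1 the other way round.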
  • when a node in the ES cluster receives the ES DSL, it broadcasts the ES DSL to the other nodes that hold shards of the target index.
  • the above A1 is that, when the coordinating node device in the ES cluster receives the ES query statement, it broadcasts the ES query statement to the other node devices that hold shards of the target index.
  • the node that receives the ES DSL is referred to as the coordinating node.
  • after receiving the ES DSL, the coordinating node determines the target index for querying according to the ES DSL, and broadcasts the ES DSL to the other nodes that contain a primary shard or replica shard of the target index.
  • for example, node 1 in the ES cluster receives the ES DSL, and node 1 is referred to as the coordinating node. At this time, node 1 determines, according to the ES DSL, that the target index for querying is index 1, and broadcasts the ES DSL to the node containing index 1.
  • the other nodes perform query operations on each shard containing the target index according to the ES DSL, add the queried document IDs to their respective ordered priority queues, and return the document IDs in those queues, together with the sorting value corresponding to each document ID, to the coordinating node.
  • the above A2 is that the other node devices, based on the ES query statement, perform query operations on each shard containing the target index, add the queried document identifiers to their respective ordered priority queues, and return the document identifiers in their respective ordered priority queues, together with the sorting value corresponding to each document identifier, to the coordinating node device.
  • the document is a basic information unit that can be indexed, and the document identifier is used to uniquely identify a document.
  • node 2 performs a query operation on the replica shard of shard 0 according to the ES DSL, adds the queried document identifiers doc_ID1 and doc_ID2 to the ordered priority queue of node 2, and sends the document identifiers doc_ID1 and doc_ID2 in that queue, together with the sorting values corresponding to doc_ID1 and doc_ID2, to node 1, where the sorting value corresponding to doc_ID1 indicates that doc_ID1 has the highest priority, and the sorting value corresponding to doc_ID2 indicates that doc_ID2 has a lower priority than doc_ID1.
  • the documents corresponding to doc_ID1 and doc_ID2 contain task data with task number T1.
  • the coordinating node merges the obtained document identifiers into the ordered priority queue of the coordinating node according to the sorting value to obtain a global query result list, and obtains a data query result based on the global query result list.
  • the coordinating node device merges the obtained document identifiers into the ordered priority queue of the coordinating node according to the sorting value, obtains the global query result list, and obtains the data query result based on the global query result list.
  • node 1 merges the obtained doc_ID1 and doc_ID2 into the ordered priority queue of the coordinating node according to the sorting value, and obtains a global query result list.
  • the document IDs in the global query result list are doc_ID1 and doc_ID2 in sequence.
  • based on the global query result list, node 1 obtains the corresponding documents from the replica shard of shard 0 on node 2 as data query result 1.
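  • The query and merge phases described above can be sketched with Python's heapq module. The function names, the document layout, and the convention that a smaller sorting value means a higher priority are assumptions made for the example; this is not the ES implementation.

```python
import heapq
from itertools import islice


def query_shard(documents, predicate, size):
    """Local phase on one node: up to `size` (sort_value, doc_id) pairs,
    in ascending order of sort value."""
    hits = [(doc["sort"], doc["id"]) for doc in documents if predicate(doc)]
    return heapq.nsmallest(size, hits)


def coordinate(per_node_results, size):
    """Coordinating node: merge the sorted per-node lists into a global
    query result list of document identifiers."""
    merged = heapq.merge(*per_node_results)
    return [doc_id for _, doc_id in islice(merged, size)]
```

  • Each node returns only document identifiers and sorting values; the coordinating node then fetches the documents named in the global list from the shards that hold them, which corresponds to the last step of the example above.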
  • S309 The server sends the data query result to the business platform through the internal parser.
  • in order to reduce the difficulty of using the system and make it convenient for users to search in complex business systems, the external parser provides a search method for Chinese SQL statements based on the data dictionary.
  • referring to FIG. 5, the embodiment of the present disclosure is further described by taking a visualized Chinese query scenario as an example.
  • S501 The business platform obtains the mode selection information input by the user.
  • the mode selection information is used to represent the query methods provided by the business platform, for example, Chinese SQL query methods and non-Chinese SQL query methods.
  • the business platform can also provide other query methods according to business scenarios, but in the end they are converted into the non-Chinese SQL query method, which will not be repeated here.
  • the Chinese SQL query method means that users can input Chinese SQL statements to query data, and is usually aimed at non-technical personnel;
  • the non-Chinese SQL query method means that users can input professional SQL statements to query data, and is usually aimed at technical personnel (e.g., developers, testers, etc.).
  • after the business platform acquires the mode selection information input by the user, if it determines that the mode selection information represents the Chinese SQL query method, it continues to execute S502.
  • S502 The business platform sends a visual query request to the external parser.
  • the external parser is a parsing function module integrated on the server.
  • the business platform sends a visual query request to the external parser of the server.
  • the server obtains the query recommendation information based on the visual query request through the external parser, and sends the query recommendation information to the business platform.
  • the query recommendation information includes at least a set of candidate search keywords; that is, the query recommendation information includes at least a data dictionary, and may include any one or combination of the data dictionary and the two operator tables, wherein the data dictionary and the two operator tables are set according to the business provided by the business platform, and the data dictionary contains the mapping relationship between Chinese SQL statements and non-Chinese SQL statements.
  • in the following, the case where the query recommendation information includes the data dictionary and the two operator tables is taken as an example for description.
  • the keyword table in the data dictionary provides the mapping relationship between search keywords in Chinese SQL statements and search keywords in non-Chinese SQL statements, where the Chinese keywords are used to represent the Chinese names of the search keywords, and the English keywords are used to represent the search keywords used by the ES search engine; the two operator tables are shown in Table 4 and Table 5.
  • the external parser obtains Chinese keywords, English keywords and field types by sorting out the business model set according to the actual business in the business platform, obtains the corresponding index fields by constructing an index of the ES search engine, and builds a data dictionary based on the Chinese keywords, English keywords, field types and index fields.
  • a semantic model corresponding to the business platform is constructed based on the data dictionary, the SQL grammar rules and the two operator tables.
  • the data dictionary can be synchronized in real time to realize dynamic updating of the data dictionary, so as to continuously expand the search boundary of the business platform and improve its search capability.
  • the internal parser may also load the above-mentioned semantic model when generating the AST based on the data query statement using the basic parser, and then directly perform parsing according to the semantic model.
  • S504 The business platform receives the query recommendation information returned by the external parser.
  • the external parser is a parsing function module integrated on the server.
  • the business platform receives the query recommendation information returned by the server through the external parser.
  • when the business platform receives the query recommendation information returned by the external parser, it displays the corresponding interactive interface to the user based on the query recommendation information.
  • for example, the business platform presents the interactive interface shown in Fig. 6A to the user based on the data dictionary and the two operator tables included in the query recommendation information.
  • S505 The business platform obtains the search keyword input by the user.
  • the business platform is a platform that users log in to on the terminal side, and supports various services such as data query and data modification.
  • for example, the business platform obtains the search keyword "priority" input by the user.
  • the business platform obtains the operator input by the user based on the operator table.
  • S506-S508 The business platform sends the search keywords to the ES search engine through the external parser and the internal parser.
  • S506-S508 include: S506, the business platform sends the search keywords to the external parser of the server; S507, the external parser sends the search keywords to the internal parser; S508, the internal parser sends the search keywords to the ES search engine .
  • the server receives the search keywords returned based on the query recommendation information, that is, the server obtains the search keywords.
  • the ES search engine determines a corresponding keyword value set based on the search keyword.
  • the ES search engine determines the corresponding keyword value set 1 {"highest", "high", "medium", "low", "extremely low"}.
  • S510-S512 The business platform receives the keyword value set returned by the ES search engine through the internal parser and the external parser.
  • S510-S512 include: S510, the ES search engine sends the keyword value set to the internal parser; S511, the internal parser sends the keyword value set to the external parser; S512, the external parser sends the keyword value set to the business platform.
  • the server obtains a keyword value set based on the search keyword, where the keyword value set is the result obtained from the ES search engine based on the search keyword, and then sends the keyword value set.
  • the business platform receives the keyword value set returned by the ES search engine through the internal parser and the external parser, displays the keyword value set through the interactive interface, and obtains, through the interactive interface, the keyword value input by the user for the above search keyword.
  • for example, when the business platform receives keyword value set 1, it displays keyword value set 1 through the interactive interface.
  • the business platform obtains, through the interactive interface, the keyword value "Highest" input by the user for the search keyword "Priority".
  • the business platform displays corresponding operators through an interactive interface.
  • based on the operator table, the business platform displays the operators "AND", "OR", and "ORDER BY" through the interactive interface.
  • keyword recommendation, value recommendation, and operator recommendation are provided for the business platform through an external parser, thereby realizing a good visual interaction and providing a convenient DSL filling method for customers.
  • S513 The business platform obtains the data query statement input by the user.
  • the data query statement includes a Chinese SQL statement or an SQL statement.
  • in addition to supporting the input of search keywords and corresponding keyword values, the business platform can also provide users with some encapsulated, predefined function APIs; the functions include, but are not limited to, project class functions, user class functions, time class functions, etc.
  • the project class function may include projectsWhereUserIsIn(), which represents the set of projects that the current user participates in.
  • the user class functions may include currentUser() and Group(${project-defined user group}), where currentUser() is used to represent the currently logged-in user, and Group(${project-defined user group}) is used to query a certain user group.
  • time class functions include, but are not limited to:
  • endOfDay() used to represent the end time of the day (23:59:59 of the day);
  • startOfDay() used to represent the start time of the day (00:00:00 of the day);
  • endOfWeek() used to characterize the end time of the current week (23:59:59 on Sunday);
  • startOfWeek() used to characterize the start time of the current week (00:00:00 on Monday);
  • endOfMonth() used to characterize the end time of the current month (23:59:59 on the last day of the current calendar month);
  • startOfMonth() used to characterize the start time of the current month (00:00:00 on the 1st of the current calendar month);
  • endOfYear() used to represent the end time of the current year (23:59:59 on December 31 of the current year);
  • startOfYear() which is used to represent the start time of the current year (00:00:00 on January 1 of the current year).
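  • The boundaries listed above can be computed as in the following Python sketch. The snake_case names mirror the listed functions but are assumptions, as is the choice to pass the reference time explicitly instead of reading the system clock; weeks run from Monday to Sunday, as in the list above.

```python
import calendar
from datetime import datetime, time, timedelta


def start_of_day(now):
    return datetime.combine(now.date(), time.min)          # 00:00:00


def end_of_day(now):
    return datetime.combine(now.date(), time(23, 59, 59))  # 23:59:59


def start_of_week(now):
    # weekday() is 0 for Monday, so this steps back to Monday.
    return start_of_day(now - timedelta(days=now.weekday()))


def end_of_week(now):
    # weekday() is 6 for Sunday, so this steps forward to Sunday.
    return end_of_day(now + timedelta(days=6 - now.weekday()))


def start_of_month(now):
    return start_of_day(now.replace(day=1))


def end_of_month(now):
    last_day = calendar.monthrange(now.year, now.month)[1]
    return end_of_day(now.replace(day=last_day))


def start_of_year(now):
    return start_of_day(now.replace(month=1, day=1))


def end_of_year(now):
    return end_of_day(now.replace(month=12, day=31))
```

  • Taking the reference time as a parameter keeps the functions deterministic, which makes the boundaries easy to check for leap years and for weeks that span two calendar years.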
  • the business platform sends the data query statement 2 to the external parser.
  • S515 The external parser determines whether the data query statement is Chinese SQL, and if so, executes S516, otherwise, directly sends the data query statement to the internal parser for parsing.
  • the server receives the data query request returned based on the keyword value set through the external parser. Further, in the case that the data query request carries a Chinese SQL statement, the external parser executes S516, and in the case that the data query request carries an SQL statement, the SQL statement is sent to the internal parser for analysis.
  • the external parser sends data query statement 2 to the internal parser.
  • S517 The internal parser performs syntax verification on the data query statement.
  • the internal parser uses a lexical analyzer and a syntax analyzer based on the SQL grammar rules to perform syntax verification on the data query statement, generates the corresponding syntax parsing result and syntax parsing information, and returns them to the external parser through the local parser.
  • the syntax parsing result is used to represent whether the data query statement has passed the syntax verification; the syntax parsing information at least includes the lexical unit type and the position information of each lexical unit in the data query statement, and is used to prompt different search keywords in the search box of the interactive interface according to the cursor position.
  • for example, the internal parser uses a lexical analyzer and a syntax analyzer based on the SQL grammar rules to perform syntax verification on data query statement 2, generates the corresponding syntax parsing result and syntax parsing information, and returns them to the external parser through the local parser, wherein the syntax parsing result indicates that data query statement 2 has passed the syntax verification, and the syntax parsing information includes the lexical unit type and the position information of each lexical unit in data query statement 2.
  • if the external parser determines, based on the syntax verification result returned by the internal parser, that the data query statement has passed the syntax verification, it performs value verification on the data query statement; otherwise, it returns a syntax error message.
  • the validator in the external parser performs value verification on each keyword value corresponding to each search keyword included in the data query statement, based on the set value verification rule and the syntax parsing information returned by the internal parser.
  • for example, the validator obtains the keyword value "2020-01-01" corresponding to "Expiration date" in data query statement 2 based on the parsing information returned by the internal parser, and determines, based on the data dictionary, that the field type of "Expiration date" is time. At this time, the validator determines, based on the set value verification rule, that the value verification of the keyword value "2020-01-01" fails.
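  • Value verification driven by field types might look like the following Python sketch. The text does not state which rule rejects "2020-01-01", so the rule used here (a time value must be a full timestamp in the assumed format TIME_FORMAT) is purely an assumption, chosen because it is one rule under which a date-only value fails. The dictionary entries, the enumerated values, and the function name are likewise hypothetical.

```python
from datetime import datetime

# Hypothetical data-dictionary excerpt: search keyword -> field type.
FIELD_TYPES = {"expiration_date": "time", "priority": "enum"}

# Hypothetical rule parameters (not taken from the disclosure).
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
ENUM_VALUES = {"priority": {"highest", "medium", "low", "extremely low"}}


def validate_value(keyword, value):
    """Return True if `value` is acceptable for the field type of `keyword`."""
    field_type = FIELD_TYPES.get(keyword)
    if field_type == "time":
        try:
            datetime.strptime(value, TIME_FORMAT)
        except ValueError:
            return False
        return True
    if field_type == "enum":
        return value in ENUM_VALUES[keyword]
    return False  # unknown keyword or unsupported type: reject
```

  • Under these assumed rules, validate_value("expiration_date", "2020-01-01") is rejected while a full timestamp is accepted; a different configured format would change that outcome.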
  • S520 The business platform receives the value verification result returned by the external parser.
  • if the business platform determines, based on the value verification result, that the data query statement passes the value verification, S521 is executed; otherwise, the user is prompted through the interactive interface that the data query statement fails the value verification.
  • the business platform sends a Chinese parsing request to the external parser, and the Chinese parsing request carries a data query statement.
  • the above S521 is that the business platform sends a Chinese parsing request to the external parser, and the Chinese parsing request carries a Chinese SQL statement.
  • the business platform sends a Chinese parsing request to an external parser, and the Chinese parsing request carries the data query statement 2 .
  • the external parser generates a non-Chinese SQL statement based on the data query statement.
  • the data query statement is a Chinese SQL statement
  • the non-Chinese SQL statement is an SQL statement.
  • the external parser maps the Chinese SQL statement based on the digital dictionary to obtain the SQL statement, and the digital dictionary contains the mapping relationship between the Chinese SQL statement and the SQL statement.
  • a corresponding non-Chinese SQL statement containing the first type of keywords is generated based on the digital dictionary and the set sorting conditions.
  • the SQL statement containing the first type of keywords is generated, and the first type of keywords are used to represent data sorting criteria.
  • the first type of keywords may be, but is not limited to, "ORDER BY".
  • the first type keyword "ORDER BY" is not included in the data query statement 3.
  • if the data query statement contains a second type of keyword, which represents the existence of at least one corresponding non-Chinese keyword, then a corresponding non-Chinese SQL statement containing the at least one non-Chinese keyword is generated based on the set digital dictionary and the at least one non-Chinese keyword.
  • the SQL statement containing a plurality of corresponding non-Chinese keywords is generated, and the second type of keyword corresponds to the plurality of non-Chinese keywords in the digital dictionary.
  • the non-Chinese keywords in the at least one non-Chinese keyword are connected by an OR condition.
  • the corresponding search conditions are determined based on the reference relationship, and the SQL statement containing the search conditions is generated based on the digital dictionary.
  • the class keyword is used to represent that the reference relationship exists in the Chinese SQL statement.
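  • The dictionary-based mapping described above can be sketched in Python as follows. The dictionary entries, the English field names, the default sorting condition, and the function name translate are all hypothetical, and the sketch only handles whitespace-separated conditions of the form keyword, operator, value; reference relationships are not covered.

```python
import re

# Hypothetical dictionary: Chinese keyword -> one or more English keywords.
DICTIONARY = {
    "优先级": ["priority"],
    "截止日期": ["due_date"],
    "人员": ["assignee", "reporter"],  # maps to several non-Chinese keywords
}

# Assumed default sorting condition, added when no ORDER BY is present.
DEFAULT_ORDER = "ORDER BY created_at DESC"

# keyword, operator, value; values are quoted strings or bare words.
CONDITION = re.compile(r"(\S+)\s+(<=|>=|!=|=|<|>)\s+('[^']*'|\S+)")


def translate(chinese_sql: str) -> str:
    def expand(match):
        keyword, operator, value = match.groups()
        targets = DICTIONARY.get(keyword)
        if targets is None:
            return match.group(0)          # not a dictionary keyword
        parts = [f"{target} {operator} {value}" for target in targets]
        if len(parts) == 1:
            return parts[0]
        # Several non-Chinese keywords are connected by an OR condition.
        return "(" + " OR ".join(parts) + ")"

    sql = CONDITION.sub(expand, chinese_sql)
    if "ORDER BY" not in sql.upper():
        sql = f"{sql} {DEFAULT_ORDER}"
    return sql
```

  • For instance, a condition on "人员" expands into a parenthesized OR over assignee and reporter, and a statement without a sorting keyword receives the assumed default ORDER BY clause.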
  • S523 The business platform receives the non-Chinese SQL statement returned by the external parser.
  • the non-Chinese SQL statement refers to an SQL statement.
  • S524 The business platform determines whether the user has the query authority for non-Chinese SQL statements, and if so, executes S525; otherwise, the user is prompted through the interactive interface that there is no query authority for the data query statement.
  • the non-Chinese SQL statement refers to an SQL statement.
  • the user's search authority is pre-stored in the business platform, and further, when the business platform receives the non-Chinese SQL statement returned by the external parser, it determines whether the user has the non-Chinese SQL statement based on the pre-stored user's search authority. Query permissions.
  • for example, when the business platform receives the non-Chinese SQL statement returned by the external parser and generated from data query statement 2, it determines, based on the pre-stored search authority of user A, that user A does not have the query authority for the non-Chinese SQL statement.
  • S525 The business platform sends the non-Chinese SQL statement to the internal parser.
  • the non-Chinese SQL statement refers to an SQL statement.
  • S526 The internal parser generates AST based on non-Chinese SQL statements, and uses the set depth-first algorithm to traverse the AST to generate the corresponding ES DSL.
  • the non-Chinese SQL statement refers to the SQL statement, and the ES DSL refers to the ES query statement.
  • the above S527 that is, the internal parser sends the ES query statement to the ES search engine.
  • the ES search engine performs data query based on ES DSL, and obtains the data query result.
  • the ES DSL refers to the ES query statement.
  • the ES search engine performs data query based on the ES query statement, and obtains a data query result.
  • S529-S530 The business platform receives the data query result returned by the ES search engine through the internal parser.
  • S529-S530 include: S529, the ES search engine sends the data query result to the internal parser; S530, the internal parser sends the data query result to the business platform, so that the business platform receives the data query result.
  • an embodiment of the present disclosure provides a data query device based on a domain-specific language, including at least:
  • the first processing unit 801 is used to receive the data query statement sent by the business platform; that is to say, the first processing unit 801 is used to obtain the structured query language SQL statement based on the data query request;
  • the second processing unit 802 is configured to, if the data query statement is a non-Chinese structured query language statement, use the set basic parser to generate, based on the data query statement, an abstract syntax tree including each rule node, wherein a rule node is used to represent a grammar rule included in the data query statement; that is, the second processing unit 802 is configured to obtain an abstract syntax tree based on the SQL statement, the abstract syntax tree includes rule nodes, and the rule nodes are used to characterize the grammar rules that the SQL statement follows;
  • the third processing unit 803 is configured to obtain parsing results corresponding to the respective rule nodes based on the abstract syntax tree, generate a corresponding elastic search domain-specific language based on the obtained parsing results, and send the elastic search domain-specific language to the elastic search search engine; that is, the third processing unit 803 is configured to obtain the elastic search ES query statement based on the parsing result corresponding to the rule node;
  • the fourth processing unit 804 is configured to receive the data query result returned by the elastic search engine based on the elastic search domain-specific language, and send the data query result to the business platform;
  • the fourth processing unit 804 is configured to send a data query result, where the data query result is a result obtained from an ES search engine based on the ES query statement.
  • the first processing unit 801 is further configured to: upon receiving a visual query request sent by the business platform, load the configured query recommendation information and send it to the business platform, where the query recommendation information contains at least a set of candidate search keywords; receive, in turn, each search keyword returned by the business platform based on the query recommendation information, send the search keywords to the Elasticsearch search engine, and receive the keyword value set that the search engine returns for each keyword; send the keyword value sets to the business platform, and receive the data query statement that the business platform returns based on the keyword value sets.
  • the first processing unit 801 is further configured to: acquire a search keyword; acquire a keyword value set based on the search keyword, where the keyword value set is the result obtained by querying the ES search engine based on the search keyword; send the keyword value set; and receive the data query request returned based on the keyword value set.
  • the first processing unit 801 is further configured to: send query recommendation information based on the visual query request, where the query recommendation information includes at least a set of candidate search keywords; and receive the search keyword returned based on the query recommendation information.
  • the first processing unit 801 is further configured to: acquire the Chinese SQL statement carried in the data query request; and map the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping between Chinese SQL statements and SQL statements.
  • the second processing unit 802 is further configured to: if the data query statement is a Chinese SQL statement, perform syntax validation on the Chinese SQL statement based on the configured SQL grammar rules, and perform value validation on the data query statement based on the configured value validation rules; if the data query statement passes the value validation, generate the corresponding non-Chinese SQL statement from the data query statement based on the configured digital dictionary, where the digital dictionary contains the mapping between Chinese SQL statements and non-Chinese SQL statements; and, based on the non-Chinese SQL statement, use the configured base parser to generate an abstract syntax tree containing rule nodes.
  • the second processing unit 802 is further configured to: perform syntax verification on the Chinese SQL statement based on SQL syntax rules; perform value verification on the Chinese SQL statement based on value verification rules; When the Chinese SQL statement passes the syntax check and the value check, the step of mapping the Chinese SQL statement based on a digital dictionary is performed to obtain the SQL statement.
  • the second processing unit 802 is configured to: if the data query statement does not contain a first-type keyword, which represents a data sorting condition, generate, based on the configured digital dictionary and the configured sorting condition, a corresponding non-Chinese SQL statement that contains the first-type keyword; if the data query statement contains a second-type keyword, which has at least one corresponding non-Chinese keyword, generate, based on the configured digital dictionary and the at least one non-Chinese keyword, a corresponding non-Chinese SQL statement that contains the at least one non-Chinese keyword; if the data query statement contains a third-type keyword, which indicates that a reference relationship exists, determine the corresponding search conditions based on the reference relationship and generate, based on the configured digital dictionary, a corresponding non-Chinese SQL statement that contains those search conditions.
  • the second processing unit 802 is configured to: when the Chinese SQL statement does not contain a first-type keyword, generate, based on the digital dictionary, the SQL statement containing the first-type keyword, where the first-type keyword characterizes a data sorting condition; when the Chinese SQL statement contains a second-type keyword, generate, based on the digital dictionary, the SQL statement containing the corresponding plurality of non-Chinese keywords, where the second-type keyword corresponds to the plurality of non-Chinese keywords in the digital dictionary; when the Chinese SQL statement contains a third-type keyword, determine the corresponding search condition based on a reference relationship and generate, based on the digital dictionary, the SQL statement containing the search condition, where the third-type keyword indicates that the reference relationship exists in the Chinese SQL statement.
  • the second processing unit 802 is configured to: generate the abstract syntax tree by using a first parser based on the SQL statement.
  • the second processing unit 802 is configured to: based on the data query statement, use a set lexical analyzer to determine each lexical unit included in the data query statement; based on the various lexical units, Using the set syntax analyzer, determine each rule node included in the data query statement, and generate a corresponding abstract syntax tree based on each rule node; wherein, the lexical analyzer and the syntax analyzer are based on The set structured query language grammar rules are generated using the set base parser.
  • the second processing unit 802 is configured to: based on the SQL statement, use a lexical analyzer to determine the lexical units contained in the SQL statement; based on the lexical units, use a syntax analyzer to determine the rule nodes contained in the SQL statement; and, based on the rule nodes, generate the corresponding abstract syntax tree; where the lexical analyzer and the syntax analyzer are generated with the first parser based on SQL grammar rules.
  • when obtaining the parsing result corresponding to each rule node based on the abstract syntax tree and generating the corresponding ES DSL statement based on the obtained parsing results, the third processing unit 803 is configured to: using the configured depth-first algorithm, sequentially read each rule node and each leaf node contained in the abstract syntax tree, where a leaf node represents a lexical unit contained in the data query statement; based on each rule node, generate the corresponding initial parsing result according to the preset correspondence between SQL grammar rules and ES DSL grammar rules, and, based on each leaf node, generate the corresponding intermediate parsing result; based on the context of the data query statement, merge the initial parsing results and the intermediate parsing results to generate the corresponding parsing results; and, based on the context of the data query statement, merge the parsing results to generate the corresponding ES DSL statement.
  • the third processing unit 803 is configured to: read the rule node contained in the abstract syntax tree and the leaf node corresponding to the rule node, where the leaf node represents a lexical unit in the SQL statement; based on the rule node, generate an initial parsing result according to the correspondence between SQL grammar rules and the grammar rules of the ES domain-specific language; based on the leaf node, generate an intermediate parsing result; and, based on the context of the SQL statement, merge the initial parsing result and the intermediate parsing result to generate the parsing result corresponding to the rule node.
  • the third processing unit 803 is configured to: combine each of the parsing results based on the context of the SQL statement to generate the corresponding ES query statement.
  • an embodiment of the present disclosure provides a data query device based on a domain-specific language, including at least:
  • the processor 902 is configured to read and execute the executable instructions stored in the memory to perform the following operations: obtain a structured query language (SQL) statement based on the data query request; obtain an abstract syntax tree based on the SQL statement, where the abstract syntax tree includes rule nodes and each rule node represents a grammar rule followed by the SQL statement; obtain the Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and send the data query result, where the data query result is obtained by querying the ES search engine based on the ES query statement.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: obtain a search keyword; obtain a keyword value set based on the search keyword, where the keyword value set is the result obtained by querying the ES search engine based on the search keyword; send the keyword value set; and receive the data query request returned based on the keyword value set.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: send query recommendation information based on the visual query request, where the query recommendation information includes at least a set of candidate search keywords; and receive the search keyword returned based on the query recommendation information.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: obtain the Chinese SQL statement carried in the data query request; and map the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping between Chinese SQL statements and SQL statements.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: perform syntax validation on the Chinese SQL statement based on SQL grammar rules; perform value validation on the Chinese SQL statement based on value validation rules; and, when the Chinese SQL statement passes both the syntax validation and the value validation, perform the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: when the Chinese SQL statement does not contain a first-type keyword, generate, based on the digital dictionary, the SQL statement containing the first-type keyword, where the first-type keyword represents a data sorting condition; when the Chinese SQL statement contains a second-type keyword, generate, based on the digital dictionary, the SQL statement containing the corresponding plurality of non-Chinese keywords, where the second-type keyword corresponds to the plurality of non-Chinese keywords in the digital dictionary; when the Chinese SQL statement contains a third-type keyword, determine the corresponding search condition based on a reference relationship and generate, based on the digital dictionary, the SQL statement containing the search condition, where the third-type keyword indicates that the reference relationship exists in the Chinese SQL statement.
  • the processor is configured to read and execute the executable instructions stored in the memory, so as to realize the following operation: based on the SQL statement, using a first parser to generate the abstract syntax tree.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: based on the SQL statement, use a lexical analyzer to determine the lexical units contained in the SQL statement; based on the lexical units, use a syntax analyzer to determine the rule nodes contained in the SQL statement; and, based on the rule nodes, generate the corresponding abstract syntax tree; where the lexical analyzer and the syntax analyzer are generated with the first parser based on SQL grammar rules.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: read the rule node contained in the abstract syntax tree and the leaf node corresponding to the rule node, where the leaf node represents a lexical unit in the SQL statement; based on the rule node, generate an initial parsing result according to the correspondence between SQL grammar rules and the grammar rules of the ES domain-specific language; based on the leaf node, generate an intermediate parsing result; and, based on the context of the SQL statement, merge the initial parsing result and the intermediate parsing result to generate the parsing result corresponding to the rule node.
  • the processor is configured to read and execute the executable instructions stored in the memory to perform the following operation: based on the context of the SQL statement, merge the parsing results to generate the corresponding ES query statement.
  • the bus architecture may include any number of interconnected buses and bridges, specifically one or more processors represented by processor 902 and various circuits of memory represented by memory 901 are linked together.
  • the bus architecture can also link together various other circuits such as peripherals, voltage regulators, and power management circuits.
  • the bus interface provides the interface.
  • Transceiver 903 may comprise a number of elements, including a transmitter and a receiver, that provide means for communicating with various other devices over a transmission medium.
  • the processor 902 is responsible for managing the bus architecture and general processing, and the memory 901 may store data used by the processor 902 in performing operations.
  • an embodiment of the present disclosure provides a storage medium storing instructions; when the instructions in the storage medium are executed by a processor, the processor is caused to perform the following operations: obtain a structured query language (SQL) statement based on a data query request; obtain an abstract syntax tree based on the SQL statement, where the abstract syntax tree includes rule nodes and each rule node represents a grammar rule followed by the SQL statement; obtain the Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and send the data query result, where the data query result is obtained by querying the ES search engine based on the ES query statement.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operations: acquire a search keyword; acquire a keyword value set based on the search keyword, where the keyword value set is the result obtained by querying the ES search engine based on the search keyword; send the keyword value set; and receive the data query request returned based on the keyword value set.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operations: send query recommendation information based on the visual query request, where the query recommendation information includes at least a set of candidate search keywords; and receive the search keyword returned based on the query recommendation information.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operations: obtain the Chinese SQL statement carried in the data query request; and map the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping between Chinese SQL statements and SQL statements.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operations: perform syntax validation on the Chinese SQL statement based on SQL grammar rules; perform value validation on the Chinese SQL statement based on value validation rules; and, when the Chinese SQL statement passes both the syntax validation and the value validation, perform the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operations: when the Chinese SQL statement does not contain a first-type keyword, generate, based on the digital dictionary, the SQL statement containing the first-type keyword, where the first-type keyword represents a data sorting condition; when the Chinese SQL statement contains a second-type keyword, generate, based on the digital dictionary, the SQL statement containing the corresponding plurality of non-Chinese keywords, where the second-type keyword corresponds to the plurality of non-Chinese keywords in the digital dictionary; when the Chinese SQL statement contains a third-type keyword, determine the corresponding search condition based on a reference relationship and generate, based on the digital dictionary, the SQL statement containing the search condition, where the third-type keyword indicates that the reference relationship exists in the Chinese SQL statement.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operation: based on the SQL statement, use a first parser to generate the abstract syntax tree.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operations: based on the SQL statement, use a lexical analyzer to determine the lexical units contained in the SQL statement; based on the lexical units, use a syntax analyzer to determine the rule nodes contained in the SQL statement; and, based on the rule nodes, generate the corresponding abstract syntax tree; where the lexical analyzer and the syntax analyzer are generated with the first parser based on SQL grammar rules.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operations: read the rule node contained in the abstract syntax tree and the leaf node corresponding to the rule node, where the leaf node represents a lexical unit in the SQL statement; based on the rule node, generate an initial parsing result according to the correspondence between SQL grammar rules and the grammar rules of the ES domain-specific language; based on the leaf node, generate an intermediate parsing result; and, based on the context of the SQL statement, merge the initial parsing result and the intermediate parsing result to generate the parsing result corresponding to the rule node.
  • when the instructions in the storage medium are executed by the processor, the processor is caused to perform the following operation: based on the context of the SQL statement, merge the parsing results to generate the corresponding ES query statement.
  • the conversion between SQL statements and the Elasticsearch domain-specific language is implemented according to SQL syntax, so developers who are familiar with SQL syntax can query the Elasticsearch search engine with ordinary SQL statements, without the added cost of learning Elasticsearch query syntax; system development efficiency and system usability are therefore greatly improved.
  • performing data queries through the Elasticsearch search engine guarantees data query efficiency.
  • a multi-layer validator and parser are implemented, which completely eliminates the risk of SQL injection and improves system security.
  • embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • a computer program product comprising one or more instructions executable by one or more processors of a computer device to cause the computer device to perform the data query method of the various embodiments described above.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, where the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

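A data query method and apparatus, relating to the field of computer technology. The method includes: obtaining a Structured Query Language (SQL) statement based on a data query request; obtaining an abstract syntax tree based on the SQL statement, where the abstract syntax tree contains rule nodes and each rule node characterizes a grammar rule that the SQL statement follows; obtaining an Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and sending a data query result, where the data query result is obtained by querying an ES search engine based on the ES query statement.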

Description

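Data Query Method and Apparatus
This application claims priority to Chinese Patent Application No. 202010948413.8, filed on September 10, 2020 and entitled "Data Query Method and Apparatus Based on a Domain-Specific Language", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of computer technology, and in particular to a data query method, a data query apparatus, and a storage medium.
Background
As business system data grows explosively and system complexity keeps increasing, search capabilities built on relational databases struggle to meet users' ever-higher search requirements.
Currently, data search is typically performed in one of the following ways:
In the first approach, an Elasticsearch (ES) search engine is introduced as the core of the system's search capability. The multiple business models stored in the database are built into system-wide objects of the search engine, so that the database focuses on data storage and the ES search engine focuses on data search.
In the second approach, Apache Calcite is introduced. The structure of Apache Calcite is shown in FIG. 1. Apache Calcite provides standard Structured Query Language (SQL), multiple query optimizations, and the ability to connect to various data sources.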
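Summary
The present disclosure provides a data query method, a data query apparatus, and a storage medium. The technical solutions provided by embodiments of the present disclosure are as follows:
In one aspect, a data query method is provided, including: obtaining a Structured Query Language (SQL) statement based on a data query request; obtaining an abstract syntax tree based on the SQL statement, where the abstract syntax tree contains rule nodes and each rule node characterizes a grammar rule that the SQL statement follows; obtaining an Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and sending a data query result, where the data query result is obtained by querying an ES search engine based on the ES query statement.
In some embodiments, the method further includes: obtaining a search keyword; obtaining a keyword value set based on the search keyword, where the keyword value set is obtained by querying the ES search engine based on the search keyword; sending the keyword value set; and receiving the data query request returned based on the keyword value set.
In some embodiments, obtaining the search keyword includes: sending query recommendation information based on a visual query request, where the query recommendation information contains at least a set of candidate search keywords; and receiving the search keyword returned based on the query recommendation information.
In some embodiments, obtaining the SQL statement based on the data query request includes: obtaining a Chinese SQL statement carried in the data query request; and mapping the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping between Chinese SQL statements and SQL statements.
In some embodiments, the method further includes: performing syntax validation on the Chinese SQL statement based on SQL grammar rules; performing value validation on the Chinese SQL statement based on value validation rules; and, when the Chinese SQL statement passes both the syntax validation and the value validation, performing the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
In some embodiments, mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement includes: when the Chinese SQL statement does not contain a first-type keyword, generating, based on the digital dictionary, the SQL statement containing the first-type keyword, where the first-type keyword characterizes a data sorting condition; when the Chinese SQL statement contains a second-type keyword, generating, based on the digital dictionary, the SQL statement containing the corresponding plurality of non-Chinese keywords, where the second-type keyword corresponds to the plurality of non-Chinese keywords in the digital dictionary; and, when the Chinese SQL statement contains a third-type keyword, determining the corresponding search condition based on a reference relationship and generating, based on the digital dictionary, the SQL statement containing the search condition, where the third-type keyword indicates that the reference relationship exists in the Chinese SQL statement.
In some embodiments, obtaining the abstract syntax tree based on the SQL statement includes: generating the abstract syntax tree from the SQL statement using a first parser.
In some embodiments, generating the abstract syntax tree from the SQL statement using the first parser includes: based on the SQL statement, using a lexical analyzer to determine the lexical units contained in the SQL statement; based on the lexical units, using a syntax analyzer to determine the rule nodes contained in the SQL statement; and, based on the rule nodes, generating the corresponding abstract syntax tree; where the lexical analyzer and the syntax analyzer are generated with the first parser based on SQL grammar rules.
In some embodiments, the method further includes: reading the rule node contained in the abstract syntax tree and the leaf node corresponding to the rule node, where the leaf node characterizes a lexical unit in the SQL statement; based on the rule node, generating an initial parsing result according to the correspondence between SQL grammar rules and the grammar rules of the ES domain-specific language; based on the leaf node, generating an intermediate parsing result; and, based on the context of the SQL statement, merging the initial parsing result and the intermediate parsing result to generate the parsing result corresponding to the rule node.
In some embodiments, obtaining the ES query statement based on the parsing results corresponding to the rule nodes includes: based on the context of the SQL statement, merging the parsing results to generate the corresponding ES query statement.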
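In another aspect, a data query apparatus is provided, including: a memory configured to store executable instructions; and a processor configured to read and execute the executable instructions stored in the memory to perform the following operations: obtaining a Structured Query Language (SQL) statement based on a data query request; obtaining an abstract syntax tree based on the SQL statement, where the abstract syntax tree contains rule nodes and each rule node characterizes a grammar rule that the SQL statement follows; obtaining an Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and sending a data query result, where the data query result is obtained by querying an ES search engine based on the ES query statement.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: obtaining a search keyword; obtaining a keyword value set based on the search keyword, where the keyword value set is obtained by querying the ES search engine based on the search keyword; sending the keyword value set; and receiving the data query request returned based on the keyword value set.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: sending query recommendation information based on a visual query request, where the query recommendation information contains at least a set of candidate search keywords; and receiving the search keyword returned based on the query recommendation information.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: obtaining a Chinese SQL statement carried in the data query request; and mapping the Chinese SQL statement based on a digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping between Chinese SQL statements and SQL statements.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: performing syntax validation on the Chinese SQL statement based on SQL grammar rules; performing value validation on the Chinese SQL statement based on value validation rules; and, when the Chinese SQL statement passes both the syntax validation and the value validation, performing the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: when the Chinese SQL statement does not contain a first-type keyword, generating, based on the digital dictionary, the SQL statement containing the first-type keyword, where the first-type keyword characterizes a data sorting condition; when the Chinese SQL statement contains a second-type keyword, generating, based on the digital dictionary, the SQL statement containing the corresponding plurality of non-Chinese keywords, where the second-type keyword corresponds to the plurality of non-Chinese keywords in the digital dictionary; and, when the Chinese SQL statement contains a third-type keyword, determining the corresponding search condition based on a reference relationship and generating, based on the digital dictionary, the SQL statement containing the search condition, where the third-type keyword indicates that the reference relationship exists in the Chinese SQL statement.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operation: generating the abstract syntax tree from the SQL statement using a first parser.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: based on the SQL statement, using a lexical analyzer to determine the lexical units contained in the SQL statement; based on the lexical units, using a syntax analyzer to determine the rule nodes contained in the SQL statement; and, based on the rule nodes, generating the corresponding abstract syntax tree; where the lexical analyzer and the syntax analyzer are generated with the first parser based on SQL grammar rules.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operations: reading the rule node contained in the abstract syntax tree and the leaf node corresponding to the rule node, where the leaf node characterizes a lexical unit in the SQL statement; based on the rule node, generating an initial parsing result according to the correspondence between SQL grammar rules and the grammar rules of the ES domain-specific language; based on the leaf node, generating an intermediate parsing result; and, based on the context of the SQL statement, merging the initial parsing result and the intermediate parsing result to generate the parsing result corresponding to the rule node.
In some embodiments, the processor is configured to read and execute the executable instructions stored in the memory to perform the following operation: based on the context of the SQL statement, merging the parsing results to generate the corresponding ES query statement.
In another aspect, a storage medium storing instructions is provided. When the instructions in the storage medium are executed by a processor, the processor is caused to perform the following operations: obtaining a Structured Query Language (SQL) statement based on a data query request; obtaining an abstract syntax tree based on the SQL statement, where the abstract syntax tree contains rule nodes and each rule node characterizes a grammar rule that the SQL statement follows; obtaining an Elasticsearch (ES) query statement based on the parsing results corresponding to the rule nodes; and sending a data query result, where the data query result is obtained by querying an ES search engine based on the ES query statement.
In another aspect, a computer program product is provided, including one or more instructions that can be executed by one or more processors of a computer device to cause the computer device to perform the data query method of any implementation of the first aspect above.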
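Brief Description of the Drawings
FIG. 1 is a schematic diagram of the architecture of Apache Calcite as provided in the related art;
FIG. 2A is a schematic diagram of the interaction among a dual parser, a business platform, and an ES search engine provided in an embodiment of the present disclosure;
FIG. 2B is a schematic flowchart of a data query method based on a domain-specific language provided in an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of another data query method based on a domain-specific language provided in an embodiment of the present disclosure;
FIG. 4A is a schematic diagram of an AST provided in an embodiment of the present disclosure;
FIG. 4B is a schematic flowchart of converting a non-Chinese SQL statement into ES DSL provided in an embodiment of the present disclosure;
FIG. 5 is a schematic flowchart of another data query method based on a domain-specific language provided in an embodiment of the present disclosure;
FIGS. 6A to 6E are a set of schematic diagrams of interaction interfaces provided in an embodiment of the present disclosure;
FIGS. 7A to 7C are schematic diagrams of three cases of converting a Chinese SQL statement into a non-Chinese SQL statement provided in an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of the logical structure of a data query apparatus based on a domain-specific language provided in an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of the physical structure of a data query apparatus based on a domain-specific language provided in an embodiment of the present disclosure.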
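Detailed Description
Before the embodiments of the present disclosure are described, the terms involved are explained as follows:
Domain-specific language (DSL): also known as a domain-dedicated language, a computer language focused on a particular application domain. A DSL is intended for a particular context within a particular domain; the domain may be a business context (for example, banking or insurance) or an application context (for example, web applications or databases). The contrasting concept is the general-purpose language (GPL), which can be applied broadly to all kinds of business or application problems. A DSL is not broadly applicable; it is designed only for the domain it serves, but it is sufficient to express problems in that domain and to build the corresponding solutions. For example, HTML (HyperText Markup Language) is a typical DSL: it is the language used for web applications, and although HTML cannot perform numeric computation, this does not prevent its wide use on websites. Java, by contrast, is a typical GPL: it runs on personal computers and mobile terminals and can be embedded in applications in banking, finance, insurance, manufacturing, and other industries.
Structured Query Language (SQL): a special-purpose programming language for database query and programming, used to access data and to query, update, and manage relational database systems. SQL is a high-level, non-procedural programming language that lets users work on high-level data structures. SQL does not require users to specify how data is stored or to understand the specific storage method, so database systems with entirely different underlying structures can use the same SQL as the interface for data input and management. SQL statements can be nested, which gives SQL great flexibility and power.
Abstract syntax tree (AST): also called simply a syntax tree. In computer science, an AST is an abstract representation of the syntactic structure of source code. An AST presents the syntactic structure of a programming language as a tree, in which each node represents a construct in the source code.
Elasticsearch (ES): an open-source, highly scalable, distributed full-text search engine. ES is developed in Java and uses Lucene (a search engine library) at its core to implement all indexing and search functionality, but its aim is to hide the complexity of Lucene behind a simple RESTful API (Application Programming Interface) and so make full-text search simple. RESTful is a design style and development approach for network applications, suited to scenarios in which mobile Internet vendors expose business interfaces.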
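Embodiments of the present disclosure provide a solution for data query based on a domain-specific language. The solution includes: in response to a data query request received from a business platform, obtaining an SQL statement based on the data query request; parsing the SQL statement with the configured base parser, that is, the first parser, to generate an AST containing rule nodes; then obtaining the parsing result corresponding to each rule node based on the AST and generating the corresponding ES query statement based on the parsing results; and afterwards receiving the data query result returned by the ES search engine and sending the data query result to the business platform.
To help those of ordinary skill in the art better understand the technical solutions of the present disclosure, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings.
Referring to FIG. 2A, an embodiment of the present disclosure provides a dual parser that contains at least an external parser and an internal parser.
The external parser contains at least a Chinese parser and a validator. The Chinese parser converts Chinese SQL statements into non-Chinese SQL statements, and the validator performs value validation on the keyword values sent by the business platform. Here, a Chinese SQL statement is SQL pseudocode written in Chinese natural language following SQL grammar rules, whereas a non-Chinese SQL statement is a conventional SQL statement (not pseudocode).
The internal parser contains at least a base parser, a chain parser, and a local parser. The base parser generates the corresponding AST from a non-Chinese SQL statement, that is, it obtains the AST based on the SQL statement; the chain parser generates the corresponding ES query statement (a kind of DSL) from the AST; and the local parser returns the syntax validation result and syntax parsing information for a Chinese SQL statement.
In embodiments of the present disclosure, the external parser and the internal parser may be used separately or integrated into a single apparatus; the present disclosure does not limit this.
In the structural design of the dual parser in embodiments of the present disclosure, the internal parser follows SQL syntax and implements the conversion from non-Chinese SQL statements to the ES DSL based on ANTLR (Another Tool for Language Recognition, an open-source parser generator that can automatically build a syntax tree from its input and display it visually), while the external parser implements the conversion from Chinese SQL statements to non-Chinese SQL statements in Java. In other words, the internal parser converts SQL statements into ES query statements, and the external parser converts Chinese SQL statements into SQL statements, which provides a near-natural-language way to query. In this way, grammar rules can be flexibly extended according to actual business needs.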
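Referring to FIG. 2B, in embodiments of the present disclosure the flow of a data query based on a domain-specific language is as follows:
S201: The server receives a data query statement sent by the business platform.
The data query statement is either a Chinese SQL statement or an SQL statement. If it is an SQL statement, which is by definition a non-Chinese SQL statement, S202 is performed directly; if it is a Chinese SQL statement, the server converts it into the corresponding SQL statement and then performs S202.
S201 above is one implementation of the server obtaining the SQL statement based on the data query request. The server may be a cluster device that provides back-end services to the business platform. In some embodiments, a user logs in to the business platform on a terminal and triggers a data query request; the terminal sends the data query request to the server; and the server receives the request and obtains the SQL statement based on it.
For example, if the user is a technician who knows how to write SQL statements, the data query request that the terminal sends to the server usually carries an SQL statement (not a Chinese SQL statement), and the server only needs to parse the data query request to obtain that SQL statement.
For example, if the user is not a technician, the data query request that the terminal sends to the server usually carries a Chinese SQL statement, that is, SQL pseudocode that the user writes in Chinese natural language following SQL grammar rules. After parsing the data query request to obtain the Chinese SQL statement, the server also needs to convert the Chinese SQL statement into an SQL statement.
S202: If the data query statement is a non-Chinese SQL statement, the server uses the configured base parser to generate, from the data query statement, an abstract syntax tree containing rule nodes, where each rule node characterizes a grammar rule contained in the data query statement.
Here, the non-Chinese SQL statement refers to the SQL statement.
S202 above is one implementation of the server obtaining the abstract syntax tree based on the SQL statement, where the abstract syntax tree contains rule nodes and each rule node characterizes a grammar rule that the SQL statement follows.
S203: The server obtains the parsing result corresponding to each rule node based on the abstract syntax tree, generates the corresponding Elasticsearch domain-specific language statement based on the obtained parsing results, and sends that statement to the Elasticsearch search engine.
Here, the Elasticsearch (ES) domain-specific language statement refers to the ES query statement.
S203 above is one implementation of the server obtaining the ES query statement based on the parsing results corresponding to the rule nodes, where the ES query statement has the same semantics as the SQL statement.
S204: The server receives the data query result that the Elasticsearch search engine returns based on the Elasticsearch domain-specific language statement, and sends the data query result to the business platform.
Here, the Elasticsearch search engine refers to the ES search engine.
S204 above is one implementation of the server sending the data query result, where the data query result is obtained by querying the ES search engine based on the ES query statement.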
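The control flow of S201 to S204 can be sketched as a single function. This is only an illustrative sketch: every callable passed in (`is_chinese_sql`, `chinese_to_sql`, `sql_to_ast`, `ast_to_es_query`, `es_search`) is a hypothetical stand-in for a component of the dual parser or for the ES search engine, and none of these names comes from the disclosure.

```python
from typing import Any, Callable, Dict


def handle_data_query(
    statement: str,
    is_chinese_sql: Callable[[str], bool],
    chinese_to_sql: Callable[[str], str],
    sql_to_ast: Callable[[str], Any],
    ast_to_es_query: Callable[[Any], Dict[str, Any]],
    es_search: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Run one data query through steps S201 to S204 (all callables are hypothetical)."""
    # S201: obtain a plain SQL statement; a Chinese SQL statement is converted first.
    sql = chinese_to_sql(statement) if is_chinese_sql(statement) else statement
    # S202: parse the SQL statement into an AST made of rule nodes.
    ast = sql_to_ast(sql)
    # S203: turn the AST into an ES query statement and send it to the search engine.
    es_query = ast_to_es_query(ast)
    result = es_search(es_query)
    # S204: hand the search engine's result back to the caller (the business platform).
    return result
```

Passing the stages in as callables keeps the sketch independent of any particular parser or Elasticsearch client; a real deployment would bind them to the external parser, the internal parser, and an actual search client.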
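Next, the data query flow based on a domain-specific language described above is explained in more detail.
Referring to FIG. 3, in embodiments of the present disclosure the detailed flow of a data query using the dual parser described above is as follows:
S301: The server receives, through the external parser, a data query statement sent by the business platform.
The data query statement is either a Chinese SQL statement or an SQL statement. If it is an SQL statement, which is by definition a non-Chinese SQL statement, S302 is performed directly; if it is a Chinese SQL statement, the server converts it into the corresponding SQL statement and then performs S302.
S301 above means that the server receives the data query request sent by the business platform and, through the external parser, obtains the SQL statement based on the data query request. In some embodiments, the data query request carries a Chinese SQL statement, and the server calls the external parser to convert the Chinese SQL statement into the SQL statement. In other embodiments, the data query request carries the SQL statement directly, and the server calls the external parser to read the SQL statement.
In some embodiments, converting the Chinese SQL statement into the SQL statement includes: the server obtains the Chinese SQL statement carried in the data query request, and maps the Chinese SQL statement based on the digital dictionary to obtain the SQL statement, where the digital dictionary contains the mapping between Chinese SQL statements and SQL statements.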
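As a rough illustration of dictionary-based mapping, the sketch below replaces whitespace-separated Chinese tokens with their SQL counterparts. The dictionary entries, the token-by-token strategy, and the names `DIGITAL_DICTIONARY` and `map_chinese_sql` are all assumptions made for illustration; this section of the disclosure does not list the dictionary contents or specify the tokenization.

```python
from typing import Dict

# Hypothetical dictionary entries; a real mapping would be configured by the system.
DIGITAL_DICTIONARY: Dict[str, str] = {
    "查询": "SELECT",
    "来自": "FROM",
    "条件": "WHERE",
    "任务编号": "taskId",
    "任务": "task",
}


def map_chinese_sql(chinese_sql: str, dictionary: Dict[str, str]) -> str:
    """Replace each whitespace-separated token that has a dictionary entry."""
    tokens = chinese_sql.split()
    # Tokens without an entry (operators, literals) pass through unchanged.
    return " ".join(dictionary.get(token, token) for token in tokens)


print(map_chinese_sql("查询 任务编号 来自 任务 条件 任务编号 = 'T1'", DIGITAL_DICTIONARY))
```

With these assumed entries the call prints `SELECT taskId FROM task WHERE taskId = 'T1'`. Splitting on whitespace is the simplest possible tokenization and would mangle string literals that contain spaces; a real external parser would need a proper lexer.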
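For example, the external parser receives data query statement 1 sent by the business platform; data query statement 1 is "SELECT taskId FROM task WHERE taskId='T1'".
S302: If the data query statement is a non-Chinese SQL statement, the server performs value validation on the data query statement through the external parser.
Here, the non-Chinese SQL statement refers to the SQL statement.
S302 above means that the server performs value validation on the SQL statement through the external parser. The embodiments of the present disclosure are described using only the example in which the server validates the SQL statement directly. In some embodiments, after receiving the data query request in S301, if the request carries a Chinese SQL statement, the server parses the request to obtain the Chinese SQL statement and then performs the following operations directly on it: syntax validation based on SQL grammar rules; value validation based on value validation rules; and, when the Chinese SQL statement passes both validations, the step of mapping the Chinese SQL statement based on the digital dictionary to obtain the SQL statement.
In embodiments of the present disclosure, a data query statement is composed of search keywords, and the corresponding value validation rules are preset based on the configured search keyword types.
The configured search keyword types may include, but are not limited to, any one or a combination of the string type, integer type, floating-point type, time type, Boolean type, array type, and object type.
In some embodiments, the validator in the external parser performs value validation, based on the configured value validation rules, on the keyword value corresponding to each search keyword contained in the data query statement.
For example, suppose that, under the configured value validation rules, the value of a string-type search keyword must have a length of 1 to 10 characters, and that data query statement 1 is a non-Chinese SQL statement. Taking only "taskId='T1'" as an example, the validator in the external parser applies the configured value validation rules to the value "T1" of the search keyword "taskId" in data query statement 1. The value "T1" falls within the length range of 1 to 10 characters, so it passes value validation.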
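The string-length rule from the example above can be sketched as follows. The rule table, the class name `StringLengthRule`, and the decision to reject keywords that have no configured rule are assumptions for illustration only; the disclosure states only that rules are preset per keyword type.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class StringLengthRule:
    """Value rule for a string-typed search keyword (length bounds are inclusive)."""

    min_len: int = 1
    max_len: int = 10

    def accepts(self, value: object) -> bool:
        return isinstance(value, str) and self.min_len <= len(value) <= self.max_len


# Hypothetical rule table: search keyword name -> rule for its values.
VALUE_RULES: Dict[str, StringLengthRule] = {"taskId": StringLengthRule()}


def passes_value_check(keyword_values: Dict[str, object]) -> bool:
    """Return True only if every keyword has a rule and its value satisfies that rule."""
    return all(
        name in VALUE_RULES and VALUE_RULES[name].accepts(value)
        for name, value in keyword_values.items()
    )
```

Under these assumptions, `passes_value_check({"taskId": "T1"})` is true, while an empty value or an 11-character value is rejected. Other keyword types (integer, time, and so on) would get their own rule classes with the same `accepts` method.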
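S303: If the data query statement passes value validation, the server sends the data query statement to the internal parser through the external parser.
S303 above means that, when the SQL statement passes value validation, the external parser sends the SQL statement to the internal parser.
In some embodiments, if the keyword value corresponding to each search keyword contained in the data query statement passes value validation, the external parser sends the data query statement to the internal parser.
For example, if the value "T1" of the search keyword "taskId" contained in data query statement 1 passes value validation, the external parser sends data query statement 1 to the internal parser.
S304: Through the internal parser, the server uses the configured base parser to generate, from the data query statement, an AST containing rule nodes.
Here, the base parser refers to the first parser.
S304 above means that the server receives the SQL statement through the internal parser and, based on the SQL statement, uses the first parser to generate the abstract syntax tree (AST), where the AST contains rule nodes and each rule node characterizes a grammar rule that the SQL statement follows.
It should be noted that, in embodiments of the present disclosure, the internal parser may describe the SQL grammar rules in the ANTLR meta-language and, based on the written SQL grammar rules, generate the base parser with the ANTLR grammar generation tool. This provides most of the basic parsing functionality; compared with implementing each function one by one from first principles, it improves system development efficiency and reduces wasted system resources.
In some embodiments, the corresponding lexical analyzer and syntax analyzer are generated in advance with ANTLR, based on the preset SQL grammar rules.
The SQL grammar rules contain at least lexical rules, grammar rules, and syntactic rules. Lexical rules characterize the structure of lexical units; grammar rules characterize the structure of the words formed from lexical units; and syntactic rules characterize the relationships among the words in a data query statement.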
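In the embodiments of the present disclosure, the SELECT syntactic rule in the SQL grammar rules may be described in the ANTLR meta-language as:
SELECT fields FROM table whereClause? groupbyClause? orderClause? limitClause
In the embodiments of the present disclosure, grammar rules such as the SELECT, WHERE, ORDER, LIMIT, and GROUPBY grammars may be described in the ANTLR meta-language.
The meta-language description of the WHERE grammar is: whereClause: WHERE expression, where expression denotes an expression. The WHERE grammar expresses querying data according to search conditions.
The meta-language description of the GROUPBY grammar is: groupbyClause: GROUP BY ID (COMMA ID)*. The GROUPBY grammar expresses grouping the data query results by the specified columns.
The meta-language description of the ORDER grammar is:
orderClause: ORDER BY order (COMMA order)*
order: name (ASC|DESC)?
where ASC denotes ascending order and DESC denotes descending order. The ORDER grammar expresses sorting the data query results by the specified fields.
The meta-language description of the LIMIT grammar is: limitClause: LIMIT (offset=INT COMMA)? size=INT. The LIMIT grammar limits the number of records in the data query results.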
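In the embodiments of the present disclosure, the expression in the WHERE grammar may include, but is not limited to, any one or a combination of the following rules:
LPAREN expression RPAREN; a parenthesized expression;
leftExpr=expression operator=(MUL|DIV|MOD) rightExpr=expression; multiplication, division, and modulo operations;
leftExpr=expression operator=(PLUS|MINUS) rightExpr=expression; addition and subtraction operations;
leftExpr=expression operator=(LSH|RSH|USH) rightExpr=expression; left shift, right shift, and unsigned right shift operations;
leftExpr=expression operator=(LT|LTE|GT|GTE) rightExpr=expression; less than, less than or equal to, greater than, and greater than or equal to operations;
leftExpr=expression operator=(EQ|NE|AEQ|NAEQ|TEQ|NTEQ|MPPEQ|NMPPEQ) rightExpr=expression; equal, not equal, fuzzy query, negated fuzzy query, phrase match, negated phrase match, prefix phrase match, and negated prefix phrase match operations;
leftExpr=expression operator=AND rightExpr=expression; the AND operation;
leftExpr=expression operator=OR rightExpr=expression; the OR operation;
expr=name BETWEEN left=identity AND right=identity; a query for data whose value lies between two values;
leftExpr=expression operator=XOR rightExpr=expression; the XOR operation;
inClause; a query over multiple fields.
Taking "leftExpr=expression operator=AND rightExpr=expression" alone as an example, "leftExpr=expression" denotes the left expression, "operator=AND" indicates that the operator is the AND operation, and "rightExpr=expression" denotes the right expression, as in "taskId='T1' and projectId='S1'".
It should be noted that, in the embodiments of the present disclosure, the SELECT, WHERE, ORDER, LIMIT, and GROUPBY grammars are given only as examples. In practice, the grammar rules may be configured according to actual requirements.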
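In the embodiments of the present disclosure, when the above SQL grammar rules are used for parsing and conversion into ES DSL, some ES DSL grammar rules cannot be expressed by general-purpose SQL grammar rules. Therefore, as shown in Table 1 and Table 2, the grammar rules and lexical rules in the SQL grammar rules are extended, and a correspondence between the SQL grammar rules and the ES DSL grammar rules is configured, so that SQL can be parsed and converted into ES DSL.
Table 1 Grammar extensions
Figure PCTCN2021107464-appb-000001
Figure PCTCN2021107464-appb-000002
Table 2 Lexical extensions
Figure PCTCN2021107464-appb-000003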
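In some embodiments, S304 may be performed using, but not limited to, the following steps:
A1. Based on the data query statement, the server uses the configured lexical analyzer, through the internal parser, to determine the lexical units contained in the data query statement.
In other words, in A1 the server, through the internal parser, uses the lexical analyzer to determine the lexical units contained in the SQL statement. The lexical analyzer is generated with the first parser based on the SQL grammar rules.
A lexical unit is a word or symbol that cannot be divided further, for example "SELECT" or "=".
In some embodiments, the lexical analyzer reads the characters of the data query statement one by one and determines the lexical units contained in the statement based on the configured lexical rules.
In the embodiments of the present disclosure, lexical unit types may include, but are not limited to, keywords, identifiers, literals, operators, and delimiters. A keyword is a reserved word that cannot be used as an identifier, such as "SELECT"; an identifier denotes a search keyword in the SQL grammar, such as a table name or column name; a literal denotes a string or a numeric value.
For example, based on data query statement 1, the internal parser uses the lexical analyzer generated in advance with ANTLR and determines that data query statement 1 contains the following lexical units:
Keywords: SELECT, FROM, WHERE;
Identifiers: taskId, task;
Literal: T1;
Operator: =.
In the embodiments of the present disclosure, a priority may be preset for each lexical unit type, so that if a lexical unit satisfies more than one lexical rule, the lexical analyzer determines the type of that lexical unit based on the preset priorities.
For example, suppose keywords are given a higher priority than identifiers. "SELECT" satisfies both the keyword lexical rule and the identifier lexical rule, so the lexical analyzer determines, based on the preset priorities, that the lexical unit type of "SELECT" is keyword.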
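The priority-based choice of lexical unit type can be illustrated with a small hand-written tokenizer. This is a sketch under assumptions, not the ANTLR-generated lexical analyzer of the disclosure: the rule list, its ordering, and the `tokenize` function are hypothetical, and the tie-breaking policy (longest match first, then rule priority) mirrors how ANTLR-style lexers commonly resolve ambiguity.

```python
import re

# Lexical rules listed from highest to lowest priority (assumed ordering).
TOKEN_RULES = [
    ("KEYWORD", re.compile(r"SELECT|FROM|WHERE")),
    ("IDENTIFIER", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("LITERAL", re.compile(r"'[^']*'|\d+")),
    ("OPERATOR", re.compile(r"=")),
]


def tokenize(text):
    """Split text into (type, lexeme, start position) triples."""
    tokens, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        candidates = []
        for priority, (kind, pattern) in enumerate(TOKEN_RULES):
            match = pattern.match(text, pos)
            if match:
                # Sort key: longest lexeme first, then highest-priority rule.
                candidates.append((-len(match.group()), priority, kind, match.group()))
        if not candidates:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        _, _, kind, lexeme = min(candidates)
        tokens.append((kind, lexeme, pos))
        pos += len(lexeme)
    return tokens


tokens = tokenize("SELECT taskId FROM task WHERE taskId='T1'")
for token in tokens:
    print(token)
```

"SELECT" matches both the keyword rule and the identifier rule with the same length, so the higher-priority keyword rule wins, whereas a longer word such as "SELECTED" is classified as an identifier.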
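A2. Based on the lexical units, the server uses the configured syntax analyzer, through the internal parser, to determine the rule nodes contained in the data query statement, and generates the corresponding AST based on the rule nodes.
In other words, in A2 the server, through the internal parser, uses the syntax analyzer to determine the rule nodes contained in the SQL statement based on the lexical units, and generates the corresponding AST based on the rule nodes. The syntax analyzer is likewise generated with the first parser based on the SQL grammar rules.
In the embodiments of the present disclosure, the rule nodes represent the grammar rules contained in the data query statement.
In some embodiments, the syntax analyzer uses a configured state transition table and, based on the configured grammar rules, determines the grammar rule corresponding to each lexical unit. It then determines the corresponding rule nodes from those grammar rules and generates the AST from the rule nodes and the leaf nodes, where a leaf node represents a lexical unit contained in the data query statement.
For example, referring to FIG. 4A, the syntax analyzer uses the configured state transition table and, based on the SELECT, WHERE, ORDER, LIMIT, and GROUPBY grammar rules, determines the grammar rule corresponding to each lexical unit. It then determines the corresponding rule nodes from those grammar rules and, from the rule nodes and the leaf nodes {SELECT: "SELECT"}, {STAR: "*"}, {FROM: "FROM"}, {ID: "task"}, {WHERE: "WHERE"}, {ID: "taskId"}, {EQ: "="}, and {STRING: "T1"}, generates the AST shown in FIG. 4A.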
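S305: Based on the AST, the server obtains, through the internal parser, the parse result corresponding to each rule node, and generates the ES DSL based on the parse results.
Here, the ES DSL is the ES query statement.
In other words, in S305 the server, through the internal parser, obtains the parse result corresponding to each rule node based on the AST and generates the ES query statement based on the parse results.
In some embodiments, the chained parser in the internal parser uses a configured depth-first algorithm to read, in order, the rule nodes and leaf nodes contained in the AST;
the chained parser generates an initial parse result for each rule node according to the preset correspondence between the SQL grammar rules and the ES DSL grammar rules, and generates an intermediate parse result from the leaf nodes;
the chained parser merges the initial parse results and the intermediate parse results, based on the context of the data query statement, to generate the parse results;
the chained parser merges the parse results, based on the context of the data query statement, to generate the ES DSL.
In other words, to obtain the parse results, the server, through the chained parser, reads the rule nodes contained in the AST and the leaf nodes corresponding to each rule node, where a leaf node represents a lexical unit in the SQL statement; generates an initial parse result based on the rule node, according to the correspondence between the SQL grammar rules and the grammar rules of the ES domain-specific language; generates an intermediate parse result based on the leaf nodes; merges the initial parse result and the intermediate parse result, based on the context of the SQL statement, to generate the parse result for the rule node; and merges the parse results, based on the context of the SQL statement, to generate the ES query statement.
For example, referring to FIG. 4A, take reading the rule node {selectOperation}, the leaf nodes {SELECT: "SELECT"} and {STAR: "*"}, the rule node {whereClause}, and the leaf nodes {WHERE: "WHERE"}, {ID: "taskId"}, {EQ: "="}, and {STRING: "T1"} as an example. Using the configured depth-first algorithm, the internal parser first reads the rule node {selectOperation} in the AST, enters the SELECT grammar rule, and generates initial parse result 1 according to the preset correspondence between the SELECT grammar rule and the ES DSL grammar rules.
It then reads, in order, the leaf node {SELECT: "SELECT"}, the rule node {fieldList:1}, and the leaf node {STAR: "*"}. Because the leaf node {STAR: "*"} means "match all", no intermediate parse result is generated; based on the context of data query statement 1, initial parse result 1 is used directly as parse result 1, which represents querying over all data. Parse result 1 is shown in the following code:
Figure PCTCN2021107464-appb-000004
Next, it reads the rule node {whereClause}, enters the WHERE grammar rule, and generates initial parse result 2. It then reads, in order, the leaf nodes {ID: "taskId"}, {EQ: "="}, and {STRING: "T1"} and generates intermediate parse result 2 from them. Based on the context of data query statement 1, initial parse result 2 and intermediate parse result 2 are merged into parse result 2, which represents querying the data whose "taskId" keyword value is "T1". Parse result 2 is shown in the following code:
Figure PCTCN2021107464-appb-000005
Finally, based on the context of data query statement 1, parse result 1 and parse result 2 are merged to generate the ES DSL, which represents querying, from all data, the data whose "taskId" keyword value is "T1". The ES DSL is shown in the following code:
Figure PCTCN2021107464-appb-000006
Figure PCTCN2021107464-appb-000007
In the embodiments of the present disclosure, preset functions may further be used to enter or exit each rule node. For example, referring to FIG. 4A, the preset enter-rule function enterRule() is used to enter a rule node, and the preset exit-rule function exitRule() is used to exit a rule node.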
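The depth-first walk with enter and exit callbacks can be sketched as follows. The code listings referenced above are figures that are not reproduced in this text, so the resulting query shape (a `bool` query with `match_all` and `term` clauses, which are standard Elasticsearch query DSL constructs) is an assumption, as are the `Leaf`, `RuleNode`, and `DslListener` classes and the hand-built tree that stands in for the AST of FIG. 4A.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    kind: str   # lexical unit type, e.g. "ID"
    text: str   # lexeme, e.g. "taskId"


@dataclass
class RuleNode:
    name: str
    children: list = field(default_factory=list)


class DslListener:
    """Hypothetical listener that collects per-rule parse results."""

    def __init__(self):
        self.must, self.filter = [], []
        self._open = []  # one list of direct leaves per rule currently entered

    def enter_rule(self, node):
        self._open.append([])

    def visit_leaf(self, leaf):
        self._open[-1].append(leaf)

    def exit_rule(self, node):
        leaves = self._open.pop()
        if node.name == "fieldList" and any(l.kind == "STAR" for l in leaves):
            self.must.append({"match_all": {}})          # "query over all data"
        elif node.name == "whereClause":
            _, ident, op, value = leaves                   # WHERE ID EQ STRING
            if op.kind == "EQ":
                self.filter.append({"term": {ident.text: value.text}})

    def result(self):
        # Merge the per-rule results into one query body.
        return {"query": {"bool": {"must": self.must, "filter": self.filter}}}


def walk(node, listener):
    """Depth-first traversal: enter the rule, visit children, exit the rule."""
    if isinstance(node, Leaf):
        listener.visit_leaf(node)
        return
    listener.enter_rule(node)
    for child in node.children:
        walk(child, listener)
    listener.exit_rule(node)


ast = RuleNode("selectOperation", [
    Leaf("SELECT", "SELECT"),
    RuleNode("fieldList", [Leaf("STAR", "*")]),
    Leaf("FROM", "FROM"),
    Leaf("ID", "task"),
    RuleNode("whereClause", [
        Leaf("WHERE", "WHERE"), Leaf("ID", "taskId"),
        Leaf("EQ", "="), Leaf("STRING", "T1"),
    ]),
])
listener = DslListener()
walk(ast, listener)
dsl = listener.result()
print(dsl)
```

Handling each rule on exit, once all of its leaves have been seen, corresponds loosely to merging an initial result for the rule with the intermediate result derived from its leaf nodes.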
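It should be noted that, in the embodiments of the present disclosure, the depth-first traversal of the AST is in effect a chained parsing of the grammar rules. That is, referring to FIG. 4B, during the depth-first traversal of the AST, the SELECT grammar rule is handled first by its parsing class, QuerySelectFieldsParser, and the parse result is saved to the context. The FROM grammar rule is matched next and parsed in QueryFromParser, and its parse result is saved to the context. The WHERE, GROUPBY, ORDER, and LIMIT grammar rules are then parsed in turn, with each parse result saved to the context. Finally, the results held in the context are merged to generate the ES DSL used for the ES search.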
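A chain of per-clause parsers that share one context might look like the sketch below. Only the class names `QuerySelectFieldsParser` and `QueryFromParser` come from the text; the other class names, the pre-parsed `clauses` dictionary, and the mapping to Elasticsearch request fields (`_source`, `query`, `sort`, `from`, `size`) are assumptions made for illustration, and GROUPBY handling is omitted for brevity.

```python
class QuerySelectFieldsParser:
    def parse(self, clauses, ctx):
        fields = clauses.get("select", ["*"])
        if fields != ["*"]:
            ctx["_source"] = fields          # restrict returned fields


class QueryFromParser:
    def parse(self, clauses, ctx):
        ctx["index"] = clauses["from"]       # table name -> target index


class QueryWhereParser:                      # hypothetical class name
    def parse(self, clauses, ctx):
        conditions = clauses.get("where", [])
        if conditions:
            terms = [{"term": {name: value}} for name, value in conditions]
            ctx["query"] = {"bool": {"filter": terms}}
        else:
            ctx["query"] = {"match_all": {}}


class QueryOrderParser:                      # hypothetical class name
    def parse(self, clauses, ctx):
        order = clauses.get("order_by", [])
        if order:
            ctx["sort"] = [{name: {"order": direction.lower()}}
                           for name, direction in order]


class QueryLimitParser:                      # hypothetical class name
    def parse(self, clauses, ctx):
        if "limit" in clauses:
            ctx["from"], ctx["size"] = clauses["limit"]


CHAIN = [QuerySelectFieldsParser(), QueryFromParser(), QueryWhereParser(),
         QueryOrderParser(), QueryLimitParser()]


def to_es_request(clauses):
    """Run every parser in order; each one saves its result to the context."""
    ctx = {}
    for parser in CHAIN:
        parser.parse(clauses, ctx)
    index = ctx.pop("index")
    return index, ctx                        # (target index, request body)


index, body = to_es_request({
    "select": ["*"],
    "from": "task",
    "where": [("taskId", "T1")],
    "order_by": [("createdAt", "DESC")],
    "limit": (0, 10),
})
print(index, body)
```

Keeping each clause in its own class means a new clause can be supported by appending one parser to the chain without touching the others.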
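S306: The server sends the ES DSL to the ES search engine through the internal parser.
Here, the ES DSL is the ES query statement.
In other words, in S306 the server sends the ES query statement to the ES search engine through the internal parser.
S307: The ES search engine performs the data query based on the ES DSL and obtains the data query result.
In other words, in S307 the server performs the data query through the ES search engine based on the ES query statement and obtains the data query result.
It should be noted that, in the embodiments of the present disclosure, the ES search engine is a distributed, highly scalable, highly real-time search and data analytics engine. The ES search engine uses a cluster architecture. A cluster is a collection of one or more nodes that together hold all of the data and provide federated indexing and search across all nodes. A node is a single server that is part of the cluster; it stores data and participates in the cluster's indexing and search. Each index can be divided into shards, and each shard can have zero or more replicas. Hereinafter, the original shard that is copied is called the primary shard, and a copy of the primary shard is called a replica shard. The primary shard and the replica shards of the same shard are not placed on the same node.
For example, an ES cluster consists of node 1 and node 2, and index 1 is divided into shard 0 and shard 1. Node 1 holds the primary shard of shard 0 and the replica shard of shard 1; node 2 holds the replica shard of shard 0 and the primary shard of shard 1.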
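In some embodiments, S307 may include, but is not limited to, the following steps:
A1. When a node in the ES cluster receives the ES DSL, it broadcasts the ES DSL to the other nodes that hold shards of the target index.
In other words, in A1, when the coordinating node device in the ES cluster receives the ES query statement, it broadcasts the ES query statement to the other node devices that hold shards of the target index.
In the embodiments of the present disclosure, the node that receives the ES DSL is called the coordinating node. After receiving the ES DSL, the coordinating node determines the target index to query from the ES DSL and broadcasts the ES DSL to the other nodes that hold a primary shard or replica shard of the target index.
For example, node 1 in the ES cluster receives the ES DSL and is therefore the coordinating node. Node 1 determines from the ES DSL that the target index is index 1 and broadcasts the ES DSL to node 2, which holds shards of index 1.
A2. Each of the other nodes executes the query on each of its shards of the target index according to the ES DSL, adds the identifiers of the matching documents to its own sorted priority queue, and returns the document identifiers in that queue, together with their sort values, to the coordinating node.
In other words, in A2 each of the other node devices executes the query on each of its shards of the target index based on the ES query statement, adds the identifiers of the matching documents to its own sorted priority queue, and returns the document identifiers in that queue, together with their sort values, to the coordinating node device.
A document is the basic unit of information that can be indexed, and a document identifier uniquely identifies a document.
For example, node 2 executes the query on the replica shard of shard 0 according to the ES DSL, adds the matching document identifiers doc_ID1 and doc_ID2 to its sorted priority queue, and sends doc_ID1, doc_ID2, and their sort values to node 1. The sort value of doc_ID1 indicates that doc_ID1 has the highest priority, and the sort value of doc_ID2 indicates that doc_ID2 has a lower priority than doc_ID1. The documents corresponding to doc_ID1 and doc_ID2 contain the task data whose task number is T1.
A3. The coordinating node merges the received document identifiers into its own sorted priority queue according to the sort values to obtain a global query result list, and obtains the data query result based on that list.
In other words, in A3 the coordinating node device merges the received document identifiers into the coordinating node's sorted priority queue according to the sort values, obtains the global query result list, and obtains the data query result based on that list.
For example, node 1 merges doc_ID1 and doc_ID2 into the coordinating node's sorted priority queue according to their sort values and obtains the global query result list, in which the document identifiers appear in the order doc_ID1, doc_ID2. Node 1 then fetches the corresponding documents from the replica shard of shard 0 on node 2, based on the global query result list, as data query result 1.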
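The merge performed by the coordinating node in step A3 can be sketched with the standard-library `heapq.merge`, which combines already-sorted inputs lazily. The function name `merge_shard_results`, the numeric sort values, and the second shard's result list are hypothetical; the sketch assumes each shard returns its hits sorted from highest to lowest sort value.

```python
import heapq
from itertools import islice


def merge_shard_results(shard_results, size):
    """Merge per-shard (doc_id, sort_value) lists into one global top-`size` list.

    Each input list must already be sorted by sort_value, highest first,
    which is what each shard's sorted priority queue provides.
    """
    merged = heapq.merge(*shard_results, key=lambda hit: hit[1], reverse=True)
    return [doc_id for doc_id, _ in islice(merged, size)]


# Hits returned by node 2 in the example, plus a hypothetical second shard.
node2_hits = [("doc_ID1", 9.0), ("doc_ID2", 5.0)]
other_hits = [("doc_ID3", 2.0)]

global_list = merge_shard_results([node2_hits, other_hits], size=2)
print(global_list)
```

Because only document identifiers and sort values travel in this phase, the coordinating node can build the global list cheaply and fetch full documents only for the entries that survive the merge.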
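S308: The ES search engine sends the data query result to the internal parser.
S309: The server sends the data query result to the business platform through the internal parser.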
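In the embodiments of the present disclosure, to make the system easier to use and to let users search conveniently within a complex business system, the external parser provides a Chinese SQL search mode based on the data dictionary.
Referring to FIG. 5, the embodiments of the present disclosure are further described using a visual Chinese query scenario as an example.
S501: The business platform obtains the mode selection information entered by the user.
In other words, in S501 the user logs in to the business platform on a terminal and enters mode selection information, which triggers the terminal to send the mode selection information to the business platform.
In the embodiments of the present disclosure, the mode selection information indicates a query mode offered by the business platform, for example the Chinese SQL query mode or the non-Chinese SQL query mode. In practice, the business platform may offer other query modes depending on the business scenario, but all of them are ultimately converted to the non-Chinese SQL query mode, so they are not described further here. In the Chinese SQL query mode, the user enters Chinese SQL statements to query data; this mode is typically aimed at non-technical users. In the non-Chinese SQL query mode, the user enters standard SQL statements to query data; this mode is typically aimed at technical users such as developers and testers.
When the business platform obtains the mode selection information and determines that it indicates the Chinese SQL query mode, S502 is executed.
S502: The business platform sends a visual query request to the external parser.
Here, the external parser is a parsing function module integrated on the server.
In other words, in S502 the business platform sends the visual query request to the external parser of the server.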
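S503: The external parser loads the query recommendation information.
In other words, in S503 the server obtains the query recommendation information through the external parser based on the visual query request and sends it to the business platform.
In the embodiments of the present disclosure, the query recommendation information contains at least a set of candidate search keywords; that is, it contains at least the data dictionary and may contain any one or a combination of the data dictionary, the connector table, and the operator table. The data dictionary, the connector table, and the operator table are configured according to the services offered by the business platform, and the data dictionary contains the mapping relationship between Chinese SQL statements and non-Chinese SQL statements.
Hereinafter, the description uses only the case in which the query recommendation information contains the data dictionary, the connector table, and the operator table.
Referring to Table 3, the keyword table in the data dictionary provides the mapping between search keywords in Chinese SQL statements and in non-Chinese SQL statements. The Chinese keyword is the Chinese name of a search keyword, and the English keyword is the search keyword used by the ES search engine. The connector table and the operator table are shown in Table 4 and Table 5.
It should be noted that, in the embodiments of the present disclosure, English keywords are used only as an example; other types of non-Chinese keywords may be provided according to the actual business.
Table 3 Keyword table in the data dictionary
Figure PCTCN2021107464-appb-000008
Table 4 Connector table
Connector    Description
AND          AND operation; all conditions must be satisfied
OR           OR operation; any one condition is satisfied
GROUP BY     Grouping
ORDER BY     Sorting
Table 5 Operator table
Figure PCTCN2021107464-appb-000009
In some embodiments, the external parser organizes the business model that the business platform has configured for its actual business to obtain the Chinese keywords, English keywords, and field types; obtains the corresponding index fields by building the index of the ES search engine; and constructs the data dictionary from the Chinese keywords, English keywords, field types, and index fields.
Further, a semantic model for the business platform is constructed from the data dictionary, the SQL grammar rules, the connector table, and the operator table. In this way, if keywords are added or changed on the business platform during data querying, the data dictionary can be synchronized in real time and updated dynamically, which continually extends the search boundary of the business platform and improves its search capability.
It should be noted that, in the embodiments of the present disclosure, the internal parser may also load the above semantic model when it generates the AST from the data query statement with the base parser, and then parse directly according to the semantic model.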
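S504: The business platform receives the query recommendation information returned by the external parser.
Here, the external parser is a parsing function module integrated on the server.
In other words, in S504 the business platform receives the query recommendation information that the server returns through the external parser.
On receiving the query recommendation information returned by the external parser, the business platform presents the corresponding interactive interface to the user based on it.
For example, referring to FIG. 6A, the business platform presents the interactive interface shown in FIG. 6A based on the data dictionary, connector table, and operator table contained in the query recommendation information.
S505: The business platform obtains a search keyword entered by the user.
Here, the business platform is the platform that the user logs in to on the terminal side, and it supports services such as data query and data modification.
In other words, in S505 the user enters a search keyword into the business platform on the terminal, and the business platform obtains the search keyword.
For example, referring to FIG. 6B, the business platform obtains the search keyword "优先级" (priority) entered by the user.
Further, the business platform obtains the operator entered by the user based on the operator table.
For example, referring to FIG. 6B, the business platform obtains the operator "=" entered by the user.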
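S506-S508: The business platform sends the search keyword to the ES search engine via the external parser and the internal parser.
S506-S508 include: S506, the business platform sends the search keyword to the external parser of the server; S507, the external parser sends the search keyword to the internal parser; S508, the internal parser sends the search keyword to the ES search engine.
In S506-S508, the server receives the search keyword returned based on the query recommendation information; that is, the server obtains the search keyword.
S509: The ES search engine determines the corresponding keyword value set based on the search keyword.
For example, based on "优先级" (priority), the ES search engine determines keyword value set 1 {"最高优", "高优", "中等", "较低", "极低"} (highest, high, medium, lower, lowest).
S510-S512: The business platform receives the keyword value set returned by the ES search engine via the internal parser and the external parser.
S510-S512 include: S510, the ES search engine sends the keyword value set to the internal parser; S511, the internal parser sends the keyword value set to the external parser; S512, the external parser sends the keyword value set to the business platform.
In S510-S512, the server obtains the keyword value set based on the search keyword, where the keyword value set is the result queried from the ES search engine based on the search keyword, and then sends the keyword value set.
The business platform receives the keyword value set returned by the ES search engine via the internal parser and the external parser, presents the keyword value set through the interactive interface, and obtains the keyword value that the user enters through the interactive interface for the search keyword.
For example, referring to FIG. 6C, when the business platform receives keyword value set 1, it presents keyword value set 1 through the interactive interface. Referring to FIG. 6D, the business platform obtains the keyword value "最高优" (highest) that the user enters through the interactive interface for the search keyword "优先级".
Further, the business platform presents the corresponding connectors through the interactive interface based on the connector table.
For example, referring to FIG. 6D, the business platform presents the connectors "AND", "OR", and "ORDER BY" through the interactive interface based on the connector table.
In this way, by repeating S505-S512, each search keyword entered by the user and its corresponding value are obtained.
For example, referring to FIG. 6E, the user's input "优先级=最高优 AND 截止日期=2020-01-01 AND (任务分类=任务) ORDER BY 开始日期 ASC" is obtained (priority = highest AND due date = 2020-01-01 AND (task category = task) ORDER BY start date ASC), where ASC indicates ascending order.
In this way, in the embodiments of the present disclosure, the external parser provides the business platform with keyword recommendations, value recommendations, and connector recommendations, which achieves good visual interaction and gives users a convenient way to fill in the DSL.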
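S513: The business platform obtains the data query statement entered by the user.
Here, the data query statement is either a Chinese SQL statement or an SQL statement.
In other words, in S513 the user enters an SQL statement or a Chinese SQL statement into the business platform on the terminal, and the business platform obtains it.
For example, the business platform obtains data query statement 2 entered by the user: "优先级=最高优 AND 截止日期=2020-01-01 AND (任务分类=任务) ORDER BY 开始日期 ASC".
In some embodiments, besides accepting search keywords and their keyword values, the business platform may also offer users a set of packaged, predefined function APIs, including but not limited to project functions, user functions, and time functions.
Illustratively, project functions may include projectsWhereUserlsln(), which denotes the set of projects in which the current user participates.
Illustratively, user functions may include currentUser() and Group(${project-defined user group}), where currentUser() denotes the currently logged-in user and Group(${project-defined user group}) queries a given user group.
Illustratively, time functions include but are not limited to:
now(), the current time;
endOfDay(), the end of the current day (23:59:59 today);
startOfDay(), the start of the current day (00:00:00 today);
endOfWeek(), the end of the current week (23:59:59 on Sunday of this week);
startOfWeek(), the start of the current week (00:00:00 on Monday of this week);
endOfMonth(), the end of the current month (23:59:59 on the last day of the current calendar month);
startOfMonth(), the start of the current month (00:00:00 on the 1st of the current calendar month);
endOfYear(), the end of the current year (23:59:59 on December 31 of this year);
startOfYear(), the start of the current year (00:00:00 on January 1 of this year).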
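The boundaries returned by these time functions can be computed with the Python standard library as sketched below. This is an illustration under assumptions rather than the platform's implementation: the snake_case function names are hypothetical counterparts of the functions listed above, and each takes the reference time as a parameter (instead of reading the clock) so that the results are reproducible.

```python
import calendar
from datetime import datetime, timedelta


def start_of_day(now):
    return now.replace(hour=0, minute=0, second=0, microsecond=0)


def end_of_day(now):
    return now.replace(hour=23, minute=59, second=59, microsecond=0)


def start_of_week(now):
    # weekday() is 0 for Monday, so step back to Monday of the same week.
    return start_of_day(now - timedelta(days=now.weekday()))


def end_of_week(now):
    # Step forward to Sunday of the same week.
    return end_of_day(now + timedelta(days=6 - now.weekday()))


def start_of_month(now):
    return start_of_day(now.replace(day=1))


def end_of_month(now):
    last_day = calendar.monthrange(now.year, now.month)[1]
    return end_of_day(now.replace(day=last_day))


def start_of_year(now):
    return start_of_day(now.replace(month=1, day=1))


def end_of_year(now):
    return end_of_day(now.replace(month=12, day=31))


reference = datetime(2020, 2, 12, 15, 30)   # a Wednesday in a leap year
print(start_of_week(reference), end_of_month(reference))
```

A query such as "due date before the end of this week" could then be resolved by substituting the computed timestamp for the function call before value validation.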
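S514: The business platform sends the data query statement to the external parser.
In other words, in S514 the business platform encapsulates the obtained SQL statement or Chinese SQL statement in a data query request and sends the request to the external parser of the server.
For example, the business platform sends data query statement 2 to the external parser.
S515: The external parser determines whether the data query statement is Chinese SQL. If so, S516 is executed; otherwise, the data query statement is sent directly to the internal parser for parsing.
In other words, in S515 the server receives, through the external parser, the data query request returned based on the keyword value set. If the request carries a Chinese SQL statement, the external parser executes S516; if it carries an SQL statement, the external parser sends the SQL statement to the internal parser for parsing.
It should be noted that, in the embodiments of the present disclosure, the case in which the data query statement is non-Chinese SQL and is sent directly to the internal parser for parsing is the same as steps S302-S309 above and is not repeated here.
S516: The external parser sends the data query statement to the internal parser.
In other words, in S516, when the data query request carries a Chinese SQL statement, the external parser sends the Chinese SQL statement to the internal parser.
For example, the external parser sends data query statement 2 to the internal parser.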
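S517: The internal parser performs syntax validation on the data query statement.
Because S516-S517 are executed only when the data query statement is a Chinese SQL statement, the data query statement in S517 is a Chinese SQL statement.
In other words, in S517 the internal parser performs syntax validation on the Chinese SQL statement.
In some embodiments, the internal parser uses the lexical analyzer and the syntax analyzer, based on the SQL grammar rules, to perform syntax validation on the data query statement, generates the syntax parse result and the syntax parse information, and returns both to the external parser through the local parser.
The syntax parse result indicates whether the data query statement passed syntax validation. The syntax parse information contains at least the lexical unit type and the position of each lexical unit in the data query statement, and it is used to suggest different search keywords in the search box of the interactive interface according to the cursor position.
For example, the internal parser uses the lexical analyzer and the syntax analyzer, based on the SQL grammar rules, to perform syntax validation on data query statement 2, generates the syntax parse result and the syntax parse information, and returns both to the external parser through the local parser. The syntax parse result indicates that data query statement 2 passed syntax validation, and the syntax parse information contains the lexical unit type and the position of each lexical unit in data query statement 2.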
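How the syntax parse information might drive cursor-based suggestions can be sketched as follows. The token type names, the `NEXT_SUGGESTION` table, and the `suggest` function are hypothetical; the sketch only assumes what the text states, namely that each lexical unit comes with a type and a position.

```python
# Hypothetical policy: which kind of suggestion follows each lexical unit type.
NEXT_SUGGESTION = {
    "IDENTIFIER": "operator",    # after a search keyword, offer =, != ...
    "OPERATOR": "value",         # after an operator, offer keyword values
    "LITERAL": "connector",      # after a value, offer AND / OR / ORDER BY
    "CONNECTOR": "keyword",      # after a connector, offer search keywords
}


def suggest(tokens, cursor):
    """Pick the suggestion category from the last token ending at or before the cursor.

    tokens: list of (type, start, end) triples sorted by start position.
    """
    before = [token for token in tokens if token[2] <= cursor]
    if not before:
        return "keyword"         # empty input: start with a search keyword
    return NEXT_SUGGESTION.get(before[-1][0], "keyword")


# Positions for the partial input "优先级=最高优 AND " (character offsets).
parse_info = [
    ("IDENTIFIER", 0, 3),
    ("OPERATOR", 3, 4),
    ("LITERAL", 4, 7),
    ("CONNECTOR", 8, 11),
]
print(suggest(parse_info, 4))
```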
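S518: The external parser receives the syntax validation result returned by the internal parser.
S519: If the data query statement passes syntax validation, the external parser performs value validation on it.
Because S516-S519 are executed only when the data query statement is a Chinese SQL statement, the data query statement in S519 is a Chinese SQL statement.
In other words, in S519, when the Chinese SQL statement passes syntax validation, the external parser performs value validation on the Chinese SQL statement.
If the external parser determines from the syntax validation result returned by the internal parser that the data query statement passed syntax validation, it performs value validation on the statement; otherwise, it sends the business platform a syntax error prompt indicating that the statement failed syntax validation.
In some embodiments, the validator in the external parser validates the keyword value of each search keyword contained in the data query statement, based on the configured value validation rules and the syntax parse information returned by the internal parser.
For example, suppose the configured value validation rules require that, when the field type is time, the keyword value has the format "year-month-day hour:minute:second". Taking the search keyword "截止日期" (due date) in data query statement 2 as an example, the validator obtains the keyword value "2020-01-01" of "截止日期" from the syntax parse information returned by the internal parser and determines from the data dictionary that the field type of "截止日期" is time. The validator therefore determines, based on the configured value validation rules, that the keyword value "2020-01-01" fails value validation.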
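The time-format rule in this example can be checked with `datetime.strptime`, which raises `ValueError` when a string does not match the given format. The format string, the `FIELD_TYPES` table, and the function names below are assumptions chosen to mirror the example, not part of the disclosure.

```python
from datetime import datetime

# Assumed concrete form of "year-month-day hour:minute:second".
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical excerpt of the data dictionary: search keyword -> field type.
FIELD_TYPES = {"截止日期": "time"}


def is_valid_time(value):
    """Return True only if value matches TIME_FORMAT exactly."""
    try:
        datetime.strptime(value, TIME_FORMAT)
        return True
    except ValueError:
        return False


def validate_value(keyword, value):
    # Only the time rule is modelled; other field types pass through.
    if FIELD_TYPES.get(keyword) == "time":
        return is_valid_time(value)
    return True


print(validate_value("截止日期", "2020-01-01"))
```

The date-only value is rejected because the time-of-day part is missing, which matches the outcome described in the example.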
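S520: The business platform receives the value validation result returned by the external parser.
Further, if the business platform determines from the value validation result that the data query statement passed value validation, S521 is executed; otherwise, the interactive interface informs the user that the data query statement failed value validation.
S521: The business platform sends a Chinese parsing request to the external parser, where the Chinese parsing request carries the data query statement.
Because S516-S521 are executed only when the data query statement is a Chinese SQL statement, the data query statement in S521 is a Chinese SQL statement.
In other words, in S521 the business platform sends the external parser a Chinese parsing request that carries the Chinese SQL statement.
For example, the business platform sends the external parser a Chinese parsing request that carries data query statement 2.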
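S522: The external parser generates a non-Chinese SQL statement based on the data query statement.
Here, the data query statement is a Chinese SQL statement, and the non-Chinese SQL statement is the SQL statement.
In other words, in S522 the external parser maps the Chinese SQL statement based on the data dictionary to obtain the SQL statement, where the data dictionary contains the mapping relationship between Chinese SQL statements and SQL statements.
In some embodiments, S522 covers, but is not limited to, the following cases:
In one case: if the data query statement does not contain a first-type keyword, which expresses a data sorting condition, a non-Chinese SQL statement containing the first-type keyword is generated based on the data dictionary and a configured sorting condition.
In other words, when the Chinese SQL statement does not contain a first-type keyword, the SQL statement containing the first-type keyword is generated based on the data dictionary, where the first-type keyword expresses a data sorting condition.
In the embodiments of the present disclosure, the first-type keyword may be, but is not limited to, "ORDER BY".
For example, referring to FIG. 7A, suppose the configured sorting condition is descending order by creation time and data query statement 3 is "项目=测试项目" (project = test project). Data query statement 3 does not contain the first-type keyword "ORDER BY", so a non-Chinese SQL statement containing the first-type keyword "ORDER BY" is generated based on the data dictionary and the configured sorting condition: "SELECT * FROM table WHERE project=测试项目 ORDER BY createdAt DESC".
In another case: if the data query statement contains a second-type keyword, that is, a keyword for which at least one corresponding non-Chinese keyword exists, a non-Chinese SQL statement containing the at least one non-Chinese keyword is generated based on the configured data dictionary and the at least one non-Chinese keyword.
In other words, when the Chinese SQL statement contains a second-type keyword, the SQL statement containing the corresponding multiple non-Chinese keywords is generated based on the data dictionary, where the second-type keyword corresponds to the multiple non-Chinese keywords in the data dictionary.
It should be noted that, in the embodiments of the present disclosure, the non-Chinese keywords are joined with OR conditions.
For example, referring to FIG. 7B, suppose data query statement 4 is "操作人=xiaoli" (operator = xiaoli), where "操作人" corresponds in the data dictionary to the English keywords "operator" and "updatedBy". Based on the configured data dictionary and these two English keywords, the non-Chinese SQL statement containing both is generated: "SELECT * FROM table WHERE operator=xiaoli OR updatedBy=xiaoli".
In another case: if the data query statement contains a third-type keyword, which indicates that a reference relationship exists, the corresponding search conditions are determined based on the reference relationship, and a non-Chinese SQL statement containing those search conditions is generated based on the configured data dictionary.
In other words, when the Chinese SQL statement contains a third-type keyword, the corresponding search conditions are determined based on the reference relationship, and the SQL statement containing the search conditions is generated based on the data dictionary, where the third-type keyword indicates that the reference relationship exists in the Chinese SQL statement.
It should be noted that, in the embodiments of the present disclosure, the existence of a reference relationship can also be understood as the data dictionary referencing the corresponding field, and the search conditions are joined with AND conditions.
For example, referring to FIG. 7C, suppose data query statement 5 is "创建人!=xiaoli" (creator != xiaoli), where "创建人" references the field "creator". Based on the referenced field "creator", search condition 1 "creator!=xiaoli" (not equal), search condition 2 "creator is not null" (not null), and search condition 3 "existKey in{creator}" (reference to the field "creator") are determined, and the non-Chinese SQL statement is generated based on the data dictionary: "SELECT * FROM table WHERE existKey in{creator} AND creator!=xiaoli AND creator is not null".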
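The three mapping cases can be sketched together in Python. This is a minimal illustration under assumptions: the layout of `DATA_DICTIONARY`, the `DEFAULT_ORDER` constant, and both function names are hypothetical, and the generated strings simply reproduce the example statements above rather than a real SQL dialect.

```python
# Hypothetical data dictionary excerpt: Chinese keyword -> mapped field(s).
DATA_DICTIONARY = {
    "项目": {"fields": ["project"]},
    "操作人": {"fields": ["operator", "updatedBy"]},        # second-type keyword
    "创建人": {"fields": ["creator"], "reference": True},   # third-type keyword
}

# Configured sorting condition used when the statement has no ORDER BY.
DEFAULT_ORDER = "createdAt DESC"


def condition_to_sql(keyword, op, value):
    """Map one Chinese condition to its non-Chinese WHERE fragment."""
    entry = DATA_DICTIONARY[keyword]
    fields = entry["fields"]
    if entry.get("reference"):
        # Reference relationship: several conditions joined with AND.
        name = fields[0]
        return f"existKey in{{{name}}} AND {name}{op}{value} AND {name} is not null"
    # One Chinese keyword may map to several fields, joined with OR.
    return " OR ".join(f"{name}{op}{value}" for name in fields)


def to_sql(keyword, op, value, order_by=None):
    """Build the full statement, adding the default ORDER BY when none is given."""
    where = condition_to_sql(keyword, op, value)
    return f"SELECT * FROM table WHERE {where} ORDER BY {order_by or DEFAULT_ORDER}"


print(to_sql("项目", "=", "测试项目"))
print(condition_to_sql("操作人", "=", "xiaoli"))
print(condition_to_sql("创建人", "!=", "xiaoli"))
```

A production mapper would work on the parsed statement rather than on a single condition, but the per-keyword branching would follow the same three cases.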
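S523: The business platform receives the non-Chinese SQL statement returned by the external parser.
Here, the non-Chinese SQL statement is the SQL statement.
In other words, in S523 the business platform receives the SQL statement returned by the external parser.
S524: The business platform determines whether the user has permission to query with the non-Chinese SQL statement. If so, S525 is executed; otherwise, the interactive interface informs the user that they have no permission to run the data query statement.
Here, the non-Chinese SQL statement is the SQL statement.
In other words, in S524 the business platform determines whether the user has query permission for the SQL statement. If the user has the permission, S525 is executed; if not, the interactive interface informs the user that they have no query permission for the SQL statement.
In some embodiments, the user's search permissions are stored in the business platform in advance. When the business platform receives the non-Chinese SQL statement returned by the external parser, it determines from the stored search permissions whether the user has query permission for that statement.
For example, suppose the business platform stores the search permissions of user A in advance, and those permissions indicate that user A cannot query with data query statement 2. When the business platform receives the non-Chinese SQL statement that the external parser generated from data query statement 2, it determines from the stored search permissions of user A that user A has no query permission for that statement.
S525: The business platform sends the non-Chinese SQL statement to the internal parser.
Here, the non-Chinese SQL statement is the SQL statement.
In other words, in S525 the business platform sends the SQL statement to the internal parser.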
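S526: The internal parser generates an AST based on the non-Chinese SQL statement and traverses the AST with the configured depth-first algorithm to generate the corresponding ES DSL.
Here, the non-Chinese SQL statement is the SQL statement, and the ES DSL is the ES query statement.
In other words, in S526 the internal parser generates the AST based on the SQL statement and traverses the AST with the depth-first algorithm to generate the corresponding ES query statement.
It should be noted that, in the embodiments of the present disclosure, S526 is the same as S304-S305 and is not repeated here.
S527: The internal parser sends the ES DSL to the ES search engine.
Here, the ES DSL is the ES query statement.
In other words, in S527 the internal parser sends the ES query statement to the ES search engine.
S528: The ES search engine performs the data query based on the ES DSL and obtains the data query result.
Here, the ES DSL is the ES query statement.
In other words, in S528 the ES search engine performs the data query based on the ES query statement and obtains the data query result.
S529-S530: The business platform receives the data query result returned by the ES search engine through the internal parser.
S529-S530 include: S529, the ES search engine sends the data query result to the internal parser; S530, the internal parser sends the data query result to the business platform, so that the business platform receives the data query result.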
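Based on the same inventive concept, referring to FIG. 8, an embodiment of the present disclosure provides a data query apparatus based on a domain-specific language, including at least:
a first processing unit 801, configured to receive a data query statement sent by a business platform; that is, the first processing unit 801 is configured to obtain a Structured Query Language (SQL) statement based on a data query request;
a second processing unit 802, configured to, if the data query statement is a non-Chinese SQL statement, generate, based on the data query statement and using a configured base parser, an abstract syntax tree containing rule nodes, where a rule node represents a grammar rule contained in the data query statement; that is, the second processing unit 802 is configured to obtain an abstract syntax tree based on the SQL statement, where the abstract syntax tree contains rule nodes and each rule node represents a grammar rule that the SQL statement follows;
a third processing unit 803, configured to obtain, based on the abstract syntax tree, the parse result corresponding to each rule node, generate the corresponding Elasticsearch domain-specific language (ES DSL) based on the obtained parse results, and send the ES DSL to the Elasticsearch search engine; that is, the third processing unit 803 is configured to obtain an Elasticsearch (ES) query statement based on the parse results corresponding to the rule nodes;
a fourth processing unit 804, configured to receive the data query result returned by the Elasticsearch search engine based on the ES DSL and send the data query result to the business platform; that is, the fourth processing unit 804 is configured to send a data query result, where the data query result is the result queried from the ES search engine based on the ES query statement.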
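In some embodiments, the first processing unit 801 is further configured to: on receiving a visual query request sent by the business platform, load the configured query recommendation information and send it to the business platform, where the query recommendation information contains at least a set of candidate search keywords; receive, in turn, the search keywords returned by the business platform based on the query recommendation information, send the search keywords to the Elasticsearch search engine, and receive the corresponding keyword value sets returned by the Elasticsearch search engine; and send the keyword value sets to the business platform and receive the data query statement returned by the business platform based on the keyword value sets.
In some embodiments, the first processing unit 801 is further configured to: obtain a search keyword; obtain a keyword value set based on the search keyword, where the keyword value set is the result queried from the ES search engine based on the search keyword; send the keyword value set; and receive the data query request returned based on the keyword value set.
In some embodiments, the first processing unit 801 is further configured to: send query recommendation information based on a visual query request, where the query recommendation information contains at least a set of candidate search keywords; and receive the search keyword returned based on the query recommendation information.
In some embodiments, the first processing unit 801 is further configured to: obtain the Chinese SQL statement carried in the data query request; and map the Chinese SQL statement based on the data dictionary to obtain the SQL statement, where the data dictionary contains the mapping relationship between Chinese SQL statements and SQL statements.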
在一些实施例中,所述第二处理单元802进一步用于:若所述数据查询语句为中文结构化查询语言语句,则基于设置的结构化查询语言语法规则,对所述中文结构化查询语言语句进行语法校验,以及基于设置的值校验规则,对所述数据查询语句进行值校验;若所述数据查询语句通过所述值校验,则基于设置的数字字典,以及基于所述数据查询语句,生成相应的非中文结构化查询语言语句,其中,所述数字字典中包含中文结构化查询语言语句和非中文结构化查询语言语句之间的映射关系;基于所述非中文结构化查询语言语句,采用设置的基础解析器,生成包含各个规则节点的抽象语法树。
在一些实施例中,该第二处理单元802进一步用于:基于SQL语法规则,对所述中文SQL语句进行语法校验;基于值校验规则,对所述中文SQL语句进行值校验;在所述中文SQL语句通过所述语法校验和所述值校验的情况下,执行基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句的步骤。
在一些实施例中,所述第二处理单元802用于:若所述数据查询语句中未包含表征数据排序条件的第一类关键词,则基于设置的数字字典,以及设置的排序条件,生成相应的包含所述第一类关键词的非中文结构化查询语言语句;若所述数据查询语句中包含表征存在至少一个相应的非中文关键词的第二类关键词,则基于设置的数字字典,以及至少一个非中文关键词,生成相应的包含所述至少一个非中文关键词的非中文结构化查询语言语句;若所述数据查询语句中包含表征存在引用关系的第三类关键词,则基于所述引用关系,确定相应的各个搜索条件,并基于设置的数字字典,生成相应的包含所述各个搜索条件的非中文结构化查询语言语句。
在一些实施例中,该第二处理单元802用于:在所述中文SQL语句中未包含第一类关键词的情况下,基于所述数字字典,生成包含所述第一类关键词的所述SQL语句,所述第一类关键词用于表征数据排序条件;在所述中文SQL语句中包含第二类关键词的情况下,基于所述数字字典,生成包含对应的多个非 中文关键词的所述SQL语句,所述第二类关键词在所述数字字典中对应于所述多个非中文关键词;在所述中文SQL语句中包含第三类关键词的情况下,基于引用关系,确定相应的搜索条件,基于所述数字字典,生成包含所述搜索条件的所述SQL语句,所述第三类关键词用于表征所述中文SQL语句中存在所述引用关系。
在一些实施例中,所述第二处理单元802用于:基于所述SQL语句,采用第一解析器,生成所述抽象语法树。
在一些实施例中,所述第二处理单元802用于:基于所述数据查询语句,采用设置的词法分析器,确定所述数据查询语句中包含的各个词法单元;基于所述各个词法单元,采用设置的语法分析器,确定所述数据查询语句中包含的各个规则节点,并基于所述各个规则节点,生成相应的抽象语法树;其中,所述词法分析器和所述语法分析器是基于设置的结构化查询语言语法规则,采用设置的基础解析器生成的。
在一些实施例中,所述第二处理单元802用于:基于所述SQL语句,采用词法分析器,确定所述SQL语句中包含的词法单元;基于所述词法单元,采用语法分析器,确定所述SQL语句中包含的规则节点;基于所述规则节点,生成相应的所述抽象语法树;其中,所述词法分析器和所述语法分析器是基于SQL语法规则,采用所述第一解析器生成的。
在一些实施例中,基于所述抽象语法树,得到所述各个规则节点对应的解析结果,并基于得到的各个解析结果,生成相应的弹性搜索领域特定语言时,所述第三处理单元803用于:采用设置的深度优先算法,依次读取所述抽象语法树中包含的各个规则节点和各个叶子节点,其中,叶子节点用于表征所述数据查询语句中包含的词法单元;基于所述各个规则节点,根据预先设置的结构化查询语言语法规则与弹性搜索领域特定语言语法规则之间的对应关系,生成相应的各个初始解析结果,并基于所述各个叶子节点,生成相应的各个中间解析结果;基于所述数据查询语句的上下文,对所述各个初始解析结果和所述各个中间解析结果进行合并,生成相应的各个解析结果;基于所述数据查询语句的上下文,对各个解析结果进行合并,生成相应的弹性搜索领域特定语言。
在一些实施例中,所述第三处理单元803用于:读取所述抽象语法树中包含的规则节点和所述规则节点对应的叶子节点,所述叶子节点用于表征所述SQL语句中的词法单元;基于所述规则节点,根据SQL语法规则与ES领域特 定语言的语法规则之间的对应关系,生成初始解析结果;基于所述叶子节点,生成中间解析结果;基于所述SQL语句的上下文,对所述初始解析结果和所述中间解析结果进行合并,生成所述规则节点对应的解析结果。
在一些实施例中,所述第三处理单元803用于:基于所述SQL语句的上下文,对各个所述解析结果进行合并,生成相应的所述ES查询语句。
基于同一发明构思,参阅图9所示,本公开实施例提供一种基于领域特定语言的数据查询装置,至少包括:
存储器901,用于存储可执行指令;
处理器902,用于读取并执行存储器中存储的可执行指令,以实现如下操作:基于数据查询请求,获取结构化查询语言SQL语句;基于所述SQL语句,获取抽象语法树,所述抽象语法树包含规则节点,所述规则节点用于表征所述SQL语句所遵循的语法规则;基于所述规则节点对应的解析结果,获取弹性搜索ES查询语句;发送数据查询结果,所述数据查询结果为基于所述ES查询语句从ES搜索引擎中查询得到的结果。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:获取搜索关键词;基于所述搜索关键词,获取关键词取值集合,所述关键词取值集合为基于所述搜索关键词从所述ES搜索引擎中查询得到的结果;发送所述关键词取值集合;接收基于所述关键词取值集合返回的所述数据查询请求。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:基于可视化查询请求,发送查询推荐信息,所述查询推荐信息中至少包含候选的搜索关键词集合;接收基于所述查询推荐信息返回的搜索关键词。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:获取数据查询请求中携带的中文SQL语句;基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句,所述数字字典包含中文SQL语句和SQL语句之间的映射关系。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:基于SQL语法规则,对所述中文SQL语句进行语法校验;基于值校验规则,对所述中文SQL语句进行值校验;在所述中文SQL语句通过所述语法校验和所述值校验的情况下,执行基于数字字典,对所述中文SQL语 句进行映射,得到所述SQL语句的步骤。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:在所述中文SQL语句中未包含第一类关键词的情况下,基于所述数字字典,生成包含所述第一类关键词的所述SQL语句,所述第一类关键词用于表征数据排序条件;在所述中文SQL语句中包含第二类关键词的情况下,基于所述数字字典,生成包含对应的多个非中文关键词的所述SQL语句,所述第二类关键词在所述数字字典中对应于所述多个非中文关键词;在所述中文SQL语句中包含第三类关键词的情况下,基于引用关系,确定相应的搜索条件,基于所述数字字典,生成包含所述搜索条件的所述SQL语句,所述第三类关键词用于表征所述中文SQL语句中存在所述引用关系。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:基于所述SQL语句,采用第一解析器,生成所述抽象语法树。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:基于所述SQL语句,采用词法分析器,确定所述SQL语句中包含的词法单元;基于所述词法单元,采用语法分析器,确定所述SQL语句中包含的规则节点;基于所述规则节点,生成相应的所述抽象语法树;其中,所述词法分析器和所述语法分析器是基于SQL语法规则,采用所述第一解析器生成的。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:读取所述抽象语法树中包含的规则节点和所述规则节点对应的叶子节点,所述叶子节点用于表征所述SQL语句中的词法单元;基于所述规则节点,根据SQL语法规则与ES领域特定语言的语法规则之间的对应关系,生成初始解析结果;基于所述叶子节点,生成中间解析结果;基于所述SQL语句的上下文,对所述初始解析结果和所述中间解析结果进行合并,生成所述规则节点对应的解析结果。
在一些实施例中,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:基于所述SQL语句的上下文,对各个所述解析结果进行合并,生成相应的所述ES查询语句。
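In FIG. 9, the bus architecture may include any number of interconnected buses and bridges, which link together various circuits of one or more processors, represented by the processor 902, and of memory, represented by the memory 901. The bus architecture may also link together various other circuits such as peripherals, voltage regulators, and power management circuits. The bus interface provides an interface. The transceiver 903 may be a plurality of elements, that is, it may include a transmitter and a receiver, and it provides a unit for communicating with various other apparatuses over a transmission medium. The processor 902 is responsible for managing the bus architecture and general processing, and the memory 901 may store data used by the processor 902 when performing operations.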
基于同一发明构思,本公开实施例提供一种存储介质,存储有指令,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:基于数据查询请求,获取结构化查询语言SQL语句;基于所述SQL语句,获取抽象语法树,所述抽象语法树包含规则节点,所述规则节点用于表征所述SQL语句所遵循的语法规则;基于所述规则节点对应的解析结果,获取弹性搜索ES查询语句;发送数据查询结果,所述数据查询结果为基于所述ES查询语句从ES搜索引擎中查询得到的结果。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:获取搜索关键词;基于所述搜索关键词,获取关键词取值集合,所述关键词取值集合为基于所述搜索关键词从所述ES搜索引擎中查询得到的结果;发送所述关键词取值集合;接收基于所述关键词取值集合返回的所述数据查询请求。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:基于可视化查询请求,发送查询推荐信息,所述查询推荐信息中至少包含候选的搜索关键词集合;接收基于所述查询推荐信息返回的搜索关键词。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:获取数据查询请求中携带的中文SQL语句;基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句,所述数字字典包含中文SQL语句和SQL语句之间的映射关系。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:基于SQL语法规则,对所述中文SQL语句进行语法校验;基于值校验规则,对所述中文SQL语句进行值校验;在所述中文SQL语句通过所述语法校验和所述值校验的情况下,执行基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句的步骤。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理 器执行如下操作:在所述中文SQL语句中未包含第一类关键词的情况下,基于所述数字字典,生成包含所述第一类关键词的所述SQL语句,所述第一类关键词用于表征数据排序条件;在所述中文SQL语句中包含第二类关键词的情况下,基于所述数字字典,生成包含对应的多个非中文关键词的所述SQL语句,所述第二类关键词在所述数字字典中对应于所述多个非中文关键词;在所述中文SQL语句中包含第三类关键词的情况下,基于引用关系,确定相应的搜索条件,基于所述数字字典,生成包含所述搜索条件的所述SQL语句,所述第三类关键词用于表征所述中文SQL语句中存在所述引用关系。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:基于所述SQL语句,采用第一解析器,生成所述抽象语法树。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:基于所述SQL语句,采用词法分析器,确定所述SQL语句中包含的词法单元;基于所述词法单元,采用语法分析器,确定所述SQL语句中包含的规则节点;基于所述规则节点,生成相应的所述抽象语法树;其中,所述词法分析器和所述语法分析器是基于SQL语法规则,采用所述第一解析器生成的。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:读取所述抽象语法树中包含的规则节点和所述规则节点对应的叶子节点,所述叶子节点用于表征所述SQL语句中的词法单元;基于所述规则节点,根据SQL语法规则与ES领域特定语言的语法规则之间的对应关系,生成初始解析结果;基于所述叶子节点,生成中间解析结果;基于所述SQL语句的上下文,对所述初始解析结果和所述中间解析结果进行合并,生成所述规则节点对应的解析结果。
在一些实施例中,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:基于所述SQL语句的上下文,对各个所述解析结果进行合并,生成相应的所述ES查询语句。
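In summary, in the embodiments of the present disclosure, the conversion between SQL statements and the ES DSL follows SQL syntax, so developers who are familiar with SQL syntax can query the Elasticsearch search engine with ordinary SQL, without the added cost of learning Elasticsearch search syntax. This greatly improves development efficiency and system usability. At the same time, performing the data query through the Elasticsearch search engine ensures query efficiency.
Further, in the embodiments of the present disclosure, with Elasticsearch as the underlying search engine, multiple layers of validators and parsers are implemented on top of the dual parsers. Although the search syntax is SQL-like, the risk of SQL injection is eliminated entirely, which improves system security.
For the system and apparatus embodiments, because they are basically similar to the method embodiments, the description is relatively brief; for related details, refer to the corresponding parts of the method embodiments.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations.
Those skilled in the art should understand that the embodiments of the present disclosure may be provided as a method, a system, or a computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
In some embodiments, a computer program product is provided that includes one or more instructions, which can be executed by one or more processors of a computer device so that the computer device performs the data query method of the embodiments described above.
The present disclosure is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or another programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
All embodiments of the present disclosure may be performed alone or in combination with other embodiments, and all such forms fall within the scope of protection claimed by the present disclosure.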

Claims (21)

  1. 一种数据查询方法,包括:
    基于数据查询请求,获取结构化查询语言SQL语句;
    基于所述SQL语句,获取抽象语法树,所述抽象语法树包含规则节点,所述规则节点用于表征所述SQL语句所遵循的语法规则;
    基于所述规则节点对应的解析结果,获取弹性搜索ES查询语句;
    发送数据查询结果,所述数据查询结果为基于所述ES查询语句从ES搜索引擎中查询得到的结果。
  2. 如权利要求1所述的方法,进一步包括:
    获取搜索关键词;
    基于所述搜索关键词,获取关键词取值集合,所述关键词取值集合为基于所述搜索关键词从所述ES搜索引擎中查询得到的结果;
    发送所述关键词取值集合;
    接收基于所述关键词取值集合返回的所述数据查询请求。
  3. 如权利要求2所述的方法,获取搜索关键词包括:
    基于可视化查询请求,发送查询推荐信息,所述查询推荐信息中至少包含候选的搜索关键词集合;
    接收基于所述查询推荐信息返回的搜索关键词。
  4. 如权利要求1所述的方法,基于数据查询请求,获取结构化查询语言SQL语句包括:
    获取数据查询请求中携带的中文SQL语句;
    基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句,所述数字字典包含中文SQL语句和SQL语句之间的映射关系。
  5. 如权利要求4所述的方法,进一步包括:
    基于SQL语法规则,对所述中文SQL语句进行语法校验;
    基于值校验规则,对所述中文SQL语句进行值校验;
    在所述中文SQL语句通过所述语法校验和所述值校验的情况下,执行基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句的步骤。
  6. 如权利要求4所述的方法,基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句包括:
    在所述中文SQL语句中未包含第一类关键词的情况下,基于所述数字字典,生成包含所述第一类关键词的所述SQL语句,所述第一类关键词用于表征数据排序条件;
    在所述中文SQL语句中包含第二类关键词的情况下,基于所述数字字典,生成包含对应的多个非中文关键词的所述SQL语句,所述第二类关键词在所述数字字典中对应于所述多个非中文关键词;
    在所述中文SQL语句中包含第三类关键词的情况下,基于引用关系,确定相应的搜索条件,基于所述数字字典,生成包含所述搜索条件的所述SQL语句,所述第三类关键词用于表征所述中文SQL语句中存在所述引用关系。
  7. 如权利要求1所述的方法,所述基于所述SQL语句,获取抽象语法树包括:
    基于所述SQL语句,采用第一解析器,生成所述抽象语法树。
  8. 如权利要求7所述的方法,基于所述SQL语句,采用第一解析器,生成所述抽象语法树包括:
    基于所述SQL语句,采用词法分析器,确定所述SQL语句中包含的词法单元;
    基于所述词法单元,采用语法分析器,确定所述SQL语句中包含的规则节点;
    基于所述规则节点,生成相应的所述抽象语法树;
    其中,所述词法分析器和所述语法分析器是基于SQL语法规则,采用所述第一解析器生成的。
  9. 如权利要求1所述的方法,进一步包括:
    读取所述抽象语法树中包含的规则节点和所述规则节点对应的叶子节点,所述叶子节点用于表征所述SQL语句中的词法单元;
    基于所述规则节点,根据SQL语法规则与ES领域特定语言的语法规则之间的对应关系,生成初始解析结果;
    基于所述叶子节点,生成中间解析结果;
    基于所述SQL语句的上下文,对所述初始解析结果和所述中间解析结果进行合并,生成所述规则节点对应的解析结果。
  10. 如权利要求1所述的方法,基于所述规则节点对应的解析结果,获取弹 性搜索ES查询语句包括:
    基于所述SQL语句的上下文,对各个所述解析结果进行合并,生成相应的所述ES查询语句。
  11. 一种基于领域特定语言的数据查询装置,包括:
    存储器,用于存储可执行指令;
    处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    基于数据查询请求,获取结构化查询语言SQL语句;
    基于所述SQL语句,获取抽象语法树,所述抽象语法树包含规则节点,所述规则节点用于表征所述SQL语句所遵循的语法规则;
    基于所述规则节点对应的解析结果,获取弹性搜索ES查询语句;
    发送数据查询结果,所述数据查询结果为基于所述ES查询语句从ES搜索引擎中查询得到的结果。
  12. 如权利要求11所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    获取搜索关键词;
    基于所述搜索关键词,获取关键词取值集合,所述关键词取值集合为基于所述搜索关键词从所述ES搜索引擎中查询得到的结果;
    发送所述关键词取值集合;
    接收基于所述关键词取值集合返回的所述数据查询请求。
  13. 如权利要求12所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    基于可视化查询请求,发送查询推荐信息,所述查询推荐信息中至少包含候选的搜索关键词集合;
    接收基于所述查询推荐信息返回的搜索关键词。
  14. 如权利要求11所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    获取数据查询请求中携带的中文SQL语句;
    基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句,所述数字字典包含中文SQL语句和SQL语句之间的映射关系。
  15. 如权利要求14所述的装置,所述处理器,用于读取并执行存储器中存储 的可执行指令,以实现如下操作:
    基于SQL语法规则,对所述中文SQL语句进行语法校验;
    基于值校验规则,对所述中文SQL语句进行值校验;
    在所述中文SQL语句通过所述语法校验和所述值校验的情况下,执行基于数字字典,对所述中文SQL语句进行映射,得到所述SQL语句的步骤。
  16. 如权利要求14所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    在所述中文SQL语句中未包含第一类关键词的情况下,基于所述数字字典,生成包含所述第一类关键词的所述SQL语句,所述第一类关键词用于表征数据排序条件;
    在所述中文SQL语句中包含第二类关键词的情况下,基于所述数字字典,生成包含对应的多个非中文关键词的所述SQL语句,所述第二类关键词在所述数字字典中对应于所述多个非中文关键词;
    在所述中文SQL语句中包含第三类关键词的情况下,基于引用关系,确定相应的搜索条件,基于所述数字字典,生成包含所述搜索条件的所述SQL语句,所述第三类关键词用于表征所述中文SQL语句中存在所述引用关系。
  17. 如权利要求11所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    基于所述SQL语句,采用第一解析器,生成所述抽象语法树。
  18. 如权利要求17所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    基于所述SQL语句,采用词法分析器,确定所述SQL语句中包含的词法单元;
    基于所述词法单元,采用语法分析器,确定所述SQL语句中包含的规则节点;
    基于所述规则节点,生成相应的所述抽象语法树;
    其中,所述词法分析器和所述语法分析器是基于SQL语法规则,采用所述第一解析器生成的。
  19. 如权利要求11所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    读取所述抽象语法树中包含的规则节点和所述规则节点对应的叶子节点,所述叶子节点用于表征所述SQL语句中的词法单元;
    基于所述规则节点,根据SQL语法规则与ES领域特定语言的语法规则之间的对应关系,生成初始解析结果;
    基于所述叶子节点,生成中间解析结果;
    基于所述SQL语句的上下文,对所述初始解析结果和所述中间解析结果进行合并,生成所述规则节点对应的解析结果。
  20. 如权利要求11所述的装置,所述处理器,用于读取并执行存储器中存储的可执行指令,以实现如下操作:
    基于所述SQL语句的上下文,对各个所述解析结果进行合并,生成相应的所述ES查询语句。
  21. 一种存储介质,存储有指令,当所述存储介质中的指令由处理器执行时,使所述处理器执行如下操作:
    基于数据查询请求,获取结构化查询语言SQL语句;
    基于所述SQL语句,获取抽象语法树,所述抽象语法树包含规则节点,所述规则节点用于表征所述SQL语句所遵循的语法规则;
    基于所述规则节点对应的解析结果,获取弹性搜索ES查询语句;
    发送数据查询结果,所述数据查询结果为基于所述ES查询语句从ES搜索引擎中查询得到的结果。
PCT/CN2021/107464 2020-09-10 2021-07-20 数据查询方法及装置 WO2022052639A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010948413.8 2020-09-10
CN202010948413.8A CN114168622A (zh) 2020-09-10 2020-09-10 一种基于领域特定语言的数据查询方法及装置

Publications (1)

Publication Number Publication Date
WO2022052639A1 true WO2022052639A1 (zh) 2022-03-17

Family

ID=80475650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/107464 WO2022052639A1 (zh) 2020-09-10 2021-07-20 数据查询方法及装置

Country Status (2)

Country Link
CN (1) CN114168622A (zh)
WO (1) WO2022052639A1 (zh)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21865695

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 23/06/2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21865695

Country of ref document: EP

Kind code of ref document: A1