CN108920614B - Method, device and system for inquiring data online - Google Patents

Publication number
CN108920614B
Authority
CN
China
Prior art keywords
query
data
tree
web platform
queried
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201810688094.4A
Other languages
Chinese (zh)
Other versions
CN108920614A (en
Inventor
高其林
王肖磊
王志超
刘陟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd
Priority to CN201810688094.4A
Publication of CN108920614A
Application granted
Publication of CN108920614B
Legal status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a device and a system for querying data online. The method comprises the following steps: receiving a query request sent by an analyst through a pre-established web platform; parsing the syntax of the received query request to obtain a corresponding syntax tree, and customizing a query rule corresponding to the query request based on the syntax tree; and querying corresponding query data from a preset database according to the query rule, so as to feed the queried data back to the web platform. The online data query mode of the embodiment of the invention can therefore support multiple customized query grammars, can flexibly customize the query rule corresponding to the query request based on the syntax tree, and can query the required data from the database according to the query rule. In addition, the scheme improves the efficiency of querying data online.

Description

Method, device and system for inquiring data online
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method, an apparatus, and a system for querying data online.
Background
With the continuous development of big data technology, internet-based data query functions are applied more and more widely. In the prior art, when a professional operation analyst queries data from a database storing a large amount of data, the query request of the user must first be analyzed, that is, the query intention of the user must be understood, before the data can be queried.
However, in the current data query process, analyzing the user query request with one specific, fixed grammar cannot flexibly support self-defined query grammars and cannot adequately meet the requirements of query diversity and flexibility. How to query data more efficiently and flexibly is therefore an important technical problem at present.
Disclosure of Invention
In view of the above, the present invention has been made to provide a method, apparatus and system for querying data online that overcomes or at least partially solves the above problems.
According to an aspect of the present invention, there is provided a method for querying data online, including:
receiving a query request sent by an analyst through a pre-established web platform;
parsing the syntax of the received query request to obtain a corresponding syntax tree, and customizing a query rule corresponding to the query request based on the syntax tree;
and inquiring corresponding inquiry data from a preset database according to the inquiry rule, and feeding the inquired inquiry data back to the web platform.
Optionally, receiving a query request sent by an analyst through a pre-established web platform includes: and receiving an http query request sent by an analyst through a pre-established web platform based on an http protocol.
Optionally, customizing a query rule corresponding to the query request based on the syntax tree includes: optimizing the syntax tree by adopting an optimizer;
and generating a corresponding query plan tree according to the optimized syntax tree, wherein the query plan tree comprises a plurality of tree nodes, and the tree nodes correspond to the query steps of the query rules.
Optionally, querying, according to the query rule, corresponding query data from a preset database, and feeding the queried query data back to the web platform, includes:
sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
Optionally, querying, according to the query rule, corresponding query data from a preset database, and feeding the queried query data back to the web platform, includes:
sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
after the corresponding query data are queried according to the query step corresponding to any tree node, the query data which are queried currently are fed back to the web platform.
Optionally, querying, according to the query rule, corresponding query data from a preset database, and feeding the queried query data back to the web platform, includes:
selecting a plurality of tree nodes from the query plan tree;
according to the selected multiple tree nodes, inquiring corresponding inquiry data from the preset database in parallel;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
Optionally, selecting a plurality of tree nodes from the query plan tree includes:
selecting a plurality of tree nodes from the query plan tree according to the sequence of the query steps corresponding to the tree nodes; and/or
A plurality of tree nodes are randomly selected from the query plan tree.
Optionally, the method further comprises: after corresponding query data are queried from a preset database according to the query rule, caching the queried query data and a corresponding query request;
and when the same query request sent by an analyst through the pre-established web platform is received again, directly acquiring the cached query data corresponding to the query request.
Optionally, feeding the queried query data back to the web platform, including:
carrying out format conversion on the inquired inquiry data to obtain inquiry data with a uniform format;
and feeding back the query data with unified format to the web platform.
According to another aspect of the present invention, there is also provided an apparatus for querying data online, comprising:
The receiving module is suitable for receiving a query request sent by an analyst through a pre-established web platform;
the parsing module is suitable for parsing the syntax of the received query request to obtain a corresponding syntax tree, and customizing a query rule corresponding to the query request based on the syntax tree;
and the query module is suitable for querying corresponding query data from a preset database according to the query rule and feeding the queried query data back to the web platform.
Optionally, the receiving module is further adapted to: and receiving an http query request sent by an analyst through a pre-established web platform based on an http protocol.
Optionally, the parsing module is further adapted to: optimizing the syntax tree by adopting an optimizer;
and generating a corresponding query plan tree according to the optimized syntax tree, wherein the query plan tree comprises a plurality of tree nodes, and the tree nodes correspond to the query steps of the query rules.
Optionally, the query module is further adapted to: sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
Optionally, the query module is further adapted to: sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
after the corresponding query data are queried according to the query step corresponding to any tree node, the query data which are queried currently are fed back to the web platform.
Optionally, the query module is further adapted to: selecting a plurality of tree nodes from the query plan tree;
according to the selected multiple tree nodes, inquiring corresponding inquiry data from the preset database in parallel;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
Optionally, the query module is further adapted to: selecting a plurality of tree nodes from the query plan tree according to the sequence of the query steps corresponding to the tree nodes; and/or randomly selecting a plurality of tree nodes from the query plan tree.
Optionally, the apparatus further comprises: the cache module is suitable for caching the inquired inquiry data and the corresponding inquiry request after inquiring the corresponding inquiry data from the preset database according to the inquiry rule;
and when the receiving module receives the same query request sent by an analyst through a pre-established web platform again, the cached query data corresponding to the query request is directly obtained.
Optionally, the query module is further adapted to:
carrying out format conversion on the inquired inquiry data to obtain inquiry data with a uniform format;
and feeding back the query data with unified format to the web platform.
According to still another aspect of the present invention, there is provided a system for online querying data, including a web platform, the apparatus for online querying data according to any of the above embodiments, and a preset database, wherein,
the web platform receives a query request input by an analyst and sends the query request to the device for online querying data;
the device for on-line data query receives a query request from the web platform, performs syntax analysis on the query request to obtain a corresponding syntax tree, and customizes a query rule corresponding to the query request based on the syntax tree;
and the device for on-line data query queries corresponding query data from a preset database according to the query rule and feeds the queried query data back to the web platform.
According to another aspect of the present invention, there is also provided a computer storage medium storing computer program code, which when run on a computing device, causes the computing device to execute the method of querying data online as described in any of the above embodiments.
According to yet another aspect of the present invention, there is also provided a computing device comprising: a processor; a memory storing computer program code; the computer program code, when executed by the processor, causes the computing device to perform the method of querying data online as described in any of the embodiments above.
In the embodiment of the invention, after receiving a query request sent by an analyst by using a pre-established web platform, syntax analysis is firstly carried out on the received query request to obtain a corresponding syntax tree, then a query rule corresponding to the query request is customized based on the syntax tree, and corresponding query data is queried from a preset database according to the query rule so as to feed the queried query data back to the web platform. Therefore, the on-line data query mode of the embodiment of the invention obtains the corresponding syntax tree by performing syntax analysis on the query request to support multiple self-defined query syntaxes, namely, the self-defined query syntaxes are realized, the query rule corresponding to the query request can be flexibly customized based on the syntax tree, and the required query data can be queried from the database according to the query rule. Furthermore, the scheme also effectively improves the query efficiency of online query data.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
The above and other objects, advantages and features of the present invention will become more apparent to those skilled in the art from the following detailed description of specific embodiments thereof, taken in conjunction with the accompanying drawings.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 shows a flow diagram of a method of querying data online, according to one embodiment of the invention;
FIG. 2 illustrates a search engine design framework diagram of online query data, according to one embodiment of the invention;
FIG. 3 is a schematic structural diagram of an apparatus for querying data online according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for querying data online according to another embodiment of the present invention; and
FIG. 5 is a block diagram of a system for querying data online according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
In order to solve the above technical problem, an embodiment of the present invention provides a method for querying data online. FIG. 1 is a flow diagram illustrating a method for querying data online according to one embodiment of the invention. Referring to fig. 1, the method includes at least steps S102 to S106.
Step S102, receiving a query request sent by an analyst through a pre-established web platform.
In this step, the type of the query request may be an http query request, and therefore, the scheme of the present invention may receive, based on an http protocol, an http query request sent by an analyst through a pre-established web platform. Of course, the query request may also be of other types, and the transmission protocol used is also different for different types of query requests, and the type of the query request and the type of the transmission protocol used are not specifically limited in the embodiments of the present invention.
In addition, the query request in the present invention may be a query request about a query log issued by an analyst.
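As a minimal sketch of how a receiving side might pull the query text out of an http query request, the following uses only the Python standard library. The parameter name `q`, the function name `extract_query`, and the example URL are assumptions made for illustration; the embodiment does not specify how the query is carried in the request.

```python
from urllib.parse import urlparse, parse_qs


def extract_query(url: str) -> str:
    """Return the query text carried in the (assumed) 'q' URL parameter."""
    params = parse_qs(urlparse(url).query)  # percent-decoding is done here
    values = params.get("q")
    if not values:
        raise ValueError("request carries no 'q' parameter")
    return values[0]


# Hypothetical request URL; the host and parameter name are illustrative only.
query_text = extract_query("http://example.invalid/search?q=md5%20%3D%20abc")
```

Other request types would need a different extraction step, in line with the statement above that the transmission protocol is not limited.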
Step S104, performing syntax analysis on the received query request to obtain a corresponding syntax tree, and customizing a query rule corresponding to the query request based on the syntax tree.
The query rule in this step is actually a query plan customized based on the syntax tree and corresponding to the query request, that is, the corresponding query steps are planned according to the query request, and which step is queried first and then which step is queried, so as to effectively improve the query efficiency.
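The syntax analysis in this step can be illustrated with a small recursive-descent parser. This is a sketch under stated assumptions, not the grammar of the embodiment: the toy grammar (`field = value` conditions combined with `AND` and `OR`, where `AND` binds tighter) and the names `Condition`, `BoolOp`, `tokenize`, and `parse` are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Condition:
    field: str
    value: str


@dataclass
class BoolOp:
    op: str  # "AND" or "OR"
    left: "Node"
    right: "Node"


Node = Union[Condition, BoolOp]

# Tokens: the keywords AND / OR, the "=" sign, or any run of other characters.
TOKEN = re.compile(r"\s*(AND\b|OR\b|=|[^\s=]+)")


def tokenize(text: str) -> List[str]:
    return TOKEN.findall(text)


def parse(text: str) -> Node:
    """Parse the assumed toy grammar into a syntax tree."""
    tokens = tokenize(text)
    pos = 0

    def condition() -> Node:
        nonlocal pos
        if pos + 3 > len(tokens) or tokens[pos + 1] != "=":
            raise ValueError(f"expected 'field = value' at token {pos}")
        node = Condition(tokens[pos], tokens[pos + 2])
        pos += 3
        return node

    def conjunction() -> Node:
        nonlocal pos
        node = condition()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            node = BoolOp("AND", node, condition())
        return node

    def disjunction() -> Node:
        nonlocal pos
        node = conjunction()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            node = BoolOp("OR", node, conjunction())
        return node

    tree = disjunction()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return tree


tree = parse("md5 = abc AND path = /tmp/x")
```

Because the grammar lives entirely in the parser, supporting a different self-defined query syntax would mean swapping the parser while the later planning and execution stages keep consuming the same tree shape.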
Step S106, querying corresponding query data from a preset database according to the query rule, and feeding the queried query data back to the web platform.
In this step, after the query data is fed back to the web platform, the query data can be visually displayed on a display device in the web platform, and an analyst can see the fed-back query data through the web platform to perform corresponding analysis and use.
The online data query mode of the embodiment of the invention obtains the corresponding syntax tree by performing syntax analysis on the query request, so that multiple self-defined query syntaxes can be supported, that is, self-defined query syntaxes are realized. The query rule corresponding to the query request can then be flexibly customized based on the syntax tree, and the required query data can be queried from the database according to the query rule. Furthermore, the scheme improves the efficiency of querying data online.
Referring to step S104 above, in an embodiment of the present invention, when the query rule corresponding to the query request is customized based on the syntax tree, in order to ensure the accuracy of the syntax tree, a corresponding optimizer may be further used to optimize the syntax tree, and then a corresponding query plan tree is generated according to the optimized syntax tree, where the query plan tree includes a plurality of tree nodes, and the tree nodes correspond to the query steps of the query rule. The query plan tree can effectively and clearly embody the query plan, and is beneficial to efficiently and accurately realizing data query. In this embodiment, the optimizer for optimizing the syntax tree may adopt a plain optimizer, and may also adopt other types of optimizers, which is not specifically limited in this embodiment of the present invention.
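The relationship between an optimized set of conditions and a query plan tree whose nodes correspond to query steps can be sketched as follows. The cost table `INDEX_COST`, the reordering heuristic, and the names `PlanNode`, `optimize`, and `build_plan` are assumptions for illustration; the embodiment does not describe what its optimizer actually does.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical relative lookup costs per field (lower is cheaper).
INDEX_COST = {"md5": 1, "sha1": 1, "path": 5}


@dataclass
class PlanNode:
    step: int      # position of this node in the query sequence
    action: str    # human-readable description of the query step
    children: List["PlanNode"] = field(default_factory=list)


def optimize(conditions: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Assumed heuristic: evaluate the cheapest lookups first."""
    return sorted(conditions, key=lambda c: INDEX_COST.get(c[0], 10))


def build_plan(conditions: List[Tuple[str, str]]) -> PlanNode:
    """Build a two-level plan tree: one lookup node per condition, then an intersect node."""
    ordered = optimize(conditions)
    lookups = [
        PlanNode(step=i + 1, action=f"lookup {name}={value}")
        for i, (name, value) in enumerate(ordered)
    ]
    return PlanNode(step=len(lookups) + 1, action="intersect", children=lookups)


plan = build_plan([("path", "/tmp/x"), ("md5", "abc")])
```

Here the `step` number on each node plays the role of the query step that the tree node corresponds to, so an executor can walk the nodes in step order.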
Referring to the above steps S104 and S106, after the corresponding query plan tree is generated according to the optimized syntax tree, since the query plan tree includes a plurality of tree nodes, in the process of querying the corresponding query data from the preset database according to the query plan tree, data query can be performed in various ways.
Mode one
And sequentially querying corresponding query data from a preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree. And integrating the inquired query data after the data is inquired according to all the tree nodes in the query plan tree, and feeding the integrated query data back to the web platform.
For example, the generated query plan tree includes three tree nodes, then, tree node 1 corresponds to query step 1, tree node 2 corresponds to query step 2, and tree node 3 corresponds to query step 3, and corresponding query data is queried from the preset database according to the sequence of query steps 1, 2, and 3, that is, according to the sequence of tree nodes 1 to 3. And after the three tree nodes realize the query of the data, integrating the queried data to integrate the data into data content corresponding to the query request, and feeding the integrated query data back to the web platform.
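Mode one can be sketched as a loop that visits the tree nodes in step order and integrates the partial results only once every node has been queried. The in-memory `FAKE_DB`, its table names and contents, and the functions `query_step` and `run_mode_one` are hypothetical stand-ins for the preset database and are not taken from the embodiment.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for the preset database: table name -> key -> fields.
FAKE_DB: Dict[str, Dict[str, dict]] = {
    "basic": {"abc": {"level": 70}},
    "cloud_info": {"abc": {"path": "/tmp/x"}},
    "scan_log": {"abc": {"scans": 3}},
}


def query_step(table: str, key: str) -> dict:
    """Run one query step against the stand-in database."""
    return dict(FAKE_DB[table].get(key, {}))


def run_mode_one(plan: List[Tuple[int, str]], key: str) -> dict:
    """Query nodes strictly in step order, then return one integrated result."""
    merged = {"key": key}
    for _step, table in sorted(plan):
        merged.update(query_step(table, key))
    return merged  # fed back only after all nodes have been queried


# Tree nodes given as (query step, table); deliberately listed out of order.
result = run_mode_one([(2, "cloud_info"), (1, "basic"), (3, "scan_log")], "abc")
```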
Mode two
And sequentially querying corresponding query data from a preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree. And then after querying corresponding query data according to the query step corresponding to any tree node, feeding the query data queried currently back to the web platform.
The second mode differs from the first mode in that it does not wait until data have been queried according to all the tree nodes in the query plan tree before returning the query data. Instead, it feeds the currently queried data back to the web platform directly after the data are queried according to any tree node, which improves the data query experience of the analyst and helps the analyst view the query data in time.
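Mode two maps naturally onto a generator that yields each partial result as soon as its tree node has been queried. This is a sketch only: `FAKE_DB` and `run_mode_two` are hypothetical, and how partial results are actually pushed to the web platform is not specified in the embodiment.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical stand-in for the preset database: table name -> key -> fields.
FAKE_DB: Dict[str, Dict[str, dict]] = {
    "basic": {"abc": {"level": 70}},
    "cloud_info": {"abc": {"path": "/tmp/x"}},
}


def run_mode_two(plan: List[Tuple[int, str]], key: str) -> Iterator[Tuple[int, dict]]:
    """Query nodes in step order and yield each partial result immediately."""
    for step, table in sorted(plan):
        yield step, dict(FAKE_DB[table].get(key, {}))


# The caller can forward every partial result as soon as it arrives.
partials = list(run_mode_two([(2, "cloud_info"), (1, "basic")], "abc"))
```

Compared with mode one, the caller sees the first rows after a single query step rather than after the whole plan has run.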
Mode three
First, a plurality of tree nodes are selected from a query plan tree. And then, according to the selected multiple tree nodes, inquiring corresponding inquiry data from a preset database in parallel. And finally, after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
In this mode, the plurality of tree nodes may be selected from the query plan tree according to the sequence of the query steps corresponding to the tree nodes. For example, a fixed number of tree nodes (such as 2, 3, or 4) is selected each time in the order of the query steps, and data query is performed on the selected tree nodes in parallel. Of course, a plurality of tree nodes may also be selected randomly from the query plan tree. In addition, a specified number of tree nodes may be selected according to the sequence of the query steps while some further tree nodes are selected randomly. The embodiment of the invention does not specifically limit the number of selected tree nodes.
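Mode three with in-order selection can be sketched using a thread pool: nodes are taken in batches following the step order, each batch is queried in parallel, and the results are integrated at the end. The batch size, the `FAKE_DB` contents, and the names `query_step` and `run_mode_three` are illustrative assumptions; random selection of nodes is not shown.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical stand-in for the preset database: table name -> key -> fields.
FAKE_DB: Dict[str, Dict[str, dict]] = {
    "basic": {"abc": {"level": 70}},
    "cloud_info": {"abc": {"path": "/tmp/x"}},
    "scan_log": {"abc": {"scans": 3}},
}


def query_step(table: str, key: str) -> dict:
    return dict(FAKE_DB[table].get(key, {}))


def run_mode_three(plan: List[Tuple[int, str]], key: str, batch_size: int = 2) -> dict:
    """Select nodes in step order, batch_size at a time, and query each batch in parallel."""
    ordered = sorted(plan)
    merged = {"key": key}
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for start in range(0, len(ordered), batch_size):
            batch = ordered[start:start + batch_size]
            # pool.map preserves input order, so integration stays deterministic.
            for partial in pool.map(lambda node: query_step(node[1], key), batch):
                merged.update(partial)
    return merged  # integrated result, fed back after all nodes are done


result = run_mode_three([(1, "basic"), (2, "cloud_info"), (3, "scan_log")], "abc")
```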
The above embodiments are merely exemplary, and the manner of querying the corresponding query data from the preset database according to the query plan tree may be other manners.
In the embodiment of the present invention, the preset database may actually include one or more databases, and each database may store different types of data.
For example, the preset database may include a Poseidon database, a quick table database, a tidb database, and a mysql database, and the storage objects in the databases are described below.
The Poseidon database mainly provides exact, token-based retrieval over trillion-level data. Original logs in the trillion-level data can be quickly retrieved through the database according to a pre-established index, and the indexing is flexible. When the query request sent by a business analyst is a log query request, the related log content can be queried from this database.
The quick table database mainly provides fast retrieval for some frequently queried data, such as the daily query quantity and the first occurrence time of samples (trillion level), with a design target of retrieval within seconds.
The quick table database is in effect a complement to the Poseidon database. The Poseidon database has the advantages of flexible indexing and a large data storage capacity, but it relies on the hadoop component and has a very low QPS (Queries Per Second), and retrieving a single piece of data generally requires looking up many layers of indexes. When the data to be queried relate only to MD5 (Message-Digest Algorithm 5) and sha1 (Secure Hash Algorithm 1), the requirement is essentially a pure kv (Key-Value) query, which does not need such flexible indexing. For example, when some users (such as whitelisted users) need PV (Page View) and UV (Unique Visitor) interfaces that support automated retrieval and batch queries (1000 at a time), the Poseidon database alone cannot currently support this use. The quick table database is therefore introduced to speed up the lookup of certain fields.
The tidb database is an open-source, mysql-compatible distributed database and mainly stores data that are updated in real time, for example, a set of samples filtered according to certain specific conditions, and all attribute information of the samples in a recent period of time (e.g., 1 hour).
The mysql database mainly stores metadata information of some data.
To illustrate the present solution more clearly, a specific embodiment is described below, covering the implementation process of querying data online and each step in that process. The query process may be implemented based on an SQL-like (Structured Query Language-like) search engine, which receives a query request sent from the Web platform, processes the query request, and obtains the corresponding query data. Referring to fig. 2, the search engine may include a data layer smart, a driver layer GDO, and a Model layer.
Step 1, a query request (such as an http request) sent by a user (i.e., an analyst) through the Web platform is received and lexically analyzed according to its relevant processing parameters to determine the corresponding query conditions. For example, the analyzed query condition may be: query the samples whose md5 is xxx and whose path is xxx.
Step 2, the query condition is passed to the data layer smart, which performs syntax analysis on it to obtain a syntax tree and customizes a query plan (that is, the query rule described above) corresponding to the query request according to the syntax tree. The query plan may be embodied as a physical query plan tree. Here, a parser may be used to parse the query condition, and an optimizer, such as a plain optimizer, may be used to optimize the syntax tree. This step generates the data index.
In this step, the data layer smart can also implement concurrent Exec, that is, improve query efficiency by accessing the database concurrently. An example is mode three described above, in which a plurality of tree nodes are selected from the query plan tree and the corresponding query data are queried from the preset database in parallel according to the selected tree nodes.
In addition, caching is also an effective means of improving query efficiency: previously queried data are cached, so that on the next access the cached data are obtained directly without accessing the database again, which increases the query speed. Specifically, after the corresponding query data are queried from the preset database according to the query rule, the queried query data and the corresponding query request can be cached. When the analyst sends the same query request again through the pre-established web platform, the previously cached query data corresponding to the query request can be obtained directly, without querying the preset database again, which greatly reduces the query workload and the network resources consumed.
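The caching behaviour described above can be sketched as a thin wrapper that stores each result under its query request. The class name `CachingQueryService`, the whitespace normalization of the request, and the `backend_calls` counter are assumptions added for illustration; the embodiment only states that the query data and the corresponding request are cached and reused for the same request.

```python
from typing import Callable, Dict, List


class CachingQueryService:
    """Cache query results keyed by the (normalized) query request."""

    def __init__(self, backend: Callable[[str], List[dict]]) -> None:
        self._backend = backend          # stands in for the preset database
        self._cache: Dict[str, List[dict]] = {}
        self.backend_calls = 0           # counts real database accesses

    def query(self, request: str) -> List[dict]:
        # Assumption: requests differing only in whitespace count as "the same".
        key = " ".join(request.split())
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._backend(key)
        return self._cache[key]


service = CachingQueryService(lambda q: [{"query": q, "level": 70}])
first = service.query("md5 = abc")
second = service.query("md5   =  abc")  # served from the cache
```

A production cache would also need an eviction or expiry policy, particularly for data that change in real time; that is outside what the text describes.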
Step 3, the driver layer GDO acquires the corresponding query data from the preset database based on the query plan tree generated by the data layer smart and transmits the acquired query data to the Model layer, which unifies the format of the received query data.
The driver layer GDO can acquire data in several ways: it can query data from the Poseidon database; for an http query request, it can query a URL interface according to the URL carried by the request; and it can query data based on protocol buffers. Protocol buffers are a data description language that can serialize structured data and can be used for data storage, communication protocols, and the like. By retrieving query data from different databases, the driver layer GDO can interface with different data layers.
In this step, if a database is packaged in the form of an SDK (Software Development Kit), the driver layer GDO may query the data by calling the SDK that wraps the database when it obtains the corresponding query data from the preset database.
Step 4, the Model layer unifies the format of the query data received from the driver layer GDO into a specified format, such as JSON, and outputs the query data in the specified format to the data layer smart, which feeds the queried data back to the user.
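Format unification in the Model layer can be sketched as renaming source-specific field names to one canonical set and serializing the result as JSON. The alias table `FIELD_ALIASES` and the functions `unify` and `to_json` are hypothetical; the embodiment does not describe the field names returned by the individual databases.

```python
import json
from typing import Dict, List

# Hypothetical mapping from source-specific field names to canonical ones.
FIELD_ALIASES: Dict[str, str] = {"MD5": "md5", "md5sum": "md5", "file_path": "path"}


def unify(record: dict) -> dict:
    """Rename known aliases; leave unknown field names unchanged."""
    return {FIELD_ALIASES.get(name, name): value for name, value in record.items()}


def to_json(records: List[dict]) -> str:
    """Serialize records from any source into one uniform JSON document."""
    return json.dumps([unify(r) for r in records], sort_keys=True)


# Two records, as if returned by two different databases.
payload = to_json([
    {"MD5": "abc", "file_path": "/tmp/x"},
    {"md5sum": "def", "path": "/tmp/y"},
])
```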
In this step, the Model layer may also perform field screening before unifying the format of the query data, for example, screening the fields required for output and calculation from information tables pre-stored in the database. The information tables include a Proc Chain process chain information table, a Network sample information table, a Proc Behavior process behavior table, a Basic sample basic information table, a Cloud Info sample cloud search information table (including sample cloud search related information such as file path and history level), a managed Files release information table, a Scan Log scanning information table, an Upload file uploading traceability information table, and the like. The Basic sample basic information table contains key sample information such as historical query quantity, first occurrence time, and level; it allows the importance of a sample to be assessed quickly and helps realize fast data query, and this part of the data usually comes from the quick table database. The information tables may also include other types of tables and are not limited to those shown in fig. 2.
In addition, after the format of the query data is unified, the Model layer can also filter or merge the data with unified format by adopting a corresponding algorithm, and then output the filtered or merged data to the data layer. The corresponding algorithm can adopt group aggregation, join connection and other operation methods.
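The group aggregation and join operations mentioned above can be sketched over lists of dictionaries. The functions `group_count` and `inner_join` and the sample rows are illustrative assumptions; the embodiment names the operations but not their implementation.

```python
from collections import defaultdict
from typing import Dict, List


def group_count(rows: List[dict], key: str) -> Dict[str, int]:
    """Group aggregation: count rows per distinct value of `key`."""
    counts: Dict[str, int] = defaultdict(int)
    for row in rows:
        counts[row[key]] += 1
    return dict(counts)


def inner_join(left: List[dict], right: List[dict], key: str) -> List[dict]:
    """Join: merge rows from both sides that share the same `key` value."""
    index: Dict[str, List[dict]] = defaultdict(list)
    for row in right:
        index[row[key]].append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Hypothetical uniformly formatted rows from two information tables.
scans = [{"md5": "abc", "engine": "e1"}, {"md5": "abc", "engine": "e2"}, {"md5": "def", "engine": "e1"}]
basics = [{"md5": "abc", "level": 70}]

per_sample = group_count(scans, "md5")
joined = inner_join(basics, scans, "md5")
```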
It should be noted that, although only the Poseidon database and the Stored database are shown in the database of fig. 2, other types of databases described above may be included, and are not limited in particular.
In an embodiment of the present invention, a data processing system is further provided, where the system can implement processing of mass offline data and online data, and store the processed data in a preset database corresponding to the storage component. The offline data and the online data may be log data.
In the offline data extraction process, a distributed scheduler is first used to schedule massive logs from a file system (such as hdfs, S3, and the like), and a Spark engine is used to extract metadata of the logs based on a MapReduce model. Then, a data processing frame (e.g., a content frame) is used to perform aggregation calculation on the scheduled logs and log metadata to obtain logs (i.e., intermediate data in fig. 3) in a specific format (e.g., json format); the logs in the specific format are classified and merged to generate corresponding virtual tables, and statistics are computed on the metadata to obtain statistical information. Finally, the virtual tables are stored in the poseidon database, and the statistical information is stored in the mysql database. The virtual tables here correspond to the information tables mentioned above, and may further include a specmen_detail table, a specmen detailed information table, a specmen_closed_detail table, a specmen cloud static attribute information table, a scan_info scan information table, a file_relationships file relationship table, a specmen collected sample information table, a pe_info executable information table (including a table related to sample executable information), and the like.
In the real-time data extraction process, when a user performs a query service (such as antivirus, sample upload, URL query, DNS (Domain Name System) query, and the like), a real-time log processing cluster can extract the generated logs in real time from the results of the query service and send the extracted logs to a pre-created nsq (a real-time distributed messaging platform) message queue. A feature extraction cluster then consumes the logs from the nsq message queue, analyzes them to extract their feature data, and stores the extracted logs and the extracted feature data in a tidb database. The feature data of a log may include metadata information of the log.
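The producer and consumer roles in this real-time path can be sketched as follows. This is a minimal single-process sketch: Python's `queue.Queue` stands in for the nsq message queue and a plain list stands in for the tidb database, and the log fields (`service`, `target`) and function names are assumptions made for illustration only.

```python
import json
import queue

log_queue = queue.Queue()  # in-process stand-in for the nsq message queue (assumption)
store = []                 # stand-in for the tidb database (assumption)

def publish(log_line):
    """Real-time log processing side: push an extracted log onto the queue."""
    log_queue.put(log_line)

def extract_features(log_line):
    """Feature extraction side: parse the log and derive hypothetical metadata."""
    record = json.loads(log_line)
    return {"service": record.get("service"), "size": len(log_line)}

def consume_all():
    """Consume every queued log and store it together with its feature data."""
    while not log_queue.empty():
        line = log_queue.get()
        store.append({"log": line, "features": extract_features(line)})

publish('{"service": "url_query", "target": "example.com"}')
publish('{"service": "dns_query", "target": "example.org"}')
consume_all()
```

Decoupling the two clusters through a queue lets log production and feature extraction proceed at different rates, which is presumably why a message queue sits between them.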
In this embodiment, the storage component may further include a builder cluster. The builder cluster reads a conversion instruction for converting logs in a specific format into another format, converts the logs according to that instruction, and stores the converted logs in a preset database in the storage component. For example, the data processing device performs aggregation calculation on the offline logs to obtain JSON-format logs; after the builder cluster receives an instruction to convert the JSON-format logs into another format, it converts them into the specified format and stores the converted logs in a quick_table database.
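A conversion of this kind might look like the sketch below, which dispatches on a conversion instruction and turns JSON-format logs into CSV. The choice of CSV as the target format, the `CONVERTERS` table, and the column names are assumptions; the embodiment does not say which target formats the builder cluster supports.

```python
import csv
import io
import json

def json_to_csv(lines, columns):
    """Convert JSON-format log lines into CSV text restricted to `columns`."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for line in lines:
        writer.writerow(json.loads(line))
    return out.getvalue()

# Hypothetical mapping from a conversion instruction to a converter function.
CONVERTERS = {"csv": json_to_csv}

def build(lines, instruction, columns):
    """Apply the converter named by the conversion instruction."""
    return CONVERTERS[instruction](lines, columns)

converted = build(['{"md5": "a1", "size": 10, "extra": 1}'], "csv", ["md5", "size"])
```

Keeping the converters in a lookup table means a new target format only requires registering one more function.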
Based on the same inventive concept, an embodiment of the present invention further provides an apparatus for querying data online, and fig. 3 illustrates a schematic structural diagram of the apparatus for querying data online according to an embodiment of the present invention. Referring to fig. 3, the apparatus 300 for querying data online includes a receiving module 310, a parsing module 320, and a querying module 330.
The functions of the components of the apparatus 300 for querying data online according to the embodiment of the present invention, and the connections between them, are described below:
the receiving module 310 is adapted to receive a query request sent by an analyst through a pre-established web platform;
the parsing module 320, coupled to the receiving module 310, is adapted to parse the syntax of the received query request to obtain a corresponding syntax tree, and to customize a query rule corresponding to the query request based on the syntax tree;
the query module 330, coupled to the parsing module 320, is adapted to query the corresponding query data from the preset database according to the query rule, and to feed the queried query data back to the web platform.
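The coupling of the three modules can be sketched as a simple pipeline. The sketch below is only an illustration under stated assumptions: the toy grammar `<table> where <field>=<value>`, the in-memory `DATABASE`, and all class and method names are hypothetical, and the "query rule" here is nothing more than the parsed tree itself.

```python
# Hypothetical in-memory stand-in for the preset database.
DATABASE = {
    "samples": [{"md5": "a1", "type": "pe"}, {"md5": "b2", "type": "elf"}],
}

class ReceivingModule:
    def receive(self, request):
        """Accept the raw query request text from the web platform."""
        return request.strip()

class ParsingModule:
    def parse(self, text):
        """Parse the toy grammar '<table> where <field>=<value>' into a tree."""
        table, _, condition = text.partition(" where ")
        field, _, value = condition.partition("=")
        return {"table": table, "field": field, "value": value}

class QueryModule:
    def query(self, rule):
        """Run the rule against the database and return the matching rows."""
        rows = DATABASE.get(rule["table"], [])
        return [r for r in rows if r.get(rule["field"]) == rule["value"]]

def handle(request):
    """Receive, parse, then query, mirroring the coupling of modules 310 to 330."""
    text = ReceivingModule().receive(request)
    rule = ParsingModule().parse(text)
    return QueryModule().query(rule)

result = handle("  samples where type=pe ")
```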
In an embodiment of the present invention, the receiving module 310 is further adapted to receive, based on an http protocol, an http query request sent by an analyst through a pre-established web platform.
In an embodiment of the present invention, the parsing module 320 is further adapted to optimize the syntax tree by using an optimizer, and generate a corresponding query plan tree according to the optimized syntax tree, where the query plan tree includes a plurality of tree nodes, and the tree nodes correspond to query steps of the query rule.
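To make the relationship between the syntax tree, the optimizer, and the query plan tree more concrete, here is a minimal sketch. The dictionary-shaped syntax tree, the single optimization rule (dropping duplicate filter conditions), and the `scan`/`filter`/`project` step names are all assumptions chosen for illustration; the embodiment does not specify the optimizer's rules.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PlanNode:
    step: str                           # query step this tree node stands for
    arg: object                         # table name, condition, or column list
    child: Optional["PlanNode"] = None  # node that must run before this one

def optimize(tree):
    """Toy optimizer: drop duplicate filter conditions while keeping their order."""
    seen, conditions = set(), []
    for cond in tree["where"]:
        if cond not in seen:
            seen.add(cond)
            conditions.append(cond)
    return {**tree, "where": conditions}

def build_plan(tree):
    """Turn the optimized syntax tree into a chain of plan nodes."""
    node = PlanNode("scan", tree["from"])
    for cond in tree["where"]:
        node = PlanNode("filter", cond, node)
    return PlanNode("project", tree["select"], node)

def steps(node):
    """List the query steps in execution order (deepest node first)."""
    return ([] if node.child is None else steps(node.child)) + [node.step]

# Hypothetical syntax tree containing a redundant condition.
syntax_tree = {"select": ["md5"], "from": "samples",
               "where": [("type", "pe"), ("type", "pe")]}
plan = build_plan(optimize(syntax_tree))
```

After optimization the plan contains a single filter node, so `steps(plan)` lists one scan, one filter, and one projection.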
In an embodiment of the present invention, the query module 330 is further adapted to sequentially query the corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree, and further integrate the queried query data after the query of the data according to all the tree nodes in the query plan tree is completed, and feed the integrated query data back to the web platform.
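The sequential strategy described here can be sketched as a loop that visits the tree nodes in query-step order and integrates the results only at the end. The plan is simplified to a flat list of nodes, and the table names and `run_sequential` helper are hypothetical.

```python
# Hypothetical stand-in for the preset database.
DATABASE = {
    "specimen": [{"md5": "a1"}],
    "scan_info": [{"md5": "a1", "engine": "x"}],
}

def run_sequential(plan_nodes, database):
    """Query node by node in step order, then integrate everything at the end."""
    collected = []
    for node in plan_nodes:
        collected.append(database.get(node["table"], []))
    # Integration happens only after every node has been queried.
    return [row for part in collected for row in part]

plan_nodes = [{"table": "specimen"}, {"table": "scan_info"}]
integrated = run_sequential(plan_nodes, DATABASE)
```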
In an embodiment of the present invention, the query module 330 is further adapted to sequentially query the corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree, and further feed back the query data currently queried to the web platform after querying the corresponding query data according to the query step corresponding to any tree node.
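By contrast, the incremental strategy in this embodiment returns each node's result as soon as it is available. A generator is one natural way to sketch that in Python; the flat node list, the table names, and the `run_streaming` helper are assumptions for illustration.

```python
# Hypothetical stand-in for the preset database.
DATABASE = {
    "specimen": [{"md5": "a1"}],
    "scan_info": [{"md5": "a1", "engine": "x"}, {"md5": "a1", "engine": "y"}],
}

def run_streaming(plan_nodes, database):
    """Yield each node's query data as soon as that node has been queried."""
    for node in plan_nodes:
        yield node["table"], database.get(node["table"], [])

# The caller (standing in for the web platform) receives partial results one by one.
feedback = []
for table, rows in run_streaming([{"table": "specimen"}, {"table": "scan_info"}], DATABASE):
    feedback.append((table, len(rows)))
```

The trade-off relative to the integrated strategy is that the analyst sees partial data earlier, at the cost of receiving it in several pieces.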
In an embodiment of the present invention, the query module 330 is further adapted to select a plurality of tree nodes from the query plan tree, query corresponding query data from the preset database according to the selected plurality of tree nodes in parallel, integrate the queried query data after querying the data according to all the tree nodes in the query plan tree, and feed the integrated query data back to the web platform.
In an embodiment of the present invention, the query module 330 is further adapted to select a plurality of tree nodes from the query plan tree according to the sequence of the query steps corresponding to the tree nodes; and/or randomly selecting a plurality of tree nodes from the query plan tree.
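The parallel strategy, with either ordered or random node selection, can be sketched with a thread pool. This sketch assumes the selected nodes are independent of one another, represents each node by a table name, and uses hypothetical helpers `select_nodes` and `run_parallel`; none of these details come from the embodiment.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the preset database.
DATABASE = {
    "specimen": [{"md5": "a1"}],
    "scan_info": [{"md5": "a1", "engine": "x"}],
    "pe_info": [{"md5": "a1", "arch": "x86"}],
}

def select_nodes(nodes, count, randomly=False):
    """Pick `count` nodes either in query-step order or at random."""
    return random.sample(nodes, count) if randomly else nodes[:count]

def run_parallel(plan_nodes, database, batch=2, randomly=False):
    """Query selected nodes in parallel, then integrate once all nodes are done."""
    remaining = list(plan_nodes)
    results = {}
    with ThreadPoolExecutor(max_workers=batch) as pool:
        while remaining:
            chosen = select_nodes(remaining, min(batch, len(remaining)), randomly)
            rows_per_node = pool.map(lambda table: database.get(table, []), chosen)
            results.update(zip(chosen, rows_per_node))
            remaining = [n for n in remaining if n not in chosen]
    # Integrate in plan order so the output does not depend on the selection order.
    return [row for node in plan_nodes for row in results[node]]

nodes = ["specimen", "scan_info", "pe_info"]
in_order = run_parallel(nodes, DATABASE)
shuffled = run_parallel(nodes, DATABASE, randomly=True)
```

Integrating in plan order means the random and ordered selections produce the same final result; only the order in which the database is hit differs.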
In an embodiment of the present invention, the query module 330 is further adapted to perform format conversion on the queried query data to obtain query data with a uniform format, and feed the query data with the uniform format back to the web platform.
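Format unification of this kind might be sketched as below, where results arriving with different field names or shapes are normalized into one record layout. The alias table, the target fields (`md5`, `size`), and the `unify` helper are hypothetical.

```python
# Hypothetical mapping from source-specific field names to the unified names.
FIELD_ALIASES = {"MD5": "md5", "file_size": "size"}

def unify(record):
    """Convert one queried record into the unified {'md5', 'size'} layout."""
    if not isinstance(record, dict):
        # Positional rows are assumed to be (md5, size) pairs.
        record = dict(zip(("md5", "size"), record))
    renamed = {FIELD_ALIASES.get(k, k): v for k, v in record.items()}
    return {"md5": str(renamed["md5"]).lower(), "size": int(renamed["size"])}

# Results as they might come back from two differently structured sources.
raw = [{"MD5": "A1", "file_size": "10"}, ("b2", 20)]
unified = [unify(r) for r in raw]
```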
Referring to fig. 4, the apparatus 300 for online querying data further includes a caching module 340 in addition to the receiving module 310, the parsing module 320, and the querying module 330.
The caching module 340, coupled to the querying module 330, is adapted to cache the queried query data and the corresponding query request after querying the corresponding query data from the preset database according to the query rule. When the receiving module 310 receives the same query request sent by the analyst through the pre-established web platform again, the cached query data corresponding to the query request is directly obtained.
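The caching behaviour can be sketched as a dictionary keyed by the query request. The `QueryCache` class, the exact-match key, and the `backend_calls` counter are assumptions for illustration; the embodiment does not specify eviction or expiry, so none is modelled here.

```python
class QueryCache:
    """Cache query data per query request; identical requests skip the database."""

    def __init__(self, backend):
        self._backend = backend   # callable standing in for the query module
        self._cache = {}
        self.backend_calls = 0    # counts how often the database was actually hit

    def query(self, request):
        if request not in self._cache:
            self.backend_calls += 1
            self._cache[request] = self._backend(request)
        return self._cache[request]

# Hypothetical backend that would normally query the preset database.
cache = QueryCache(lambda request: ["row for " + request])
first = cache.query("select md5 from samples")
second = cache.query("select md5 from samples")  # served from the cache
```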
Based on the same inventive concept, the present invention further provides a system for querying data online, and referring to fig. 5, the system 500 for querying data online comprises a web platform 510, the device 300 for querying data online in any of the above embodiments, and a preset database 520, wherein the web platform 510 receives a query request input by an analyst and sends the query request to the device 300 for querying data online; after receiving the query request from the web platform, the device for online querying data 300 performs syntax parsing on the query request to obtain a corresponding syntax tree, and customizes a query rule corresponding to the query request based on the syntax tree. Further, the device for online query of data 300 queries corresponding query data from the preset database 520 according to the query rule, and feeds the queried query data back to the web platform 510. For a specific online data query process, reference may be made to the above embodiments, which are not described in detail herein.
The present invention also provides a computer storage medium storing computer program code, which when run on a computing device, causes the computing device to execute the method for querying data online in any of the above embodiments.
In addition, the present invention also provides a computing device comprising: a processor; a memory storing computer program code; the computer program code, when executed by a processor, causes a computing device to perform the method of querying data online in any of the above embodiments.
According to any one or a combination of the above preferred embodiments, the following advantages can be achieved by the embodiments of the present invention:
In the embodiments of the present invention, after a query request sent by an analyst through a pre-established web platform is received, the query request is first parsed to obtain a corresponding syntax tree; a query rule corresponding to the query request is then customized based on the syntax tree, and the corresponding query data is queried from a preset database according to the query rule and fed back to the web platform. Because the query request is parsed into a syntax tree, the online data query approach of the embodiments can support multiple user-defined query syntaxes, can flexibly customize the query rule corresponding to the query request based on the syntax tree, and can retrieve the required query data from the database according to that rule. The scheme also effectively improves the efficiency of querying data online.
It is clear to those skilled in the art that the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and for the sake of brevity, further description is omitted here.
In addition, the functional units in the embodiments of the present invention may be physically independent of each other, two or more functional units may be integrated together, or all the functional units may be integrated in one processing unit. The integrated functional units may be implemented in the form of hardware, or in the form of software or firmware.
Those of ordinary skill in the art will understand that the integrated functional units, if implemented in software and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions that, when executed, cause a computing device (e.g., a personal computer, a server, or a network device) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
Alternatively, all or part of the steps of implementing the foregoing method embodiments may be implemented by hardware (such as a computing device, e.g., a personal computer, a server, or a network device) associated with program instructions, which may be stored in a computer-readable storage medium, and when the program instructions are executed by a processor of the computing device, the computing device executes all or part of the steps of the method according to the embodiments of the present invention.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments can be modified or some or all of the technical features can be equivalently replaced within the spirit and principle of the present invention; such modifications or substitutions do not depart from the scope of the present invention.

Claims (21)

1. A method of querying data online, comprising:
receiving a query request sent by an analyst through a pre-established web platform;
parsing a syntax of a received query request to obtain a corresponding syntax tree, and customizing a query rule corresponding to the query request based on the syntax tree, wherein the query rule is a query plan tree which is customized based on the syntax tree and corresponds to the query request, and the query plan tree comprises a plurality of query nodes;
querying corresponding query data from a preset database according to the query rule, and feeding the queried query data back to the web platform, wherein the query data queried by each query node in sequence are integrated and then fed back to the web platform, or the query data queried by any query node is fed back to the web platform, or query nodes in the query plan tree are randomly selected and the query data queried by the selected query nodes in parallel is fed back to the web platform; wherein the method of querying data online is implemented by an SQL-like retrieval engine comprising a data layer smart, a driver layer GOD and a Model layer, the query plan tree is generated by the data layer smart, the query data is queried and returned by the driver layer GOD, the Model layer unifies the format of the query data and then sends the query data to the data layer smart, and the data layer smart feeds the query data back to a user.
2. The method of claim 1, wherein receiving a query request issued by an analyst via a pre-established web platform comprises:
and receiving an http query request sent by an analyst through a pre-established web platform based on an http protocol.
3. The method of claim 1 or 2, wherein customizing the query rule corresponding to the query request based on the syntax tree comprises:
optimizing the syntax tree by adopting an optimizer;
and generating a corresponding query plan tree according to the optimized syntax tree, wherein the query plan tree comprises a plurality of tree nodes, and the tree nodes correspond to the query steps of the query rules.
4. The method of claim 3, wherein querying the corresponding query data from a preset database according to the query rule, and feeding the queried query data back to the web platform comprises:
sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
5. The method of claim 3, wherein querying the corresponding query data from a preset database according to the query rule, and feeding the queried query data back to the web platform comprises:
sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
after the corresponding query data are queried according to the query step corresponding to any tree node, the query data which are queried currently are fed back to the web platform.
6. The method of claim 3, wherein querying the corresponding query data from a preset database according to the query rule, and feeding the queried query data back to the web platform comprises:
selecting a plurality of tree nodes from the query plan tree;
according to the selected multiple tree nodes, inquiring corresponding inquiry data from the preset database in parallel;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
7. The method of claim 6, wherein selecting a plurality of tree nodes from the query plan tree comprises:
selecting a plurality of tree nodes from the query plan tree according to the sequence of the query steps corresponding to the tree nodes; and/or
A plurality of tree nodes are randomly selected from the query plan tree.
8. The method of claim 1 or 2, further comprising:
after corresponding query data are queried from a preset database according to the query rule, caching the queried query data and a corresponding query request;
and when the same query request sent by an analyst through the pre-established web platform is received again, directly acquiring the cached query data corresponding to the query request.
9. The method of claim 1 or 2, wherein feeding queried query data back to the web platform comprises:
carrying out format conversion on the inquired inquiry data to obtain inquiry data with a uniform format;
and feeding back the query data with unified format to the web platform.
10. An apparatus for querying data online, comprising:
a receiving module, adapted to receive a query request sent by an analyst through a pre-established web platform;
a parsing module, adapted to parse a received query request to obtain a corresponding syntax tree, and to customize a query rule corresponding to the query request based on the syntax tree, wherein the query rule is a query plan tree which is customized based on the syntax tree and corresponds to the query request, and the query plan tree comprises a plurality of query nodes;
a query module, adapted to query corresponding query data from a preset database according to the query rule and to feed the queried query data back to the web platform, wherein the query data queried by each query node in sequence are integrated and then fed back to the web platform, or the query data queried by any query node is fed back to the web platform, or query nodes in the query plan tree are randomly selected and the query data queried by the selected query nodes in parallel is fed back to the web platform; wherein the online data query is implemented by an SQL-like retrieval engine comprising a data layer smart, a driver layer GOD and a Model layer, the query plan tree is generated by the data layer smart, the query data is queried and returned by the driver layer GOD, the Model layer unifies the format of the query data and then sends the query data to the data layer smart, and the data layer smart feeds the query data back to a user.
11. The apparatus of claim 10, wherein the receiving module is further adapted to:
and receiving an http query request sent by an analyst through a pre-established web platform based on an http protocol.
12. The apparatus of claim 10 or 11, wherein the parsing module is further adapted to:
optimizing the syntax tree by adopting an optimizer;
and generating a corresponding query plan tree according to the optimized syntax tree, wherein the query plan tree comprises a plurality of tree nodes, and the tree nodes correspond to the query steps of the query rules.
13. The apparatus of claim 12, wherein the query module is further adapted to:
sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
14. The apparatus of claim 12, wherein the query module is further adapted to:
sequentially querying corresponding query data from the preset database according to the sequence of the query steps corresponding to the tree nodes in the query plan tree;
after the corresponding query data are queried according to the query step corresponding to any tree node, the query data which are queried currently are fed back to the web platform.
15. The apparatus of claim 12, wherein the query module is further adapted to:
selecting a plurality of tree nodes from the query plan tree;
according to the selected multiple tree nodes, inquiring corresponding inquiry data from the preset database in parallel;
and after the data query is finished according to all the tree nodes in the query plan tree, integrating the queried query data, and feeding the integrated query data back to the web platform.
16. The apparatus of claim 15, wherein the query module is further adapted to:
selecting a plurality of tree nodes from the query plan tree according to the sequence of the query steps corresponding to the tree nodes; and/or
A plurality of tree nodes are randomly selected from the query plan tree.
17. The apparatus of claim 10 or 11, further comprising:
the cache module is suitable for caching the inquired inquiry data and the corresponding inquiry request after inquiring the corresponding inquiry data from the preset database according to the inquiry rule;
and when the receiving module receives the same query request sent by an analyst through a pre-established web platform again, the cached query data corresponding to the query request is directly obtained.
18. The apparatus of claim 10 or 11, wherein the query module is further adapted to:
carrying out format conversion on the inquired inquiry data to obtain inquiry data with a uniform format;
and feeding back the query data with unified format to the web platform.
19. A system for on-line query of data, comprising a web platform, the device for on-line query of data of claim 7, and a preset database, wherein,
the web platform receives a query request input by an analyst and sends the query request to the device for online querying data;
the device for on-line data query receives a query request from the web platform, performs syntax analysis on the query request to obtain a corresponding syntax tree, and customizes a query rule corresponding to the query request based on the syntax tree, wherein the query rule is a query plan tree which is customized based on the syntax tree and corresponds to the query request, and the query plan tree comprises a plurality of query nodes;
the device for online data query queries corresponding query data from a preset database according to the query rule and feeds the queried query data back to the web platform, wherein the query data queried by each query node in sequence are integrated and then fed back to the web platform, or the query data queried by any query node is fed back to the web platform, or query nodes in the query plan tree are randomly selected and the query data queried by the selected query nodes in parallel is fed back to the web platform; wherein the online data query is implemented by an SQL-like retrieval engine comprising a data layer smart, a driver layer GOD and a Model layer, the query plan tree is generated by the data layer smart, the query data is queried and returned by the driver layer GOD, the Model layer unifies the format of the query data and then sends the query data to the data layer smart, and the data layer smart feeds the query data back to a user.
20. A computer storage medium having computer program code stored thereon which, when run on a computing device, causes the computing device to perform the method of querying data online of any one of claims 1-9.
21. A computing device, comprising: a processor; a memory storing computer program code; the computer program code, when executed by the processor, causes the computing device to perform the method of querying data online of any of claims 1-9.
CN201810688094.4A 2018-06-28 2018-06-28 Method, device and system for inquiring data online Expired - Fee Related CN108920614B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810688094.4A CN108920614B (en) 2018-06-28 2018-06-28 Method, device and system for inquiring data online

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810688094.4A CN108920614B (en) 2018-06-28 2018-06-28 Method, device and system for inquiring data online

Publications (2)

Publication Number Publication Date
CN108920614A CN108920614A (en) 2018-11-30
CN108920614B true CN108920614B (en) 2021-08-20

Family

ID=64421943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810688094.4A Expired - Fee Related CN108920614B (en) 2018-06-28 2018-06-28 Method, device and system for inquiring data online

Country Status (1)

Country Link
CN (1) CN108920614B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109710641A (en) * 2018-12-17 2019-05-03 浩云科技股份有限公司 A kind of inquiry processing method and system of mass data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010050675A2 (en) * 2008-10-29 2010-05-06 한국과학기술원 Method for automatically extracting relation triplets through a dependency grammar parse tree
CN103761080A (en) * 2013-12-25 2014-04-30 中国农业大学 Structured query language (SQL) based MapReduce operation generating method and system
CN107832391A (en) * 2017-10-31 2018-03-23 长城计算机软件与系统有限公司 A kind of data query method and system
CN107943952A (en) * 2017-11-24 2018-04-20 北京赛思信安技术股份有限公司 A kind of implementation method that full-text search is carried out based on Spark frames

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160070546A1 (en) * 2014-09-06 2016-03-10 Aquameta LLC Computer programming system and method

Also Published As

Publication number Publication date
CN108920614A (en) 2018-11-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210820