WO2014101445A1 - Data query method and system - Google Patents

Data query method and system

Info

Publication number
WO2014101445A1
WO2014101445A1 PCT/CN2013/082130 CN2013082130W WO2014101445A1 WO 2014101445 A1 WO2014101445 A1 WO 2014101445A1 CN 2013082130 W CN2013082130 W CN 2013082130W WO 2014101445 A1 WO2014101445 A1 WO 2014101445A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
data
field
index table
collector
Prior art date
Application number
PCT/CN2013/082130
Other languages
French (fr)
Chinese (zh)
Inventor
谢永方
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2014101445A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems

Definitions

  • the present invention relates to the field of computer network technologies, and in particular, to a data query method, a data query server, a data collector, and a data query system.
  • In a massive data collection and query system, the system architecture has two modes: distributed storage and centralized storage. In either case, the system must store and query massive log data quickly.
  • An existing distributed data query system includes a data query server and multiple data collectors.
  • The data collector is responsible for collecting (receiving, formatting, merging), storing, and indexing logs, and the data query server is the unified entry for log queries.
  • the data query server sends a query command to all data collectors, and the query results of all data collectors are collected to summarize the final query result.
  • The logs to be queried in a single query exist in only a few data collectors, and query operations are very frequent.
  • This existing solution increases the burden on all data collectors, including the consumption of each data collector's central processing unit (CPU) resources.
  • In addition to handling queries, the data collector must receive data and write it to storage. If query operations are frequent, they also degrade the data collector's collection performance and reduce the overall processing capacity of the system.
  • In another distributed data query system, the original log data is stored centrally, and each data collector is only responsible for collecting (receiving, formatting, merging) and reporting the logs.
  • the log content is not saved locally after being processed by the data collector, but is reported to the data query server for storage.
  • After receiving the logs reported by the data collectors, the data query server stores the data in a database and builds an index. When a log query is needed, it is run directly against the database of the data query server. With this centralized storage, log query operations are confined to the database of the data query server and do not affect the data collectors.
  • The present invention provides a data query method, a data query server, a data collector, and a data query system, which can improve the processing speed of data queries, reduce the system resource usage of the data collectors and the load pressure on the data query server, and improve the processing capacity of the entire system.
  • a first aspect of the present invention provides a data query method, where the method includes:
  • obtaining, by querying the centralized index table corresponding to the field, the collector identifier corresponding to the query word, where the centralized index table stores a correspondence between the query word in the field and the collector identifier; generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain data that matches the query word carried in the query command;
  • before the obtaining, by querying the centralized index table corresponding to the field, of the collector identifier corresponding to the query word, the method further includes: establishing, according to the field, a centralized index table corresponding to the field;
  • the establishing a centralized index table corresponding to the field includes:
  • receiving a report index table of the field sent by each data collector, where the report index table includes the query words corresponding to the field in the data collector that sends the report index table;
  • storing, in the centralized index table of the field, the correspondence between the identifier of the data collector and the query words of the field in the report index table reported by that data collector.
  • the obtaining, by the query, the collector identifier corresponding to the query term from the centralized index table corresponding to the field includes:
  • if the query request carries at least two fields to be queried, obtaining the query words of each field in the query request, and recording the logical relationship between the query words of the fields;
  • the present invention further provides a data query method, the method including: receiving a query command sent by a data query server, where the query command includes the field to be queried and the query word in the field that are carried in a query request received by the data query server; querying, from a local index table corresponding to the field, the storage location of data matching the query word in the query command, where the local index table stores the correspondence between the query word and the storage location of the data;
  • the data is acquired and sent to the data query server according to the storage location of the data.
  • before the querying, from the local index table corresponding to the field, of the storage location of data that matches the query word in the query command, the method further includes:
  • the content of the data in the field is used as a query word of the data, and a mapping relationship between the query word and the storage location is established, and a local index table of the field in the current data collector is formed.
  • after the correspondence between the query word and the storage location of the data is stored in the local index table of the field in the current data collector, the method further includes:
  • the obtaining, by the local index table corresponding to the field, the storage location of the data that matches the query term in the query command includes:
  • if the query command carries at least two fields to be queried, obtaining the query words of each field in the query command, and recording the logical relationship between the query words of the fields;
  • the storage location of the data satisfying the logical relationship is filtered out from the storage location of the data obtained by the query.
  • the present invention further provides a data query server, where the data query server includes:
  • a first receiving unit configured to receive an input query request, where the query request carries a field to be queried and a query word in the field;
  • a first query unit configured to obtain, by querying the centralized index table corresponding to the field, the collector identifier corresponding to the query word carried in the query request received by the first receiving unit, where the centralized index table stores the correspondence between the query word in the field and the collector identifier;
  • a first processing unit configured to generate, according to the query request, a query command that carries the field and the query word, and to send the query command to the data collector corresponding to the collector identifier obtained by the first query unit, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain data that matches the query word carried in the query command;
  • the first output unit is configured to receive the data returned by the data collector, form a query result of the query request according to the received data, and output the result.
  • the data query server further includes:
  • a first index unit configured to establish, according to the field, a centralized index table corresponding to the field; the first index unit includes:
  • a first receiving subunit configured to receive a report index table of the field sent by each data collector, where the report index table includes the query words corresponding to the field in the data collector that sends the report index table;
  • the first index sub-unit is configured to store, in the centralized index table of the field, a correspondence between an identifier of the data collector and a query word of the field in the report index table reported by the data collector.
  • the first query unit includes:
  • a first parsing subunit configured to: if the query request received by the first receiving unit carries at least two fields to be queried, obtain the query words of each field in the query request, and record the logical relationship between the query words of the fields;
  • a first query subunit configured to query, from the centralized index table corresponding to each field, a collector identifier corresponding to the query word of each field acquired by the first parsing subunit;
  • a first filtering subunit configured to filter, according to the logical relationship between the query words of the fields recorded by the first parsing subunit, the collector identifiers obtained by the first query subunit, to obtain the collector identifiers that satisfy the logical relationship.
  • the present invention further provides a data collector, where the data collector includes: a second receiving unit configured to receive a query command sent by a data query server, where the query command carries a field to be queried and the query word in the field;
  • a second query unit configured to query, from a local index table corresponding to the field, the storage location of data that matches the query word in the query command received by the second receiving unit, where the local index table stores the correspondence between the query word in the field and the storage location of the data;
  • a second processing unit configured to acquire the data according to the storage location of the data obtained by querying by the second query unit, and send the data to the data query server.
  • the data collector further includes:
  • the second index unit includes:
  • an obtaining subunit configured to acquire data in the current data collector and the storage location of the data, where the data includes content of at least one field;
  • a second index subunit configured to use, for each field acquired by the obtaining subunit, the content of the data in the field as a query word of the data, and to store, in the local index table of the field in the data collector, the correspondence between the query word of the data and the storage location of the data.
  • the second indexing unit further includes:
  • a third index subunit configured to extract the query words from the local index table of the field obtained by the second index subunit, and to deduplicate the query words to form the report index table of the field for the current data collector;
  • a sending subunit configured to send the report index table of the field formed by the third index subunit to the data query server, so that the data query server establishes a centralized index table of the field.
  • the second query unit includes:
  • a second parsing subunit configured to: if the query command received by the second receiving unit carries at least two fields to be queried, obtain the query words of each field in the query command, and record the logical relationship between the query words of the fields;
  • a second query subunit configured to query, from the local index table corresponding to each field, a storage location of data that matches a query word of each field in the query command acquired by the second parsing subunit;
  • a second filtering subunit configured to filter, according to the logical relationship between the query words of the fields recorded by the second parsing subunit, the storage locations of the data obtained by the second query subunit, to obtain the storage locations of the data that satisfies the logical relationship.
  • the present invention further provides a data query system, where the system includes: the data query server provided by the third aspect, and the data collector provided by the fourth aspect.
  • FIG. 1 is a structural diagram of a data query system according to an embodiment of the present invention
  • FIG. 2 is a signaling diagram of an index establishment process according to Embodiment 1 of the present invention.
  • FIG. 3 is a flowchart of a data query method according to Embodiment 1 of the present invention.
  • FIG. 5 is a schematic diagram of a data query system according to Embodiment 2 of the present invention.
  • FIG. 6 is a schematic diagram of a data query server and a data collector provided by Embodiment 2 of the present invention
  • FIG. 7 is a schematic diagram of a data query server according to Embodiment 3 of the present invention
  • FIG. 8 is a schematic diagram of a data collector provided by Embodiment 3 of the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS The technical solution of the present invention will be further described in detail below with reference to the accompanying drawings and embodiments.
  • FIG. 1 is a structural diagram of a data query system according to an embodiment of the present invention. As shown in FIG. 1, the present invention adopts a distributed architecture, including a data query server 10 and multiple data collectors 20.
  • The data collector 20 is responsible for collecting (including receiving, formatting, merging), storing, and indexing data such as massive logs reported by the log source 30. The data query server 10 is the unified entry for data queries.
  • the data query method provided by the present invention can be used for quick query of massive data.
  • the log data is taken as an example for description.
  • Before log data can be queried, the data stored in the system must be indexed in advance, usually while the data is being stored, so that the system can query the data through the established index tables.
  • a local index table and a centralized index table are respectively established in the data collector and the data query server.
  • the local index table is used to store the index of the log data in the current data collector, and its function is: When the query condition is given, the specific storage location of all the logs in the local data that meet the conditions can be found.
  • The centralized index table stores, for each field, an index from query word to collector identifier. Its function is: given a query condition, it indicates which data collectors may hold the data to be queried, by giving the identification information of those data collectors.
  • Step S101: A data collector acquires the data in the current data collector and the storage location of the data.
  • the data stored in the data collector is the original log data reported by the log source.
  • After the log source reports the raw log data to the data collector, the data collector also needs to build a local index for the original log data.
  • the data collector formats and merges the original log data, and processes the original log data into the form of each record in the log table (that is, each row of records in the log table).
  • Each log table may have multiple fields. As shown in Table 1, the log table includes fields such as field 1 and field 2, and the serial number indicates the storage location of the data.
  • Step S102: For each field, the data collector uses the content of the data in the field as a query word of the data, and establishes a mapping relationship between the query word and the storage location, forming the local index table of the field in the current data collector.
  • A local index table is established for each field of the records in the log table; each index table maps a content value of the specified field to the storage location, in the log table, of the data containing that value.
  • the content of the field is used as the query term for the corresponding data.
  • the mapping relationship between the query term and the storage location may be, but is not limited to, expressed in the form of a table, as shown in Table 2 below, the local index table of the field 1 in the data collector:
  • the specific location of the data on the data collector can be quickly found by the local index table of the corresponding field. For example, if you want the specific location of a content in field 1, you can quickly find the corresponding location based on the local index table of field 1.
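  • Steps S101 and S102 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the function name `build_local_indexes`, the dictionary layout, and the sample rows are assumptions, and the rows do not reproduce Table 1 or Table 2.

```python
from collections import defaultdict


def build_local_indexes(log_table):
    """Build one local index table per field.

    log_table maps a storage location (the serial number) to a record,
    and a record maps field names to field contents.
    """
    local_indexes = defaultdict(lambda: defaultdict(list))
    for location, record in sorted(log_table.items()):
        for field, content in record.items():
            # The field content acts as the query word; it maps to the
            # storage locations of the records that contain it.
            local_indexes[field][content].append(location)
    return local_indexes


# Hypothetical rows for illustration only.
log_table = {
    1: {"field1": "aaa", "field2": "xxx"},
    2: {"field1": "aaa", "field2": "ccc"},
    3: {"field1": "bbb", "field2": "ccc"},
}

local_indexes = build_local_indexes(log_table)
print(dict(local_indexes["field1"]))  # {'aaa': [1, 2], 'bbb': [3]}
```

  • Looking up a query word in the per-field table then returns the storage locations directly, without scanning the log table.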
  • Step S103: The data collector extracts the query words from the local index table of the field and deduplicates them to form the report index table of the field for the current data collector.
  • The data collector derives the report index table from the local index table of the field. The report index table does not contain the original log location for each index entry; it contains only the content of the corresponding field of each data item, that is, the query words. Usually, the query words are deduplicated before reporting so that the reported query words of a field contain no duplicates. For example, field 1 in the table contains aaa twice and bbb once. After deduplication, the report index table contains only one aaa and one bbb, as shown in Table 3 below.
  • The data collector compares the query words extracted from newly added local index entries with the report index table that has already been reported. Query words that are already present are not added to the report index table; only new, non-duplicate query words are reported.
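  • The deduplication and incremental reporting described above can be sketched as follows. The helper names `build_report_index` and `new_words_to_report`, the use of a set to track what has been reported, and the sample values are assumptions made for illustration.

```python
def build_report_index(local_index):
    """Return the deduplicated query words of one field, without locations."""
    return set(local_index.keys())


def new_words_to_report(local_index, already_reported):
    """Return only the query words that have not been reported before."""
    return build_report_index(local_index) - already_reported


# Hypothetical local index table of field 1 (query word -> storage locations).
local_index_field1 = {"aaa": [1, 2], "bbb": [3]}
reported = set()

# First report: every distinct query word is new.
first = new_words_to_report(local_index_field1, reported)
reported |= first

# Later, new records arrive: one repeats "aaa", one introduces "ddd".
local_index_field1.setdefault("aaa", []).append(4)
local_index_field1.setdefault("ddd", []).append(5)
second = new_words_to_report(local_index_field1, reported)
reported |= second

print(sorted(first), sorted(second))  # ['aaa', 'bbb'] ['ddd']
```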
  • Step S104: The data collector sends the report index table of the field to the data query server.
  • The data collector reports the report index table shown in Table 3 to the data query server; afterwards, only new entries in this table are reported.
  • The data query server summarizes the report index tables sent by the multiple data collectors and establishes, in the data query server, a centralized index table corresponding to each field, which specifically includes:
  • Step S105: The data query server receives the report index table of the field sent by each data collector.
  • the report index table of the data collector includes a query word corresponding to the field in the data collector.
  • Step S106: The data query server establishes a mapping relationship between the query words and the collector identifiers, forming the centralized index table of the field.
  • the data query server respectively establishes a centralized index table of corresponding fields.
  • the centralized index table of the data query server stores the query words and corresponding collector identifiers contained in each data collector, as shown in Table 4 below:
  • Each field has its own centralized index table in the data query server, and the report index tables reported by the data collectors are merged into it. That is, the centralized index table of the field on the data query server stores the correspondence between the identifier of each data collector and the query words of the field in the report index table reported by that data collector.
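  • Steps S105 and S106 amount to inverting the per-collector report index tables into a map from query word to collector identifiers. The sketch below assumes a hypothetical `build_centralized_index` helper and made-up collector names; it does not reproduce Table 4.

```python
from collections import defaultdict


def build_centralized_index(reports):
    """Merge the report index tables of one field into a centralized index.

    reports maps a collector identifier to the query words it reported.
    The result maps each query word to the set of collector identifiers
    that reported it.
    """
    centralized = defaultdict(set)
    for collector_id, query_words in reports.items():
        for word in query_words:
            centralized[word].add(collector_id)
    return centralized


# Hypothetical report index tables of field 1 from three collectors.
reports_field1 = {
    "collector1": {"aaa", "bbb"},
    "collector2": {"bbb"},
    "collector3": {"aaa"},
}

centralized_field1 = build_centralized_index(reports_field1)
print(sorted(centralized_field1["aaa"]))  # ['collector1', 'collector3']
```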
  • After the index tables are created, when a query request is received, the data to be queried can be found through the index tables.
  • FIG. 3 is a flowchart of a data query method according to the embodiment.
  • the data query method of the present invention includes: Step S201: The data query server receives the input query request.
  • The user inputs a query request by means of a form search or an expression search.
  • In the form search method, fixed fields are presented in the user interface; the user can enter query words in the input boxes of multiple fields and finally submit the query request to the data query server through the submit button.
  • In the expression search method, the user directly inputs the field to be queried and the query word of the field, and submits them to the data query server.
  • the query request received by the data query server carries the field to be queried and the query term of the field.
  • The query request received by the data query server may also carry the logical relationship between the query words of the respective fields. For example, if the user enters query words in different search fields, the data query server receives the query words in those search fields and the logical relationships between them.
  • Step S202: The data query server queries the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word.
  • The entry matching the query word is looked up in the centralized index table of field 1 and the corresponding collector identifier is obtained, which identifies the data collectors that hold the query word.
  • For example, for the query word aaa, the collector identifiers obtained from the centralized index table shown in Table 4 are collector 1 and collector 3.
  • If the query request carries at least two fields to be queried, this step specifically includes:
  • Step S2021: Acquire the query words of each field in the query request, and record the logical relationship between the query words of the fields carried in the query request.
  • Step S2022: Query, from the centralized index table corresponding to each field, the collector identifiers corresponding to the query words of each field.
  • Step S2023: According to the logical relationship between the query words of the fields, filter the collector identifiers obtained in step S2022 to obtain the collector identifiers that satisfy the logical relationship.
  • For example, the collector identifiers corresponding to aaa are collector 1 and collector 3, and the collector identifier corresponding to ccc is collector 1. If the logical relationship between aaa and ccc is "and", then the only collector identifier that satisfies the logical relationship is collector 1.
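  • Steps S2021 to S2023 can be sketched as set operations over the collector identifiers returned by each field's centralized index table. The function `select_collectors`, the restriction to a single "and" or "or" relationship across all conditions, and the sample data are assumptions for illustration; the patent does not specify how logical relationships are encoded.

```python
def select_collectors(centralized_indexes, conditions, relation="and"):
    """Return the collector identifiers that satisfy the query conditions.

    conditions is a list of (field, query_word) pairs, and relation is the
    logical relationship between them ("and" or "or").
    """
    id_sets = [
        set(centralized_indexes.get(field, {}).get(word, ()))
        for field, word in conditions
    ]
    if not id_sets:
        return set()
    if relation == "and":
        return set.intersection(*id_sets)
    if relation == "or":
        return set.union(*id_sets)
    raise ValueError(f"unsupported logical relationship: {relation}")


# Hypothetical centralized index tables, one per field.
centralized_indexes = {
    "field1": {"aaa": {"collector1", "collector3"}, "bbb": {"collector2"}},
    "field2": {"ccc": {"collector1"}},
}

conditions = [("field1", "aaa"), ("field2", "ccc")]
targets = select_collectors(centralized_indexes, conditions, "and")
print(sorted(targets))  # ['collector1']
```

  • Under "and", a collector that passes this filter holds both query words but not necessarily in the same record, so the selected collector still has to apply the logical relationship to storage locations in its local index tables.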
  • Step S203: The data query server generates, according to the query request, a query command that carries the field and the query word, and sends the query command to the data collector corresponding to the collector identifier.
  • The data collector obtains the data matching the query word through its local index table; the specific process is described in detail in conjunction with FIG. 4.
  • Step S204: The data query server receives the data returned by the data collector, forms the query result of the query request according to the received data, and outputs the result.
  • the data query server summarizes the received data, which may be, but is not limited to, output in the form of a table.
  • FIG. 4 is a flowchart of still another data query method provided by this embodiment. As shown in FIG. 4, the data query method of the present invention includes:
  • Step S301: The data collector receives the query command sent by the data query server.
  • The query command includes the field to be queried and the query word of the field that are carried in the query request received by the data query server and, optionally, the query words of a plurality of fields and the logical relationship between those query words.
  • Step S302: The data collector queries the local index table corresponding to the field to obtain the storage location of the data that matches the query command.
  • Step S3021: The data collector obtains the query words of each field in the query command received in step S301, and records the logical relationship between the query words of the fields carried in the query command.
  • Step S3022: The data collector queries, from the local index table corresponding to each field, the storage locations of the data that matches the query words of the respective fields.
  • For example, the storage location of the matching data obtained from the local index table of field 1 is 2, and the storage location of the matching data obtained from the local index table of field 2 is also 2.
  • Step S3023: The data collector filters, according to the logical relationship between the query words of the fields, the storage locations of the data obtained by the query, to obtain the storage locations of the data that satisfies the logical relationship.
  • the data collector filters the storage location of the matched data according to the logical relationship between the query words to obtain the storage location of the data satisfying the logical relationship.
  • Step S303: The data collector acquires the data according to the storage location of the data, and sends the data to the data query server.
  • the data collector obtains the corresponding data according to the storage location of the data obtained in step S302.
  • the data that can be obtained to satisfy the query command is:
  • the data collector sends the data to the data query server, and the data query server aggregates and outputs the query result of the query request.
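  • The collector-side steps S302 and S303 can be sketched in the same style. The functions `query_locations` and `fetch_records`, the single "and"/"or" relationship, and the sample log table are illustrative assumptions rather than the patent's own tables or interfaces.

```python
def query_locations(local_indexes, conditions, relation="and"):
    """Return the sorted storage locations that satisfy the query command.

    conditions is a list of (field, query_word) pairs, and relation is the
    logical relationship between them ("and" or "or").
    """
    location_sets = [
        set(local_indexes.get(field, {}).get(word, ()))
        for field, word in conditions
    ]
    if not location_sets:
        return []
    if relation == "and":
        matched = set.intersection(*location_sets)
    elif relation == "or":
        matched = set.union(*location_sets)
    else:
        raise ValueError(f"unsupported logical relationship: {relation}")
    return sorted(matched)


def fetch_records(log_table, locations):
    """Read the records stored at the given locations."""
    return [log_table[location] for location in locations]


# Hypothetical log table and the local index tables derived from it.
log_table = {
    1: {"field1": "aaa", "field2": "xxx"},
    2: {"field1": "aaa", "field2": "ccc"},
    3: {"field1": "bbb", "field2": "ccc"},
}
local_indexes = {
    "field1": {"aaa": [1, 2], "bbb": [3]},
    "field2": {"xxx": [1], "ccc": [2, 3]},
}

command = [("field1", "aaa"), ("field2", "ccc")]
locations = query_locations(local_indexes, command, "and")
records = fetch_records(log_table, locations)
print(locations, records)  # [2] [{'field1': 'aaa', 'field2': 'ccc'}]
```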
  • In this embodiment, a local index table and a centralized index table are established in the data collector and the data query server, respectively. When data is queried, the collector identifier corresponding to the query word is found through the centralized index table of the corresponding field, and the corresponding data is then obtained from the data collector identified by that collector identifier. This effectively reduces the system resource usage of the data query server and the data collectors, leaves the data collectors more resources for collection, and improves the processing speed of data queries.
  • FIG. 5 is a schematic diagram of a data query system according to the embodiment.
  • the data query system of the embodiment of the present invention includes: a data query server 10 and a data collector 20.
  • the data collector 20 is responsible for data collection, including receiving, formatting, merging, storing data, and indexing the stored data.
  • the data query server 10 is used for unified management of the contents stored on the plurality of data collectors 20 and serves as a unified entry for data query.
  • FIG. 6 is a schematic diagram of the data query server 10 and the data collector 20 provided in this embodiment.
  • the data query server 10 includes a first index unit 100, a first receiving unit 101, and a first query unit 102.
  • the data collector 20 includes a second index unit 200, a second receiving unit 201, a second query unit 202, and a second processing unit 203.
  • the data collector 20 and the data query server 10 need to pre-index the data stored in the system, which is usually completed during the data storage, and is used by the system to query the data according to the established index table.
  • the data query server 10 establishes a centralized index table of the fields for each of the fields using the first index unit 100.
  • the data collector 20 uses the second index unit 200 to establish a local index table of fields.
  • the local index table is used to store the index of the log data in the current data collector.
  • the function is: When the query condition is given, the specific storage location of all the logs in the local data that meet the conditions can be found.
  • The centralized index table stores the index between the data to be queried and the collector identifier. Its function is: given a query condition, it indicates which data collectors may hold the data to be queried, by giving the identification information of the data collectors that store that data.
  • the second index unit 200 includes an acquisition subunit 2001, a second index subunit 2002, a third index subunit 2003, and a transmission subunit 2004.
  • the obtaining subunit 2001 is configured to acquire data in the current data collector and a storage location of the data, where the data includes content of at least one field.
  • the data stored in the data collector is the original log data reported by the log source.
  • the data collector After the log source reports the raw log data to the data collector, the data collector also needs to build a local index for the original log data.
  • the data collector formats and merges the original log data, and processes the original log data into the form of each record in the log table (ie, each row in the log table), and each log table may There are multiple fields, as shown in Table 1, the log table includes fields such as field 1 and field 2, and the serial number indicates the storage location of the data.
  • The second index subunit 2002 is configured to use, for each field, the content of the data in the field as a query word of the data, and to establish a mapping relationship between the query word and the storage location, forming the local index table of the field in the current data collector.
  • A local index table is established for each field of the records in the log table; each index table maps a content value of the specified field to the storage location, in the log table, of the data containing that value.
  • the field content is used as the query word of the corresponding data.
  • the mapping relationship between the query term and the storage location may be, but is not limited to, expressed in the form of a table, as shown in Table 2.
  • the specific location of the data on the data collector can be quickly found by the local index table of the corresponding field. For example, if you want the specific location of a content in field 1, you can quickly find the corresponding location based on the local index table of field 1.
  • the third index subunit 2003 is configured to extract the query word from the local index table of the field, and perform deduplication processing on the query word to form a report index table of the field of the current data collector.
  • the data collector extracts the reported report index table according to the local index table of the field. There is no specific original log location corresponding to each index content in the report index table, and only the content of each data corresponding field, that is, the query word. Usually, before the report, the query words are de-reprocessed so that the fields of the reported fields are not duplicated. For example, the field 1 in the table has two aaa and one bbb. After de-reprocessing, there is only one aaa and one bbb in the index table, as shown in Table 3.
  • Query words extracted from newly added local index entries are compared with the report index table that has already been reported. Query words that are already present are not added to the report index table again; only the newly added, non-duplicate query words are reported.
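
Deduplication and incremental reporting can be sketched as below. The row layout `(query_word, storage_location)` and the helper names are assumptions for illustration; the text does not prescribe a data structure.

```python
def build_report_index(local_index_rows):
    """Extract the query words from (query_word, location) rows and
    deduplicate them. Storage locations are deliberately dropped."""
    return {word for word, _location in local_index_rows}


def new_query_words(local_index_rows, already_reported):
    """Return only the query words that have not been reported yet."""
    return build_report_index(local_index_rows) - already_reported


# Hypothetical local index rows for field 1: aaa twice, bbb once.
rows = [("aaa", 1), ("aaa", 2), ("bbb", 3)]
reported = build_report_index(rows)

# Later, new records arrive: one repeats aaa, one introduces ddd.
rows += [("aaa", 4), ("ddd", 5)]
delta = new_query_words(rows, reported)
```

Only `delta` needs to be sent to the data query server in the second round, which keeps the reporting traffic proportional to the number of new distinct query words rather than to the number of log records.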
  • the sending sub-unit 2004 is configured to send the report index table of the field to the first index unit 100 of the data query server 10, so that the data query server establishes the centralized index table of the field.
  • the first index unit 100 includes a first receiving subunit 1001 and a first index subunit 1002.
  • the first receiving subunit 1001 is configured to receive a reporting index table of the field sent by each data collector.
  • the report index table includes the query words, for the field, of the data in the data collector that sent the report index table.
  • the first index sub-unit 1002 is configured to establish a mapping relationship between the query word and the collector identifier, and form a centralized index table of the field.
  • the first index subunit 1002 respectively establishes a centralized index table of corresponding fields for different fields. For example, for field 1, the query words contained in each data collector and the corresponding collector identifiers are stored in the centralized index table, as shown in Table 4.
  • the first index sub-unit 1002 establishes a centralized index table for each field, and the report index table reported by each data collector is summarized into a centralized index table of the data query server.
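
Merging the report index tables into a centralized index table for one field might look like the following sketch. The collector identifiers and the function name are hypothetical.

```python
from collections import defaultdict


def build_centralized_index(report_indexes):
    """Build query word -> set of collector identifiers for one field.

    report_indexes maps a collector identifier to the deduplicated
    query words that collector reported for the field.
    """
    centralized = defaultdict(set)
    for collector_id, query_words in report_indexes.items():
        for word in query_words:
            centralized[word].add(collector_id)
    return dict(centralized)


# Hypothetical report index tables for field 1.
reports_field1 = {
    "collector1": {"aaa", "bbb"},
    "collector2": {"aaa"},
}

centralized_field1 = build_centralized_index(reports_field1)
```

The centralized index stores no storage locations and no log content, only which collectors hold data for each query word.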
  • After the first index unit 100 and the second index unit 200 have established the index tables in the data query server 10 and the data collector 20, respectively, the data to be queried can be found through the index tables when a query request is received.
  • the first receiving unit 101 is configured to receive an input query request.
  • The user inputs a query request by means of a form search or an expression search.
  • In a form search, the user interface presents fixed fields; the user can enter query words in the input boxes of multiple fields and then submit the query request to the data query server with the submit button.
  • In an expression search, the user directly enters the field to be queried and the query word of that field and submits them to the data query server.
  • the query request received by the first receiving unit 101 carries the field to be queried and the query word of the field.
  • When multiple fields are queried, the query request received by the first receiving unit 101 further includes the logical relationship between the query words of the respective fields. For example, if the user enters query words in different search fields, the first receiving unit 101 receives the query word of each search field together with the logical relationship between those query words.
  • the first querying unit 102 is configured to obtain, from the centralized index table of the field, the collector identifier corresponding to the query word carried by the query request received by the first receiving unit 101.
  • For example, when the query request includes field 1, the first query unit 102 looks up the entry matching the query word in the centralized index table of field 1 and obtains the corresponding collector identifier, thereby identifying the data collector that holds data for the query word.
  • the first query unit 102 includes: a first parsing subunit, a first query subunit, and a first filtering subunit (not shown).
  • the first parsing subunit is configured to obtain the query word of each field in the query request, and to record the logical relationship between the query words of the fields.
  • the first query sub-unit is configured to query, from the centralized index table corresponding to the field, the collector identifier corresponding to the query word of each field obtained by the first parsing sub-unit.
  • the first filtering subunit is configured to filter, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, the collector identifiers obtained by the first query subunit, to obtain the collector identifiers that satisfy the logical relationship.
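
The parse, query, and filter steps on the data query server can be sketched as follows. The sketch assumes a single relation, written as the string "AND" or "OR", applies across all conditions; the function name and data layout are hypothetical.

```python
def select_collectors(centralized_indexes, conditions, relation):
    """Return the collector identifiers that satisfy the relation.

    centralized_indexes: field -> {query_word -> set of collector ids}
    conditions: list of (field, query_word) pairs from the query request
    relation: "AND" or "OR" (assumed encoding of the logical relationship)
    """
    matches = [
        centralized_indexes.get(field, {}).get(word, set())
        for field, word in conditions
    ]
    if not matches:
        return set()
    if relation == "AND":
        return set.intersection(*matches)
    if relation == "OR":
        return set.union(*matches)
    raise ValueError(f"unsupported relation: {relation}")


# Hypothetical centralized index tables for two fields.
centralized = {
    "field1": {"aaa": {"collector1", "collector2"}},
    "field2": {"ccc": {"collector1"}},
}
conditions = [("field1", "aaa"), ("field2", "ccc")]

and_targets = select_collectors(centralized, conditions, "AND")
or_targets = select_collectors(centralized, conditions, "OR")
```

Note that for "AND" the intersection is only a necessary condition: a collector may hold both query words in different records, so the collector still applies the same relation to storage locations before returning data.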
  • the first processing unit 103 is configured to generate, according to the query request, a query command that carries the field and the query word, and send the query command to the data collector 20 corresponding to the collector identifier obtained by the first query unit 102.
  • the first query unit 102 queries the centralized index table shown in Table 5 to obtain the collector identifier corresponding to ccc as the collector 1, and the logical relationship between aaa and ccc is "OR", and the first processing unit 103 generates
  • the second receiving unit 201 of the data collector 20 is configured to receive the query command sent by the data query server 10.
  • the query command includes the field to be queried and the query word of that field, as carried in the query request received by the data query server 10, and may include the query words of multiple fields together with the logical relationship between them.
  • the second query unit 202 is configured to query, from the local index table corresponding to the field, a storage location of data that matches the query word in the query command received by the second receiving unit 201.
  • the second query unit 202 includes: a second parsing subunit, a second query subunit, and a second filtering subunit (not shown).
  • the second parsing sub-unit is configured to: when the query command received by the second receiving unit 201 carries a plurality of fields to be queried, obtain the query word of each field in the query command, and record the logical relationship, carried in the query command, between the query words of the fields.
  • the second query subunit is configured to query, from the local index table corresponding to each field, the storage location of the data matching the query word of each field in the query command obtained by the second parsing subunit.
  • the second filtering subunit is configured to filter, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, from the storage location of the data obtained by querying by the second query subunit The storage location of the data that satisfies the logical relationship.
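
On the data collector, the same relation is applied to storage locations instead of collector identifiers. The following sketch uses hypothetical names and sample data, and again assumes one "AND" or "OR" relation across all conditions.

```python
def find_locations(local_indexes, conditions, relation):
    """Return the sorted storage locations that satisfy the relation.

    local_indexes: field -> {query_word -> list of storage locations}
    conditions: list of (field, query_word) pairs from the query command
    """
    matches = [
        set(local_indexes.get(field, {}).get(word, []))
        for field, word in conditions
    ]
    if not matches:
        return []
    if relation == "AND":
        result = set.intersection(*matches)
    elif relation == "OR":
        result = set.union(*matches)
    else:
        raise ValueError(f"unsupported relation: {relation}")
    return sorted(result)


def fetch_records(log_table, locations):
    """Read the records stored at the given locations."""
    return [log_table[location] for location in locations]


# Hypothetical log table and its local index tables.
log_table = {
    1: {"field1": "aaa", "field2": "ccc"},
    2: {"field1": "aaa", "field2": "ddd"},
    3: {"field1": "bbb", "field2": "ccc"},
}
local_indexes = {
    "field1": {"aaa": [1, 2], "bbb": [3]},
    "field2": {"ccc": [1, 3], "ddd": [2]},
}
conditions = [("field1", "aaa"), ("field2", "ccc")]

and_locations = find_locations(local_indexes, conditions, "AND")
or_locations = find_locations(local_indexes, conditions, "OR")
and_records = fetch_records(log_table, and_locations)
```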
  • the second processing unit 203 is configured to obtain the data according to the storage location of the data that is queried by the second query unit 202, and send the data to the data query server.
  • the second processing unit 203 sends the data queried by the second query unit 202 to the first output unit 104 of the data query server 10 for outputting the query result of the query request.
  • the first output unit 104 is configured to receive the data returned by the second processing unit 203 of the data collector 20, form a query result of the query request according to the received data, and output the result.
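
Putting the units together, a single-field query could flow as in the sketch below. The `Collector` and `QueryServer` classes, their methods, and the `queries_served` counter are illustrative assumptions; for brevity the collector rebuilds its local index on each call, which a real system would avoid.

```python
class Collector:
    def __init__(self, collector_id, log_table):
        self.collector_id = collector_id
        self.log_table = log_table  # storage location -> record
        self.queries_served = 0     # counts query commands received

    def local_index(self, field):
        index = {}
        for location, record in self.log_table.items():
            index.setdefault(record[field], []).append(location)
        return index

    def report_index(self, field):
        # Deduplicated query words only; no storage locations.
        return set(self.local_index(field))

    def query(self, field, word):
        self.queries_served += 1
        locations = self.local_index(field).get(word, [])
        return [self.log_table[location] for location in locations]


class QueryServer:
    def __init__(self, collectors, field):
        self.collectors = {c.collector_id: c for c in collectors}
        self.field = field
        self.centralized = {}  # query word -> set of collector ids
        for c in collectors:
            for word in c.report_index(field):
                self.centralized.setdefault(word, set()).add(c.collector_id)

    def query(self, word):
        results = []
        # Only collectors listed in the centralized index are contacted.
        for collector_id in sorted(self.centralized.get(word, set())):
            results.extend(self.collectors[collector_id].query(self.field, word))
        return results


c1 = Collector("collector1", {1: {"field1": "aaa"}, 2: {"field1": "bbb"}})
c2 = Collector("collector2", {1: {"field1": "aaa"}})
c3 = Collector("collector3", {1: {"field1": "zzz"}})
server = QueryServer([c1, c2, c3], "field1")
result = server.query("bbb")
```

In this run only `collector1` receives a query command; the other two collectors are never contacted, which is the resource saving the embodiment describes.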
  • In the data query server, the data collector, and the system provided by the embodiment of the present invention, the first index unit establishes a centralized index table in the data query server and the second index unit establishes a local index table in the data collector, which improves the processing speed of data queries.
  • Because the centralized index table identifies which data collectors hold matching data, the present invention does not need to query every data collector, which reduces the burden on the data collectors.
  • The solution provided by the embodiment of the present invention can therefore effectively reduce the system resource usage of the data collector, leaving the data collector more resources for collection and improving the overall processing capability of the system.
  • FIG. 7 is a schematic diagram of the data query server 10 according to the embodiment.
  • the data query server 10 includes: a network interface 71, a processor 72, and a memory 73.
  • the system bus 74 is used to connect the network interface 71, the processor 72, and the memory 73.
  • Network interface 71 is used to communicate with data collector 20.
  • the memory 73 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers.
  • the software modules implement the functional modules of the above-described method of the present invention; the device drivers may be network and interface drivers.
  • Receiving an input query request, where the query request carries a field to be queried and a query word in the field; querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word, where the centralized index table stores the correspondence between query words in the field and collector identifiers; generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain data matching the query word carried in the query command;
  • the data query server of the present embodiment finds the collector identifier corresponding to the query word through the centralized index table of the field, so that the corresponding data is obtained only from the data collector corresponding to that collector identifier. This effectively reduces the system resource usage of the data query server and improves the processing speed of data queries.
  • the above instruction process is a process in which the data query server establishes a centralized index table by establishing a mapping relationship between query words and collector identifiers, so that when data is queried, the collector identifier corresponding to the query word is found and the matching data is obtained from the corresponding data collector.
  • the query words of each field in the query request are obtained, and the logical relationship between the query words of the fields is recorded;
  • the collector identifier obtained from the query is filtered to obtain a collector identifier that satisfies the logical relationship.
  • the above instruction process is a process in which the data query server searches for a corresponding collector identifier for a plurality of query words of the field to be queried, and can avoid accessing data that cannot fully satisfy the query request.
  • FIG. 8 is a schematic diagram of the data collector 20 provided by the embodiment. As shown in FIG. 8, the data collector 20 includes: a network interface 81, a processor 82, and a memory 83. System bus 84 is used to connect network interface 81, processor 82, and memory 83.
  • Network interface 81 is used to communicate with data query server 10.
  • the memory 83 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers.
  • the software modules implement the functional modules of the above-described method of the present invention; the device drivers may be network and interface drivers.
  • Receiving a query command sent by the data query server, where the query command includes the field to be queried and the query word in the field that are carried in the query request received by the data query server; querying the local index table corresponding to the field to obtain the storage location of the data matching the query word in the query command, where the local index table stores the correspondence between query words in the field and storage locations of the data;
  • the data is acquired and sent to the data query server according to the storage location of the data.
  • the data collector of the embodiment finds the data corresponding to the query word through the local index table of the field and provides it to the data query server. This effectively reduces the system resource usage of the data collector, leaving it more resources for collection, and improves the processing speed of data queries.
  • the above instruction process is a process in which the data collector establishes a local index table by establishing a mapping relationship between query words and storage locations of the data, so that the data can be obtained from the storage location corresponding to the query word.
  • the above instruction process is a process in which the data collector establishes a report index table according to the local index table and sends it to the data query server, so that the data query server establishes a centralized index table.
  • When the query command carries at least two fields to be queried, the query words of each field in the query command are obtained, and the logical relationship between the query words of the fields is recorded;
  • the storage location of the data satisfying the logical relationship is filtered out from the storage location of the data obtained by the query.
  • the above instruction process is a process in which the data collector finds a corresponding storage location for a plurality of query words of the field to be queried, and can avoid obtaining data that cannot completely satisfy the query command.

Abstract

Provided are a data query method and system. The method comprises: receiving an input query request, where the query request carries a field to be queried and a query word in the field; querying a centralized index table corresponding to the field to obtain a collector identifier corresponding to the query word; generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries its local index table corresponding to the field carried in the query command for data matching the query word carried in the query command; and receiving the data returned by the data collector, forming a query result of the query request according to the received data, and outputting the result. The present invention can improve the processing speed of data queries, reduce the system resource usage of the data collector, and reduce the load on the data query server.

Description

Data query method and system
This application claims priority to Chinese Patent Application No. 201210566137.4, filed with the Chinese Patent Office on December 24, 2012 and entitled "Data Query Method and System", which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of computer network technologies, and in particular, to a data query method, a data query server, a data collector, and a data query system.
Background
In the current era of highly developed Internet technology, data collection and query systems are widely used. Information technology (IT) systems, network devices, and security devices all generate large amounts of data such as logs, and much of this log data must be archived for long periods and used for audits and queries.
In a collection and query system for massive data, the architecture uses either distributed storage or centralized storage. Either way, the system must store and query massive log data quickly.
One existing distributed data query system includes one data query server and multiple data collectors. The data collectors collect logs (receiving, formatting, and merging), store them, and build indexes; the data query server is the single entry point for log queries. To query a specified log, the data query server sends a query command to all data collectors and, after receiving their query results, aggregates them into the final query result. If there are many data collectors, the logs sought by a query exist on only a few of them, and queries are frequent, this scheme increases the burden on all data collectors, including their power consumption and central processing unit (CPU) resources. Moreover, besides serving queries, a data collector must also receive data and write it to storage, so frequent queries degrade its collection performance and reduce the overall processing capability of the system.
Another existing distributed data query system stores the original log data centrally. Each data collector only collects (receives, formats, merges) and reports logs; after processing, the log content is not saved locally but is reported to the data query server for storage. After receiving the logs reported by the data collectors, the data query server stores them centrally in a database and builds an index, so a log query is performed directly against the database of the data query server. With centralized storage, log queries run only in the database of the data query server and do not affect the data collectors. However, because the log data is stored centrally in that database, the data collectors must report large amounts of log data. This greatly increases the load on the data query server and consumes a large amount of bandwidth between the data collectors and the data query server, which limits the number of data collectors one data query server can support and caps the processing capability of the whole system.
Summary of the Invention
The present invention provides a data query method, a data query server, a data collector, and a data query system that can increase the processing speed of data queries, reduce the system resource usage of the data collector and the load on the data query server, and improve the processing capability of the whole system.
To achieve the above objective, a first aspect of the present invention provides a data query method, the method including:
receiving an input query request, where the query request carries a field to be queried and a query word in the field;
querying a centralized index table corresponding to the field to obtain a collector identifier corresponding to the query word, where the centralized index table stores the correspondence between query words in the field and collector identifiers;
generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain data matching the query word carried in the query command; and
receiving the data returned by the data collector, forming a query result of the query request according to the received data, and outputting the query result.
With reference to the first aspect, in a first possible implementation of the first aspect, before the querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word, the method further includes: establishing, for the field, the centralized index table corresponding to the field.
The establishing the centralized index table corresponding to the field includes:
receiving a report index table of the field sent by each data collector, where the report index table includes the query words, for the field, of the data in the data collector that sent the report index table; and
storing, in the centralized index table of the field, the correspondence between the identifier of each data collector and the query words of the field in the report index table reported by that data collector.
With reference to the first aspect, in a second possible implementation of the first aspect, the querying the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word includes:
if the query request carries at least two fields to be queried, obtaining the query word of each field in the query request, and recording the logical relationship between the query words of the fields;
querying the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of each field; and
filtering, according to the logical relationship between the query words of the fields, the collector identifiers obtained by the query to obtain the collector identifiers that satisfy the logical relationship.
In a second aspect, the present invention further provides a data query method, the method including:
receiving a query command sent by a data query server, where the query command includes the field to be queried and the query word in the field that are carried in a query request received by the data query server;
querying a local index table corresponding to the field to obtain the storage location of data matching the query word in the query command, where the local index table stores the correspondence between query words in the field and storage locations of the data; and
obtaining the data according to the storage location of the data, and sending the data to the data query server.
With reference to the second aspect, in a first possible implementation of the second aspect, before the querying the local index table corresponding to the field to obtain the storage location of data matching the query word in the query command, the method further includes:
establishing, for the field, the local index table corresponding to the field.
The establishing the local index table corresponding to the field includes:
obtaining the data in the current data collector and the storage location of the data, where the data includes content of at least one field; and
for each field, using the content of the data in that field as the query word of the data, and establishing a mapping between the query word and the storage location to form the local index table of that field on the current data collector.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, after the storing, in the local index table of the field on the current data collector, the correspondence between the query word of the data and the storage location of the data, the method further includes:
extracting the query words from the local index table of the field, and deduplicating the query words to form the report index table of the field for the current data collector; and
sending the report index table of the field to the data query server, so that the data query server establishes the centralized index table corresponding to the field.
With reference to the second aspect, in a third possible implementation of the second aspect, the querying the local index table corresponding to the field to obtain the storage location of data matching the query word in the query command includes:
if the query command carries at least two fields to be queried, obtaining the query word of each field in the query command, and recording the logical relationship between the query words of the fields;
querying the local index table corresponding to each field to obtain the storage locations of data matching the query word of each field in the query command; and
filtering, according to the logical relationship between the query words of the fields, the storage locations obtained by the query to obtain the storage locations of data that satisfy the logical relationship.
In a third aspect, the present invention further provides a data query server, the data query server including:
a first receiving unit, configured to receive an input query request, where the query request carries a field to be queried and a query word in the field;
a first query unit, configured to query the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word carried in the query request received by the first receiving unit, where the centralized index table stores the correspondence between query words in the field and collector identifiers;
a first processing unit, configured to generate, according to the query request, a query command carrying the field and the query word, and send the query command to the data collector corresponding to the collector identifier obtained by the first query unit, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain data matching the query word carried in the query command; and
a first output unit, configured to receive the data returned by the data collector, form a query result of the query request according to the received data, and output the query result.
With reference to the third aspect, in a first possible implementation of the third aspect, the data query server further includes:
a first index unit, configured to establish, for the field, the centralized index table corresponding to the field.
The first index unit includes:
a first receiving subunit, configured to receive a report index table of the field sent by each data collector, where the report index table includes the query words, for the field, of the data in the data collector that sent the report index table; and
a first index subunit, configured to store, in the centralized index table of the field, the correspondence between the identifier of each data collector and the query words of the field in the report index table reported by that data collector.
With reference to the third aspect, in a second possible implementation of the third aspect, the first query unit includes:
a first parsing subunit, configured to: if the query request received by the first receiving unit carries at least two fields to be queried, obtain the query word of each field in the query request, and record the logical relationship between the query words of the fields;
a first query subunit, configured to query the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of each field obtained by the first parsing subunit; and
a first filtering subunit, configured to filter, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, the collector identifiers obtained by the first query subunit to obtain the collector identifiers that satisfy the logical relationship.
In a fourth aspect, the present invention further provides a data collector, the data collector including:
a second receiving unit, configured to receive a query command sent by a data query server, where the query command carries a field to be queried and a query word in the field;
a second query unit, configured to query the local index table corresponding to the field to obtain the storage location of data matching the query word in the query command received by the second receiving unit, where the local index table stores the correspondence between query words in the field and storage locations of the data; and
a second processing unit, configured to obtain the data according to the storage location of the data obtained by the second query unit, and send the data to the data query server.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the data collector further includes:
a second index unit, configured to establish, for the field, the local index table corresponding to the field, where the second index unit includes:
an obtaining subunit, configured to obtain the data in the current data collector and the storage locations of the data, where the data includes content of at least one field; and
a second index subunit, configured to: for each field obtained by the obtaining subunit, use the content of the data in that field as the query word of the data, and store, in the local index table of the field in the data collector, the correspondence between the query word of the data and the storage location of the data.
With reference to the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the second index unit further includes:
a third index subunit, configured to extract the query words from the local index table of the field obtained by the second index subunit, and deduplicate the query words to form the reporting index table of the field for the current data collector; and
a sending subunit, configured to send the reporting index table of the field formed by the third index subunit to the data query server, so that the data query server establishes the centralized index table of the field.
With reference to the fourth aspect, in a third possible implementation of the fourth aspect, the second query unit includes:
a second parsing subunit, configured to: if the query command received by the second receiving unit carries at least two fields to be queried, obtain the query word of each field in the query command, and record the logical relationship between the query words of the fields;
a second query subunit, configured to query the local index table corresponding to each field to obtain the storage locations of data that matches the query word of each field in the query command obtained by the second parsing subunit; and
a second filtering subunit, configured to filter, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, the storage locations obtained by the second query subunit, so as to obtain the storage locations of the data that satisfies the logical relationship.
According to a fifth aspect, the present invention further provides a data query system, where the system includes the data query server provided in the third aspect and the data collector provided in the fourth aspect.
According to the data query method, data query server, data collector, and data query system provided in the embodiments of the present invention, a local index table and a centralized index table are established in the data collector and the data query server, respectively. This effectively reduces the system resource usage of the data collector, so that the data collector has more resources available for improving collection performance, which in turn improves the overall processing capability of the system and the processing speed of data queries.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is an architecture diagram of a data query system according to an embodiment of the present invention;
FIG. 2 is a signaling diagram of an index establishment process according to Embodiment 1 of the present invention;
FIG. 3 is a flowchart of a data query method according to Embodiment 1 of the present invention;
FIG. 4 is a flowchart of another data query method according to Embodiment 1 of the present invention;
FIG. 5 is a schematic diagram of a data query system according to Embodiment 2 of the present invention;
FIG. 6 is a schematic diagram of a data query server and a data collector according to Embodiment 2 of the present invention;
FIG. 7 is a schematic diagram of a data query server according to Embodiment 3 of the present invention;
FIG. 8 is a schematic diagram of a data collector according to Embodiment 3 of the present invention.
DETAILED DESCRIPTION
The technical solutions of the present invention are described in further detail below with reference to the accompanying drawings and embodiments.
FIG. 1 is an architecture diagram of a data query system according to an embodiment of the present invention. As shown in FIG. 1, the present invention uses a distributed architecture that includes one data query server 10 and multiple data collectors 20. The data collectors 20 are responsible for collecting (including receiving, formatting, and merging), storing, and indexing data such as the massive logs reported by the log sources 30. The data query server 10 is the unified entry point for data queries.
The data query method provided by the present invention can be used for fast querying of massive data. The following embodiments use log data as an example.
Embodiment 1
Before log data is queried, an index must be built in advance for the data already stored in the system. This is usually done when the data is stored, so that the system can later query the data according to the established index tables.
In this embodiment, a local index table and a centralized index table are established in the data collector and the data query server, respectively. The local index table stores the index of the log data in the current data collector: given a query condition, it is used to find the specific storage locations of all logs in the local data that satisfy the condition. The centralized index table stores the index from the query words of each field to collector identifiers: given a query condition, it is used to find the data collectors on which the data to be queried may be stored, because it gives the identification information of the data collectors that store that data.
FIG. 2 is a signaling diagram of the index establishment process provided in this embodiment. As shown in FIG. 2, the process includes the following steps.
Step S101: The data collector obtains the data in the current data collector and the storage locations of the data.
Optionally, the data stored in the data collector is the raw log data reported by a log source. After the log source reports the raw log data to the data collector, the data collector also needs to build a local index for the raw log data.
The data collector formats and merges the raw log data, converting it into records of a log table (that is, one row of the log table per record). Each log table may have multiple fields. As shown in Table 1 below, the log table includes fields such as field 1 and field 2, and the sequence number indicates the storage location of the data.
Table 1 (log table; the table body is an image in the original, Figure imgf000010_0001)
Step S102: For each field, the data collector uses the content of the data in that field as the query word of the data, and establishes a mapping between the query word and the storage location, forming the local index table of that field in the current data collector.
In the local index of the data collector, a local index table is established separately for each field of the records in the log table. Each index table maps the content of one field in the specified log table to the storage locations, in the log table, of the data that contains that field content. The field content serves as the query word of the corresponding data. The mapping between query words and storage locations may be, but is not limited to being, represented as a table. Table 2 below shows the local index table of field 1 in the data collector:
Table 2 (local index table of field 1; the table body is an image in the original, Figure imgf000011_0001. The surviving text fragment shows the field 1 content aaa with the value 2.)
The local index table of a field makes it possible to quickly find the specific location of data on the data collector. For example, to find the specific location of some content of field 1, the corresponding location can be looked up directly in the local index table of field 1.
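Step S102 can be illustrated with a minimal Python sketch. It assumes that the log table is a mapping from sequence number (storage location) to a record, and that each record is a mapping from field name to field content. The function name `build_local_indexes`, the field names, and the sample records are hypothetical and are not taken from the original disclosure.

```python
from collections import defaultdict

def build_local_indexes(log_table):
    """Build one local index table per field.

    log_table maps a storage location (sequence number) to a record,
    where a record is a dict of field name -> field content.
    Returns {field: {query_word: [storage locations]}}.
    """
    indexes = defaultdict(lambda: defaultdict(list))
    for location, record in sorted(log_table.items()):
        for field, content in record.items():
            # The field content itself serves as the query word.
            indexes[field][content].append(location)
    return {field: dict(words) for field, words in indexes.items()}

# Hypothetical log table: sequence number -> record.
log_table = {
    1: {"field1": "aaa", "field2": "ddd"},
    2: {"field1": "aaa", "field2": "ccc"},
    3: {"field1": "bbb", "field2": "ddd"},
}
local_indexes = build_local_indexes(log_table)
print(local_indexes["field1"])  # {'aaa': [1, 2], 'bbb': [3]}
```

In this sketch a lookup such as `local_indexes["field1"]["aaa"]` returns the storage locations directly, without scanning the log table.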
Step S103: The data collector extracts the query words from the local index table of the field and deduplicates them, forming the reporting index table of that field for the current data collector.
The data collector derives the reporting index table from the local index table of the field. The reporting index table no longer contains the specific raw log location for each index entry; it contains only the content of the field for each piece of data, that is, the query words. Before reporting, the query words are usually also deduplicated, so that the reported content of each field contains no duplicates. For example, if the content of field 1 in the table contains two aaa entries and one bbb entry, then after deduplication the reporting index table contains only one aaa and one bbb, as shown in Table 3 below.
Table 3
Field 1 content
aaa
bbb
For query words extracted from newly added entries of the local index table, the query words are compared with the reporting index table that has already been reported. A query word that is already present is not added to the reporting index table; only the newly added, non-duplicate entries of the reporting index table are uploaded.
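The deduplication and incremental reporting of step S103 can be sketched as follows. The sketch assumes that a local index table is available as a list of (query word, storage location) rows and that the collector remembers the set of query words it has already reported. The names `build_report_index` and `incremental_report` and the sample rows are hypothetical.

```python
def build_report_index(local_index_rows):
    """Drop storage locations and duplicate query words, keeping first-seen order."""
    seen = set()
    report = []
    for query_word, _location in local_index_rows:
        if query_word not in seen:
            seen.add(query_word)
            report.append(query_word)
    return report

def incremental_report(local_index_rows, already_reported):
    """Return only the query words that have not been reported before."""
    new_words = [word for word in build_report_index(local_index_rows)
                 if word not in already_reported]
    already_reported.update(new_words)
    return new_words

reported = set()
first = incremental_report([("aaa", 1), ("aaa", 2), ("bbb", 3)], reported)
second = incremental_report([("aaa", 4), ("eee", 5)], reported)
print(first)   # ['aaa', 'bbb']
print(second)  # ['eee']
```

Because storage locations are dropped and repeated query words are sent only once, the reported table stays much smaller than the local index table.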
Step S104: The data collector sends the reporting index table of the field to the data query server.
The data collector reports the reporting index table shown in Table 3 to the data query server. For content newly added to the local index table, only the newly added content of the reporting index table is reported.
The data query server aggregates the reporting index tables sent by the multiple data collectors and establishes, in the data query server, a centralized index table for each field. This specifically includes the following steps.
Step S105: The data query server receives the reporting index table of the field sent by each data collector.
The reporting index table of a data collector includes the query words that correspond to the field in the data stored on that data collector.
Step S106: The data query server establishes a mapping between the query words and the collector identifiers, forming the centralized index table of the field.
For different fields, the data query server establishes a separate centralized index table per field. For example, for field 1, the centralized index table of the data query server stores the query words contained on each data collector together with the corresponding collector identifiers, as shown in Table 4 below:
Table 4 (centralized index table of field 1; the table body is an image in the original, Figure imgf000012_0001)
As another example, for field 2, the centralized index table established by the data query server is shown in Table 5 below:
Table 5 (centralized index table of field 2; the table body is an image in the original, Figure imgf000012_0002)
The data query server holds one centralized index table per field. The reporting index tables reported by the data collectors are aggregated into the centralized index tables of the data query server; that is, the centralized index table of a field on the data query server stores the correspondence between the identifier of a data collector and the query words of that field in the reporting index table reported by that data collector.
After the index tables are established, when a query request is received, the data to be queried can be found through the index tables.
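Steps S105 and S106 can be sketched as follows for a single field. The sketch assumes that the reporting index tables arrive as a mapping from collector identifier to a list of query words. The function name `build_centralized_index`, the collector identifiers, and the sample reports are hypothetical. Only the entry for aaa (collector 1 and collector 3) follows the example in the text.

```python
from collections import defaultdict

def build_centralized_index(reports):
    """Merge per-collector reporting index tables for one field.

    reports maps collector id -> list of query words for the field.
    Returns {query_word: set of collector ids}.
    """
    index = defaultdict(set)
    for collector_id, query_words in reports.items():
        for query_word in query_words:
            index[query_word].add(collector_id)
    return dict(index)

# Hypothetical reporting index tables for field 1.
field1_reports = {
    "collector1": ["aaa", "bbb"],
    "collector2": ["bbb"],
    "collector3": ["aaa"],
}
field1_index = build_centralized_index(field1_reports)
print(sorted(field1_index["aaa"]))  # ['collector1', 'collector3']
```

The centralized index stores only collector identifiers, never storage locations, so the data query server narrows a query to candidate collectors and leaves the record lookup to each collector's local index.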
FIG. 3 is a flowchart of the data query method provided in this embodiment. As shown in FIG. 3, the data query method of the present invention includes the following steps.
Step S201: The data query server receives an entered query request.
A user enters a query request by means such as form-based retrieval or expression-based retrieval. In form-based retrieval, the user interface presents fixed fields; the user enters query words in the input boxes of one or more fields and then submits the query request to the data query server with a submit button. In expression-based retrieval, the user directly enters the field to be queried and the query word of that field, and submits them to the data query server.
The query request received by the data query server carries the field to be queried and the query word of the field.
When the query request contains multiple fields, the query request received by the data query server also carries the logical relationship between the query words of the fields. For example, the user enters query words in different search fields, and the data query server receives the query words of the different search fields together with the logical relationships between them.
For example, the entered query request is: (field 1=aaa) AND (field 2=ccc). This query request includes two fields, field 1 and field 2, whose query words are aaa and ccc, respectively. AND indicates that the logical relationship between the two query words is "and".
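One way to turn such an expression into fields, query words, and a logical relationship is sketched below. The source does not define a formal grammar, so the clause syntax `(<field>=<query word>)`, the restriction to a single AND or OR operator, and the name `parse_query_request` are all assumptions made for illustration.

```python
import re

# Hypothetical clause syntax: "(<field>=<query word>)".
CLAUSE = re.compile(r"\(\s*([^=()]+?)\s*=\s*([^()]+?)\s*\)")
OPERATOR = re.compile(r"\)\s*(AND|OR)\s*\(")

def parse_query_request(expression):
    """Return (operator, [(field, query_word), ...]) for a simple expression."""
    conditions = CLAUSE.findall(expression)
    operators = set(OPERATOR.findall(expression))
    if not conditions:
        raise ValueError("no (field=query word) clause found")
    if len(operators) > 1:
        raise ValueError("mixed AND/OR is outside the scope of this sketch")
    operator = operators.pop() if operators else None
    return operator, conditions

print(parse_query_request("(field 1=aaa) AND (field 2=ccc)"))
# ('AND', [('field 1', 'aaa'), ('field 2', 'ccc')])
```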
Step S202: The data query server queries the centralized index table corresponding to the field to obtain the collector identifiers corresponding to the query word.
When the query request includes field 1, the centralized index table of field 1 is searched for the entry that matches the query word, and the corresponding collector identifiers are obtained. This determines the data collectors on which the query word is present.
For example, if the query request is field 1=aaa, querying the centralized index table shown in Table 4 gives collector 1 and collector 3 as the collector identifiers corresponding to aaa.
When the query request includes multiple fields, this step specifically includes:
Step S2021: Obtain the query word of each field in the query request, and record the logical relationship, carried in the query request, between the query words of the fields.
Step S2022: Query the centralized index table corresponding to each field to obtain the collector identifiers corresponding to the query word of each field.
Step S2023: According to the logical relationship between the query words of the fields, filter the collector identifiers obtained in step S2022 to obtain the collector identifiers that satisfy the logical relationship.
For example, the query request is: (field 1=aaa) AND (field 2=ccc). Querying the centralized index table shown in Table 4 gives collector 1 and collector 3 as the collector identifiers corresponding to aaa, and querying the centralized index table shown in Table 5 gives collector 1 as the collector identifier corresponding to ccc. Because the logical relationship between aaa and ccc is "and", filtering leaves collector 1 as the only collector identifier that satisfies the logical relationship.
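Steps S2021 to S2023 amount to set operations on collector identifiers, which the following sketch illustrates. It assumes that each centralized index table maps a query word to a set of collector identifiers and that a single AND or OR relationship joins all conditions. The function name `select_collectors` and the field keys are hypothetical. The sample entries reproduce the values stated in the example above.

```python
def select_collectors(conditions, operator, centralized_indexes):
    """Return the collector ids that satisfy the logical relationship.

    conditions is a list of (field, query_word) pairs;
    operator is "AND" or "OR";
    centralized_indexes maps field -> {query_word: set of collector ids}.
    """
    candidate_sets = [
        set(centralized_indexes.get(field, {}).get(word, ()))
        for field, word in conditions
    ]
    if not candidate_sets:
        return set()
    if operator == "AND":
        return set.intersection(*candidate_sets)
    if operator == "OR":
        return set.union(*candidate_sets)
    raise ValueError(f"unsupported operator: {operator}")

centralized_indexes = {
    "field1": {"aaa": {"collector1", "collector3"}},
    "field2": {"ccc": {"collector1"}},
}
conditions = [("field1", "aaa"), ("field2", "ccc")]
print(select_collectors(conditions, "AND", centralized_indexes))  # {'collector1'}
```

With "OR" the same call returns both collector 1 and collector 3, so the query command would be sent to both.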
Step S203: The data query server generates, according to the query request, a query command that carries the field and the query word, and sends the query command to the data collectors corresponding to the collector identifiers.
The query command generated by the data query server may be identical to the entered query request, or may include only the query words that the destination data collector contains. For example, the query request is: (field 1=aaa) OR (field 2=ccc). Querying the centralized index table shown in Table 4 gives collector 1 and collector 3 as the collector identifiers corresponding to aaa, and querying the centralized index table shown in Table 5 gives collector 1 as the collector identifier corresponding to ccc. The logical relationship between aaa and ccc is "or". The query command sent to collector 1 is (field 1=aaa) OR (field 2=ccc), and the query command sent to collector 3 is: field 2=ccc.
The data collector queries its local index table to obtain the data that matches the query word. The specific process is described in detail later with reference to FIG. 4.
Step S204: The data query server receives the data returned by the data collectors, forms the query result of the query request from the received data, and outputs the query result.
The data query server aggregates the received data. The result may be, but is not limited to being, output as a table.
For example, for the query request (field 1=aaa) AND (field 2=ccc), the query result that is finally output is shown in Table 6 below:
Table 6 (query result; the table body is an image in the original, Figure imgf000014_0001)
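The aggregation in step S204 can be sketched as follows. The sketch assumes that each data collector returns a list of (storage location, record) pairs and that the data query server tags each row with the collector it came from. The function name `merge_results`, the column names, and the sample rows are hypothetical and do not reproduce Table 6.

```python
def merge_results(results_by_collector):
    """Flatten per-collector results into one list of output rows."""
    merged = []
    for collector_id in sorted(results_by_collector):
        for location, record in results_by_collector[collector_id]:
            merged.append({"collector": collector_id, "location": location, **record})
    return merged

# Hypothetical data returned by one collector.
results = {
    "collector1": [(2, {"field1": "aaa", "field2": "ccc"})],
}
rows = merge_results(results)
for row in rows:
    print(row)
```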
FIG. 4 is a flowchart of another data query method provided in this embodiment. As shown in FIG. 4, the data query method of the present invention includes the following steps.
Step S301: The data collector receives the query command sent by the data query server.
The query command includes the field to be queried and the query word of the field, as carried in the query request received by the data query server. Optionally, the query command may also include the query words of multiple fields and the logical relationship between the query words.
Step S302: The data collector queries the local index table corresponding to the field to obtain the storage locations of the data that matches the query command.
When the query command contains the query words of multiple fields, this step specifically includes:
Step S3021: The data collector obtains the query word of each field in the query command received in step S301, and records the logical relationship, carried in the query command, between the query words of the fields.
Step S3022: The data collector queries the local index table corresponding to each field to obtain the storage locations of the data that matches the query word of each field.
For example, if the query command is (field 1=aaa) OR (field 2=ccc), the storage location found in the local index table of field 1 for the matching data is 2, and the storage location found in the local index table of field 2 for the matching data is also 2.
Step S3023: According to the logical relationship between the query words of the fields, the data collector filters the storage locations obtained by the query to obtain the storage locations of the data that satisfies the logical relationship.
That is, the data collector filters the storage locations of the matched data according to the logical relationship between the query words, and obtains the storage locations of the data that satisfies the logical relationship.
Step S303: The data collector obtains the data according to the storage locations of the data and sends the data to the data query server.
The data collector obtains the corresponding data according to the storage locations obtained in step S302. An example of data obtained in this way that satisfies the query command is given as an image in the original (Figure imgf000015_0001).
The data collector sends the data to the data query server, so that the data query server can aggregate the data and output the query result of the query request.
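Steps S302 and S303 on the collector side can be sketched together as follows. The sketch assumes the same hypothetical structures as a dictionary-based local index (field to query word to list of storage locations) and a log table keyed by storage location. The function name `run_query_command` and the sample data are illustrative only.

```python
def run_query_command(conditions, operator, local_indexes, log_table):
    """Resolve a query command to records using the local index tables.

    conditions is a list of (field, query_word) pairs;
    operator is "AND" or "OR" (ignored when there is one condition).
    Returns a list of (storage location, record) pairs.
    """
    location_sets = [
        set(local_indexes.get(field, {}).get(word, ()))
        for field, word in conditions
    ]
    if not location_sets:
        return []
    if len(location_sets) == 1 or operator == "AND":
        locations = set.intersection(*location_sets)
    elif operator == "OR":
        locations = set.union(*location_sets)
    else:
        raise ValueError(f"unsupported operator: {operator}")
    # Step S303: fetch the records at the filtered storage locations.
    return [(loc, log_table[loc]) for loc in sorted(locations)]

# Hypothetical collector contents.
log_table = {
    1: {"field1": "aaa", "field2": "ddd"},
    2: {"field1": "aaa", "field2": "ccc"},
    3: {"field1": "bbb", "field2": "ddd"},
}
local_indexes = {
    "field1": {"aaa": [1, 2], "bbb": [3]},
    "field2": {"ddd": [1, 3], "ccc": [2]},
}
command = [("field1", "aaa"), ("field2", "ccc")]
and_rows = run_query_command(command, "AND", local_indexes, log_table)
or_rows = run_query_command(command, "OR", local_indexes, log_table)
print(and_rows)  # [(2, {'field1': 'aaa', 'field2': 'ccc'})]
```

Filtering operates on storage locations only, so the log table is read once, for the locations that survive the logical relationship.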
In the data query method provided in this embodiment, a local index table and a centralized index table are established in the data collector and the data query server, respectively. When data is queried, the collector identifiers corresponding to the query word are found through the centralized index table of the corresponding field, and the corresponding data is then obtained from the data collectors identified by those collector identifiers. This effectively reduces the system resource usage of the data query server and the data collectors, so that the data collectors have more resources available for improving collection performance, and it increases the processing speed of data queries.
Embodiment 2
FIG. 5 is a schematic diagram of the data query system provided in this embodiment. As shown in FIG. 5, the data query system of this embodiment of the present invention includes a data query server 10 and data collectors 20. The data collector 20 is responsible for collecting data (including receiving, formatting, and merging), storing the data, and indexing the stored data. The data query server 10 is configured to manage the content stored on the multiple data collectors 20 in a unified manner and serves as the unified entry point for data queries.
FIG. 6 is a schematic diagram of the data query server 10 and the data collector 20 provided in this embodiment. As shown in FIG. 6, the data query server 10 includes a first index unit 100, a first receiving unit 101, a first query unit 102, a first processing unit 103, and a first output unit 104.
The data collector 20 includes a second index unit 200, a second receiving unit 201, a second query unit 202, and a second processing unit 203.
Before log data is queried, the data collector 20 and the data query server 10 must build indexes in advance for the data already stored in the system. This is usually done when the data is stored, so that the system can later query the data according to the established index tables. The data query server 10 uses the first index unit 100 to establish, for each field, the centralized index table of the field. The data collector 20 uses the second index unit 200 to establish the local index tables of the fields.
The local index table stores the index of the log data in the current data collector: given a query condition, it is used to find the specific storage locations of all logs in the local data that satisfy the condition. The centralized index table stores the index from the data to be queried to collector identifiers: given a query condition, it is used to find the data collectors on which the data to be queried may be stored, because it gives the identification information of the data collectors that store that data.
The second index unit 200 includes an obtaining subunit 2001, a second index subunit 2002, a third index subunit 2003, and a sending subunit 2004.
The obtaining subunit 2001 is configured to obtain the data in the current data collector and the storage locations of the data, where the data includes content of at least one field.
Optionally, the data stored in the data collector is the raw log data reported by a log source. After the log source reports the raw log data to the data collector, the data collector also needs to build a local index for the raw log data.
The data collector formats and merges the raw log data, converting it into records of a log table (that is, one row of the log table per record). Each log table may have multiple fields. As shown in Table 1, the log table includes fields such as field 1 and field 2, and the sequence number indicates the storage location of the data.
The second index subunit 2002 is configured to: for each field, use the content of the data in that field as the query word of the data, and establish a mapping between the query word and the storage location, forming the local index table of that field in the current data collector.
In the local index of the data collector 20, a local index table is established separately for each field of the records in the log table. Each index table maps the content of one field in the specified log table to the storage locations, in the log table, of the data that contains that field content. The field content serves as the query word of the corresponding data. The mapping between query words and storage locations may be, but is not limited to being, represented as a table, as shown in Table 2.
The local index table of a field makes it possible to quickly find the specific location of data on the data collector. For example, to find the specific location of some content of field 1, the corresponding location can be looked up directly in the local index table of field 1.
The third index subunit 2003 is configured to extract the query words from the local index table of the field and deduplicate them, forming the report index table of that field for the current data collector.
The data collector extracts the report index table to be reported from the local index table of the field. The report index table no longer contains the original log location corresponding to each index entry; it contains only the content of the field for each piece of data, that is, the query words. Before reporting, the query words are usually also deduplicated so that the reported content of each field contains no duplicates. For example, if the content of field 1 in the table includes two occurrences of aaa and one of bbb, then after deduplication the report index table contains one aaa and one bbb, as shown in Table 3.
Query words extracted from newly added local index entries are compared with the report index table that has already been reported. A query word that already appears there is not added to the report index table, so only the newly added, non-duplicate entries of the report index table are uploaded.
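The deduplication and incremental reporting described above might look like the sketch below. The function names and the sample data are hypothetical, and a Python `set` is assumed as the representation of the report index table.

```python
def build_report_index(local_index):
    """Keep only the distinct query words; storage locations are dropped."""
    return set(local_index)


def new_words_to_report(new_local_entries, already_reported):
    """Return query words not reported before and record them as reported."""
    new_words = set(new_local_entries) - already_reported
    already_reported.update(new_words)
    return new_words


# Local index of field 1: two records contain aaa, one contains bbb.
local_index_field1 = {"aaa": [0, 2], "bbb": [1]}
reported = build_report_index(local_index_field1)
print(sorted(reported))  # ['aaa', 'bbb']

# Later, new log records add another aaa and a first ccc.
delta = new_words_to_report({"aaa": [3], "ccc": [4]}, reported)
print(sorted(delta))  # ['ccc']
```

Only `delta` would need to be uploaded in the second round, since aaa is already known to the data query server.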
The sending subunit 2004 is configured to send the report index table of the field to the first index unit 100 of the data query server 10, so that the data query server can establish the centralized index table of the field.
The first index unit 100 includes a first receiving subunit 1001 and a first index subunit 1002.
The first receiving subunit 1001 is configured to receive the report index table of the field sent by each data collector. The report index table includes the query words of that field for the data held in the data collector that sent it.
The first index subunit 1002 is configured to establish a mapping between the query words and the collector identifiers, forming the centralized index table of the field.
The first index subunit 1002 establishes a separate centralized index table for each field. For example, for field 1, the centralized index table stores the query words contained on each data collector together with the corresponding collector identifiers, as shown in Table 4.
The first index subunit 1002 thus establishes one centralized index table per field, and the report index tables reported by the data collectors are aggregated into the centralized index tables of the data query server.
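As a rough sketch of this aggregation step, the data query server could merge each incoming report index table into a per-field map from query word to collector identifiers. The function name `merge_report`, the integer collector identifiers, and the sample reports are assumptions for illustration only.

```python
from collections import defaultdict


def merge_report(central_index, collector_id, report_index):
    """Merge one collector's report index tables into the centralized index."""
    for field, query_words in report_index.items():
        for word in query_words:
            central_index[field][word].add(collector_id)


# field -> query word -> set of collector identifiers
central_index = defaultdict(lambda: defaultdict(set))
merge_report(central_index, 1, {"field1": {"aaa", "bbb"}, "field2": {"ccc"}})
merge_report(central_index, 2, {"field1": {"bbb"}})
merge_report(central_index, 3, {"field1": {"aaa"}})

print(sorted(central_index["field1"]["aaa"]))  # [1, 3]
```

Because the centralized index stores only collector identifiers and no storage locations, it stays small relative to the local indexes held on the collectors.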
After the first index unit 100 and the second index unit 200 have established the index tables in the data query server 10 and the data collector 20 respectively, the data to be queried can be found through the index tables when a query request is received.
The first receiving unit 101 is configured to receive an input query request.
The user inputs a query request by form-based retrieval, expression-based retrieval, or a similar method. With form-based retrieval, the user interface presents fixed fields; the user can enter query words in the input boxes of several fields and then submit the query request to the data query server with a submit button. With expression-based retrieval, the user directly enters the field to be queried and the query word for that field, and submits them to the data query server.
The query request received by the first receiving unit 101 carries the field to be queried and the query word of that field.
When the query request contains multiple fields, the query request received by the first receiving unit 101 also includes the logical relationship between the query words of the fields. For example, when the user enters query words in different search fields, the first receiving unit 101 receives the query words of those search fields and the logical relationship between them.
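One possible in-memory form of such a query request is a list of (field, query word) pairs plus a logical operator. The sketch below parses a simple expression into that form. The expression syntax, the `QueryRequest` class, and the restriction to a single operator type are assumptions of this example; the source does not define a concrete grammar.

```python
import re
from dataclasses import dataclass


@dataclass
class QueryRequest:
    terms: list  # [(field, query_word), ...]
    logic: str   # "AND" or "OR"; empty string when there is a single term


# One term such as "(field1=aaa)" or "field1=aaa" (assumed syntax).
TERM = re.compile(r"\(?\s*(\w+)\s*=\s*(\w+)\s*\)?")


def parse_expression(expression):
    """Parse 'term OP term ...' where every OP is the same AND/OR operator."""
    parts = re.split(r"\s+(AND|OR)\s+", expression.strip())
    operators = set(parts[1::2])
    if len(operators) > 1:
        raise ValueError("this sketch supports a single operator type only")
    terms = []
    for part in parts[0::2]:
        match = TERM.fullmatch(part)
        if match is None:
            raise ValueError(f"cannot parse term: {part!r}")
        terms.append((match.group(1), match.group(2)))
    return QueryRequest(terms=terms, logic=operators.pop() if operators else "")


request = parse_expression("(field1=aaa) OR (field2=ccc)")
print(request.terms, request.logic)
```

A form-based front end could build the same `QueryRequest` directly from its input boxes without any parsing.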
The first query unit 102 is configured to look up, in the centralized index table of the field, the collector identifier corresponding to the query word carried in the query request received by the first receiving unit 101.
When the query request includes field 1, the first query unit 102 finds the entry matching the query word in the centralized index table of field 1 and obtains the corresponding collector identifier, thereby determining which data collector holds the query word.
When the query request includes multiple fields, the first query unit 102 includes a first parsing subunit, a first query subunit, and a first filtering subunit (not shown).
The first parsing subunit is configured to obtain the query word of each field in the query request and to record the logical relationship between the query words of the fields.
The first query subunit is configured to look up, in the centralized index table corresponding to each field, the collector identifiers corresponding to the query words obtained by the first parsing subunit.
The first filtering subunit is configured to select, from the collector identifiers found by the first query subunit, those that satisfy the logical relationship between the query words recorded by the first parsing subunit.
The first processing unit 103 is configured to generate, according to the query request, a query command carrying the field and the query word, and to send the query command to the data collector 20 corresponding to the collector identifier found by the first query unit 102.
The query command generated by the first processing unit 103 may be identical to the input query request, or it may include only the query words held by the destination data collector. For example, suppose the query request is (field 1 = aaa) OR (field 2 = ccc). The first query unit 102 finds in the centralized index table shown in Table 4 that the collector identifiers corresponding to aaa are collector 1 and collector 3, and finds in the centralized index table shown in Table 5 that the collector identifier corresponding to ccc is collector 1; the logical relationship between aaa and ccc is "OR". The first processing unit 103 then generates the query command (field 1 = aaa) OR (field 2 = ccc) for collector 1, and the query command field 1 = aaa for collector 3, since collector 3 holds only aaa.
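The lookup and filtering of collector identifiers can be sketched with set operations: a union for "OR" and an intersection for "AND". The function name and the dictionary layout are assumptions; the index contents mirror the aaa and ccc example in the text. The sketch only selects the destination collectors and assumes the unmodified query command is sent to each of them.

```python
def select_collectors(central_index, terms, logic):
    """Return the collector identifiers that satisfy the logical relationship."""
    id_sets = [
        set(central_index.get(field, {}).get(word, ()))
        for field, word in terms
    ]
    if not id_sets:
        return set()
    if logic == "AND":
        return set.intersection(*id_sets)
    return set.union(*id_sets)


# Centralized index tables: field -> query word -> collector identifiers.
central_index = {
    "field1": {"aaa": {1, 3}},
    "field2": {"ccc": {1}},
}
terms = [("field1", "aaa"), ("field2", "ccc")]

targets_or = select_collectors(central_index, terms, "OR")
targets_and = select_collectors(central_index, terms, "AND")
print(sorted(targets_or))   # [1, 3]
print(sorted(targets_and))  # [1]
```

With "AND", collector 3 is never contacted because it cannot hold a record that matches both terms, which is the saving the centralized index is meant to provide.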
The second receiving unit 201 of the data collector 20 is configured to receive the query command sent by the data query server 10.
The query command includes the field to be queried and the query word of that field, as carried in the query request received by the data query server 10. It may include the query words of multiple fields and the logical relationship between those query words.
The second query unit 202 is configured to look up, in the local index table corresponding to the field, the storage location of the data matching the query word in the query command received by the second receiving unit 201.
Optionally, when the query command contains query words for multiple fields, the second query unit 202 includes a second parsing subunit, a second query subunit, and a second filtering subunit (not shown).
The second parsing subunit is configured to, when the query command received by the second receiving unit 201 carries multiple fields to be queried, obtain the query word of each field in the query command and record the logical relationship between the query words of the fields carried in the query command.
The second query subunit is configured to look up, in the local index table corresponding to each field, the storage locations of the data matching the query words of the fields obtained by the second parsing subunit.
The second filtering subunit is configured to select, from the storage locations found by the second query subunit, the storage locations of the data that satisfy the logical relationship between the query words recorded by the second parsing subunit.
The second processing unit 203 is configured to obtain the data according to the storage locations found by the second query unit 202 and to send the data to the data query server.
The second processing unit 203 sends the data found by the second query unit 202 to the first output unit 104 of the data query server 10, which outputs the query result of the query request.
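On the collector side, the same idea applies to storage locations instead of collector identifiers. The following sketch is illustrative only: the function name, the list-position storage locations, and the sample records are assumptions.

```python
def query_collector(log_table, local_indexes, terms, logic):
    """Look up storage locations per field, combine them, and fetch the data."""
    location_sets = [
        set(local_indexes.get(field, {}).get(word, ()))
        for field, word in terms
    ]
    if not location_sets:
        return []
    if logic == "AND":
        locations = set.intersection(*location_sets)
    else:
        locations = set.union(*location_sets)
    # Fetch the records at the surviving storage locations.
    return [log_table[location] for location in sorted(locations)]


log_table = [
    {"field1": "aaa", "field2": "ccc"},
    {"field1": "bbb", "field2": "ccc"},
    {"field1": "aaa", "field2": "ddd"},
]
# Local index tables: field -> query word -> storage locations.
local_indexes = {
    "field1": {"aaa": [0, 2], "bbb": [1]},
    "field2": {"ccc": [0, 1], "ddd": [2]},
}
terms = [("field1", "aaa"), ("field2", "ccc")]

hits_and = query_collector(log_table, local_indexes, terms, "AND")
hits_or = query_collector(log_table, local_indexes, terms, "OR")
print(hits_and)  # [{'field1': 'aaa', 'field2': 'ccc'}]
```

Filtering on locations before reading any records means the collector only touches the log entries that will actually be returned.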
The first output unit 104 is configured to receive the data returned by the second processing unit 203 of the data collector 20, form the query result of the query request from the received data, and output it.
The data query server, data collector, and system provided by the embodiments of the present invention use the first index unit to establish centralized index tables in the data query server and the second index unit to establish local index tables in the data collector, which can increase the processing speed of data queries. When the data to be queried exists on only a few data collectors, the present invention does not need to query every data collector, which reduces the load on the data collectors. For a log system that is queried frequently, the solution provided by the embodiments of the present invention can effectively reduce the system resources consumed on the data collectors, leaving more resources for collection performance and thereby improving the overall processing capacity of the system.
Embodiment 3
FIG. 7 is a schematic diagram of the data query server 10 provided in this embodiment. As shown in FIG. 7, the data query server 10 includes a network interface 71, a processor 72, and a memory 73. A system bus 74 connects the network interface 71, the processor 72, and the memory 73.
The network interface 71 is used to communicate with the data collector 20.
The memory 73 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers. The software modules include the functional modules capable of carrying out the method of the present invention described above; the device drivers may be network and interface drivers.
At startup, these software components are loaded into the memory 73 and then accessed by the processor 72, which executes the following instructions:
receiving an input query request, the query request carrying the field to be queried and the query word in that field;
looking up, in the centralized index table corresponding to the field, the collector identifier corresponding to the query word, the centralized index table storing the correspondence between the query words in the field and collector identifiers;
generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector finds, through its local index table corresponding to the field carried in the query command, the data matching the query word carried in the query command;
receiving the data returned by the data collector, forming the query result of the query request from the received data, and outputting it.
The data query server of this embodiment finds the collector identifier corresponding to the query word through the centralized index table of the field and then obtains the corresponding data from the data collector with that identifier, which can effectively reduce the system resources consumed on the data query server and increase the processing speed of data queries.
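The instruction sequence above can be put together in a small end-to-end sketch for a single-field query. Everything here is an assumption made for illustration: the class names, the in-process method calls that stand in for network communication, and the `contacted` list, which exists only to show which collectors were queried.

```python
class DataCollector:
    def __init__(self, collector_id, records):
        self.collector_id = collector_id
        self.records = records
        self.local_index = {}  # field -> query word -> storage locations
        for location, record in enumerate(records):
            for field, word in record.items():
                self.local_index.setdefault(field, {}).setdefault(word, []).append(location)

    def report_index(self):
        """Distinct query words per field, without storage locations."""
        return {field: set(words) for field, words in self.local_index.items()}

    def handle_command(self, field, word):
        """Resolve the query word through the local index and fetch the data."""
        locations = self.local_index.get(field, {}).get(word, [])
        return [self.records[location] for location in locations]


class DataQueryServer:
    def __init__(self, collectors):
        self.collectors = {c.collector_id: c for c in collectors}
        self.central_index = {}  # field -> query word -> collector identifiers
        self.contacted = []      # collectors queried so far (for demonstration)
        for collector in collectors:
            for field, words in collector.report_index().items():
                for word in words:
                    ids = self.central_index.setdefault(field, {}).setdefault(word, set())
                    ids.add(collector.collector_id)

    def query(self, field, word):
        """Send the command only to collectors listed in the centralized index."""
        results = []
        for collector_id in sorted(self.central_index.get(field, {}).get(word, ())):
            self.contacted.append(collector_id)
            results.extend(self.collectors[collector_id].handle_command(field, word))
        return results


collectors = [
    DataCollector(1, [{"field1": "aaa", "field2": "ccc"}]),
    DataCollector(2, [{"field1": "bbb", "field2": "ddd"}]),
    DataCollector(3, [{"field1": "aaa", "field2": "eee"}]),
]
server = DataQueryServer(collectors)
results = server.query("field1", "aaa")
print(server.contacted)  # [1, 3]
```

Collector 2 is never contacted for this query, which illustrates how the centralized index spares collectors that hold no matching data.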
Further, after accessing the software components in the memory 73, the processor executes instructions for the following process:
establishing, for the field, the centralized index table corresponding to the field;
wherein establishing the centralized index table corresponding to the field includes:
receiving the report index table of the field sent by each data collector, the report index table including the query words of that field for the data in the data collector that sent it;
establishing a mapping between the query words and the collector identifiers, forming the centralized index table of the field.
The above process is how the data query server establishes the centralized index table: by mapping query words to collector identifiers, the server can, at query time, find the collector identifier corresponding to a query word and obtain the matching data from the corresponding data collector.
Further, after accessing the software components in the memory 73, the processor executes instructions for the following process:
when the query request carries at least two fields to be queried, obtaining the query word of each field in the query request and recording the logical relationship between the query words of the fields;
looking up, in the centralized index tables corresponding to the fields, the collector identifiers corresponding to the query words of the fields;
selecting, from the collector identifiers found, those that satisfy the logical relationship between the query words of the fields.
The above process is how the data query server finds the collector identifiers for the query words of multiple fields, which avoids accessing data collectors that cannot fully satisfy the query request.
FIG. 8 is a schematic diagram of the data collector 20 provided in this embodiment. As shown in FIG. 8, the data collector 20 includes a network interface 81, a processor 82, and a memory 83. A system bus 84 connects the network interface 81, the processor 82, and the memory 83.
The network interface 81 is used to communicate with the data query server 10.
The memory 83 may be persistent storage, such as a hard disk drive or flash memory, and holds software modules and device drivers. The software modules include the functional modules capable of carrying out the method of the present invention described above; the device drivers may be network and interface drivers.
At startup, these software components are loaded into the memory 83 and then accessed by the processor 82, which executes the following instructions:
receiving a query command sent by the data query server, the query command including the field to be queried and the query word in that field, as carried in the query request received by the data query server;
looking up, in the local index table corresponding to the field, the storage location of the data matching the query command, the local index table storing the correspondence between the query words in the field and the storage locations of the data;
obtaining the data according to its storage location and sending it to the data query server.
The data collector of this embodiment finds the data corresponding to the query word through the local index table of the field and provides it to the data query server. This can effectively reduce the system resources consumed on the data collector, leaving more resources for collection performance, and can increase the processing speed of data queries.
Further, after accessing the software components in the memory 83, the processor executes instructions for the following process:
establishing, for the field, the local index table corresponding to the field;
wherein establishing the local index table corresponding to the field includes:
obtaining the data in the current data collector and the storage location of the data, the data including the content of at least one field;
for each field, taking the content of the data in that field as the query word of the data and establishing a mapping between the query word and the storage location, forming the local index table of that field on the current data collector.
The above process is how the data collector establishes the local index table: by mapping query words to the storage locations of data, the data can be retrieved from the storage location corresponding to a query word.
Further, after accessing the software components in the memory 83, the processor executes instructions for the following process:
extracting the query words from the local index table of the field and deduplicating them, forming the report index table of the field for the current data collector;
sending the report index table of the field to the data query server, so that the data query server can establish the centralized index table corresponding to the field.
The above process is how the data collector builds the report index table from the local index table and sends it to the data query server, so that the data query server can establish the centralized index table.
Further, after accessing the software components in the memory 83, the processor executes instructions for the following process:
when the query command carries at least two fields to be queried, obtaining the query word of each field in the query command and recording the logical relationship between the query words of the fields;
looking up, in the local index tables corresponding to the fields, the storage locations of the data matching the query words of the fields;
selecting, from the storage locations found, the storage locations of the data that satisfy the logical relationship between the query words of the fields.
The above process is how the data collector finds the storage locations for the query words of multiple fields, which avoids retrieving data that does not fully satisfy the query command.
Those skilled in the art should further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above generally in terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of both. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A data query method, characterized in that the method comprises:
receiving an input query request, the query request carrying a field to be queried and a query word in the field;
looking up, in a centralized index table corresponding to the field, a collector identifier corresponding to the query word, the centralized index table storing a correspondence between query words in the field and collector identifiers;
generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector looks up, in the local index table of the data collector corresponding to the field carried in the query command, data matching the query word carried in the query command;
receiving the data returned by the data collector, forming a query result of the query request from the received data, and outputting the query result.
2. The data query method according to claim 1, characterized in that, before the looking up, in the centralized index table corresponding to the field, of the collector identifier corresponding to the query word, the method further comprises:
establishing, for the field, the centralized index table corresponding to the field;
wherein establishing the centralized index table corresponding to the field comprises:
receiving a report index table of the field sent by each data collector, the report index table including the query words of the field for the data in the data collector that sent the report index table;
storing, in the centralized index table of the field, a correspondence between the identifier of each data collector and the query words of the field in the report index table reported by that data collector.
3. The data query method according to claim 1, characterized in that the looking up, in the centralized index table corresponding to the field, of the collector identifier corresponding to the query word comprises:
if the query request carries at least two fields to be queried, obtaining the query word of each field in the query request and recording the logical relationship between the query words of the fields carried in the query request;
looking up, in the centralized index tables corresponding to the fields, the collector identifiers corresponding to the query words of the fields;
selecting, from the collector identifiers found, the collector identifiers that satisfy the logical relationship between the query words of the fields.
4. A data query method, characterized in that the method comprises:
receiving a query command sent by a data query server, the query command carrying a field to be queried and a query word in the field;
looking up, in a local index table corresponding to the field, the storage location of data matching the query word in the query command, the local index table storing a correspondence between query words in the field and storage locations of the data;
obtaining the data according to the storage location of the data and sending the data to the data query server.
5. The data query method according to claim 4, characterized in that, before the looking up, in the local index table corresponding to the field, of the storage location of the data matching the query word in the query command, the method further comprises:
establishing, for the field, the local index table corresponding to the field;
wherein establishing the local index table corresponding to the field comprises:
obtaining the data in the current data collector and the storage location of the data, the data including the content of at least one field;
for each field, taking the content of the data in that field as the query word of the data, and storing, in the local index table of the field in the current data collector, a correspondence between the query word of the data and the storage location of the data.
6. The data query method according to claim 5, characterized in that, after the storing, in the local index table of the field in the current data collector, of the correspondence between the query word of the data and the storage location of the data, the method further comprises:
extracting the query words from the local index table of the field and deduplicating the query words, forming a report index table of the field for the current data collector;
sending the report index table of the field to the data query server, so that the data query server establishes the centralized index table corresponding to the field.
7. The data query method according to claim 4, characterized in that the looking up, in the local index table corresponding to the field, of the storage location of the data matching the query word in the query command comprises:
if the query command carries at least two fields to be queried, obtaining the query word of each field in the query command and recording the logical relationship between the query words of the fields carried in the query command;
looking up, in the local index tables corresponding to the fields, the storage locations of the data matching the query words of the fields in the query command;
selecting, from the storage locations found, the storage locations of the data that satisfy the logical relationship between the query words of the fields.
8. A data query server, wherein the data query server comprises:
a first receiving unit, configured to receive an input query request, wherein the query request carries a field to be queried and a query word in the field;
a first query unit, configured to query, from a centralized index table corresponding to the field, the collector identifier corresponding to the query word carried in the query request received by the first receiving unit, wherein the centralized index table stores the correspondence between query words in the field and collector identifiers;
a first processing unit, configured to generate, according to the query request, a query command carrying the field and the query word, and to send the query command to the data collector corresponding to the collector identifier obtained by the first query unit, so that the data collector queries, in the local index table of the data collector corresponding to the field carried in the query command, the data matching the query word carried in the query command; and
a first output unit, configured to receive the data returned by the data collector, form a query result of the query request from the received data, and output the query result.
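The cooperation of the four units of claim 8 can be sketched, under stated assumptions, as a single routing function. Data collectors are modelled here as plain Python callables and the query command as a dictionary; `handle_query_request`, `make_collector`, and the collector identifiers `c1` and `c2` are hypothetical names introduced for this sketch, and no transport or wire format is implied.

```python
from typing import Callable, Dict, List, Set

Command = Dict[str, str]
Collector = Callable[[Command], List[str]]


def handle_query_request(
    central_indexes: Dict[str, Dict[str, Set[str]]],
    collectors: Dict[str, Collector],
    field: str,
    word: str,
) -> List[str]:
    """Route a single-field query only to the collectors that hold the query word."""
    # First query unit: centralized index table -> collector identifiers.
    collector_ids = central_indexes.get(field, {}).get(word, set())
    # First processing unit: build the query command and send it to each collector.
    command = {"field": field, "word": word}
    result: List[str] = []
    for collector_id in sorted(collector_ids):
        # First output unit: merge the data returned by every contacted collector.
        result.extend(collectors[collector_id](command))
    return result


contacted: List[str] = []


def make_collector(collector_id: str, data: Dict[str, List[str]]) -> Collector:
    """Stand-in for a remote data collector; records which collectors were contacted."""
    def collector(command: Command) -> List[str]:
        contacted.append(collector_id)
        return data.get(command["word"], [])
    return collector


central_indexes = {"user": {"alice": {"c1"}, "bob": {"c1", "c2"}}}
collectors = {
    "c1": make_collector("c1", {"alice": ["alice login"], "bob": ["bob login"]}),
    "c2": make_collector("c2", {"bob": ["bob logout"]}),
}
result = handle_query_request(central_indexes, collectors, "user", "alice")
```

In this example only collector `c1` receives the query command for `alice`, which illustrates the purpose of the centralized index table: collectors that cannot hold matching data are not queried at all.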
9. The data query server according to claim 8, wherein the data query server further comprises:
a first index unit, configured to establish, for the field, the centralized index table corresponding to the field;
wherein the first index unit comprises:
a first receiving subunit, configured to receive the reporting index table of the field sent by each data collector, wherein the reporting index table includes the query words, corresponding to the field, of the data in the data collector that sent the reporting index table; and
a first index subunit, configured to store, in the centralized index table of the field, the correspondence between the identifier of each data collector and the query words of the field in the reporting index table reported by that data collector.
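As an illustrative sketch of claim 9, the centralized index table of one field can be built by inverting the reporting index tables received from the collectors. The function name `build_central_index` and the dictionary layout are assumptions of this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_central_index(reports: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Invert {collector id -> reported query words} into {query word -> collector ids}."""
    central: Dict[str, Set[str]] = defaultdict(set)
    for collector_id, words in reports.items():
        for word in words:
            central[word].add(collector_id)
    # Return a plain dict so that looking up an unknown word does not create an entry.
    return dict(central)


# Hypothetical reporting index tables of a "user" field from two collectors.
reports = {"c1": ["alice", "bob"], "c2": ["bob"]}
central_index = build_central_index(reports)
```

Each entry of the resulting table answers the question the first query unit asks: which collectors reported a given query word for this field.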
10. The data query server according to claim 8, wherein the first query unit comprises:
a first parsing subunit, configured to, if the query request received by the first receiving unit carries at least two fields to be queried, obtain the query word of each field in the query request, and record the logical relationship between the query words of the fields carried in the query request;
a first query subunit, configured to query, from the centralized index tables corresponding to the fields, the collector identifiers corresponding to the query words of the fields obtained by the first parsing subunit; and
a first filtering subunit, configured to filter, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, the collector identifiers obtained by the first query subunit to obtain the collector identifiers that satisfy the logical relationship.
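The filtering of collector identifiers in claim 10 can be sketched with a nested expression, assuming for illustration that the recorded logical relationship is a tree of "AND" and "OR" nodes over (field, query word) terms. The tuple encoding, the tag "TERM", and the name `select_collectors` are hypothetical; the claims only require that some logical relationship be recorded and applied.

```python
from typing import Dict, Set

CentralIndexes = Dict[str, Dict[str, Set[str]]]


def select_collectors(central_indexes: CentralIndexes, expr: tuple) -> Set[str]:
    """Evaluate a nested ("AND" | "OR" | "TERM", ...) expression to collector ids."""
    op = expr[0]
    if op == "TERM":
        _, field, word = expr
        return set(central_indexes.get(field, {}).get(word, set()))
    operands = [select_collectors(central_indexes, sub) for sub in expr[1:]]
    if not operands:
        raise ValueError(f"{op} needs at least one operand")
    if op == "AND":
        return set.intersection(*operands)
    if op == "OR":
        return set.union(*operands)
    raise ValueError(f"unsupported logical relationship: {op!r}")


central_indexes = {
    "user": {"alice": {"c1", "c2"}, "bob": {"c3"}},
    "action": {"login": {"c2", "c3"}},
}
expr = (
    "AND",
    ("OR", ("TERM", "user", "alice"), ("TERM", "user", "bob")),
    ("TERM", "action", "login"),
)
selected = select_collectors(central_indexes, expr)
```

Note that for an "AND" relationship this server-side step only prunes collectors: a selected collector holds both query words, but not necessarily in the same record, so the collector still applies the relationship to its own storage locations.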
11. A data collector, comprising:
a second receiving unit, configured to receive a query command sent by a data query server, wherein the query command carries a field to be queried and a query word in the field;
a second query unit, configured to query, from a local index table corresponding to the field, the storage location of the data matching the query word in the query command received by the second receiving unit, wherein the local index table stores the correspondence between query words in the field and storage locations of the data; and
a second processing unit, configured to obtain the data according to the storage location of the data obtained by the second query unit, and send the data to the data query server.
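The collector side of claim 11 can be sketched as follows, assuming for illustration that a storage location is an (offset, length) pair into a byte buffer holding the collected logs. The name `handle_query_command`, the buffer contents, and the location format are assumptions of this sketch.

```python
from typing import Dict, List, Tuple

Location = Tuple[int, int]  # (byte offset, byte length); an assumed location format


def handle_query_command(
    local_indexes: Dict[str, Dict[str, List[Location]]],
    storage: bytes,
    command: Dict[str, str],
) -> List[str]:
    """Resolve the query word to storage locations, then read the data at each one."""
    # Second query unit: local index table -> storage locations.
    locations = local_indexes.get(command["field"], {}).get(command["word"], [])
    # Second processing unit: fetch the data stored at each location.
    return [
        storage[offset:offset + length].decode("utf-8") for offset, length in locations
    ]


storage = b"alice login\nbob logout\n"
local_indexes = {"user": {"alice": [(0, 11)], "bob": [(12, 10)]}}
data = handle_query_command(local_indexes, storage, {"field": "user", "word": "bob"})
```

The index lookup yields locations rather than records, so the collector reads only the bytes that matched instead of scanning the whole store.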
12. The data collector according to claim 11, wherein the data collector further comprises:
a second index unit, configured to establish, for the field, the local index table corresponding to the field;
wherein the second index unit comprises:
an obtaining subunit, configured to obtain data in the current data collector and the storage location of the data, wherein the data includes the content of at least one field; and
a second index subunit, configured to, for each field obtained by the obtaining subunit, use the content of the data in that field as the query word of the data, and store, in the local index table of the field in the data collector, the correspondence between the query word of the data and the storage location of the data.
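A minimal sketch of the local index construction in claim 12 follows. It assumes that each collected record has already been parsed into a dictionary of field contents and is paired with an integer storage location; `build_local_indexes` and the sample records are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def build_local_indexes(
    records: Iterable[Tuple[int, Dict[str, str]]],
) -> Dict[str, Dict[str, List[int]]]:
    """Build one local index table per field: query word -> storage locations."""
    indexes: Dict[str, Dict[str, List[int]]] = {}
    for location, fields in records:
        for field, content in fields.items():
            # The content of the field is used directly as the query word.
            indexes.setdefault(field, {}).setdefault(content, []).append(location)
    return indexes


# Hypothetical parsed log records: (storage location, {field: content}).
records = [
    (0, {"user": "alice", "action": "login"}),
    (12, {"user": "bob", "action": "login"}),
]
local_indexes = build_local_indexes(records)
```

Keeping a separate table per field means that a query on one field consults only that field's table, which matches the per-field lookups described in the claims.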
13. The data collector according to claim 12, wherein the second index unit further comprises:
a third index subunit, configured to extract the query words from the local index table of the field obtained by the second index subunit, and deduplicate the query words to form a reporting index table of the field for the data collector; and
a sending subunit, configured to send the reporting index table of the field formed by the third index subunit to the data query server, so that the data query server establishes the centralized index table of the field.
14. The data collector according to claim 11, wherein the second query unit comprises:
a second parsing subunit, configured to, if the query command received by the second receiving unit carries at least two fields to be queried, obtain the query word of each field in the query command, and record the logical relationship between the query words of the fields carried in the query command;
a second query subunit, configured to query, from the local index tables corresponding to the fields, the storage locations of the data matching the query words of the fields in the query command obtained by the second parsing subunit; and
a second filtering subunit, configured to filter, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, the storage locations of the data obtained by the second query subunit to obtain the storage locations of the data that satisfy the logical relationship.
15. A data query system, wherein the system comprises:
the data query server according to any one of claims 8 to 10 and the data collector according to any one of claims 11 to 14.
PCT/CN2013/082130 2012-12-24 2013-08-23 Data query method and system WO2014101445A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210566137.4A CN103064933B (en) 2012-12-24 2012-12-24 Data query method and system
CN201210566137.4 2012-12-24

Publications (1)

Publication Number Publication Date
WO2014101445A1 (en) 2014-07-03

Family

ID=48107563

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/082130 WO2014101445A1 (en) 2012-12-24 2013-08-23 Data query method and system

Country Status (2)

Country Link
CN (1) CN103064933B (en)
WO (1) WO2014101445A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023220973A1 (en) * 2022-05-18 2023-11-23 京东方科技集团股份有限公司 Data processing method and apparatus, and electronic device and computer-readable storage medium

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103064933B (en) * 2012-12-24 2016-06-29 华为技术有限公司 Data query method and system
CN105099735B (en) * 2014-05-07 2018-05-22 中国移动通信集团福建有限公司 A kind of method and system for obtaining magnanimity more detailed logging
CN105302827B (en) * 2014-06-30 2018-11-20 华为技术有限公司 A kind of searching method and equipment of event
CN104216957A (en) * 2014-08-20 2014-12-17 北京奇艺世纪科技有限公司 Query system and query method for video metadata
CN104317924A (en) * 2014-10-30 2015-01-28 中国银行股份有限公司 Data query method and device in local clearings
CN105871951A (en) * 2015-01-21 2016-08-17 上海可鲁系统软件有限公司 Industrial internet of things distributed business voucher processing method
CN107015990B (en) * 2016-01-27 2020-06-09 阿里巴巴集团控股有限公司 Data searching method and device
CN105930441B (en) * 2016-04-18 2019-04-26 华信咨询设计研究院有限公司 A kind of radio monitoring data query method
CN106354823A (en) * 2016-08-30 2017-01-25 北京旷视科技有限公司 Method, device and system for summarizing face matching system operation data
CN107784050A (en) * 2016-12-14 2018-03-09 平安科技(深圳)有限公司 Log information lookup method and device
CN107066610A (en) * 2017-05-02 2017-08-18 中国联合网络通信集团有限公司 A kind of price queries method and apparatus
CN107577506B (en) * 2017-08-07 2021-03-19 台州市吉吉知识产权运营有限公司 Data preloading method and system
CN109299219B (en) * 2018-08-31 2022-08-12 北京奥星贝斯科技有限公司 Data query method and device, electronic equipment and computer readable storage medium
CN109308305B (en) * 2018-09-30 2021-06-08 广州圣亚科技有限公司 Monitoring data query method and device and computer equipment
CN109299348B (en) * 2018-11-28 2021-09-28 北京字节跳动网络技术有限公司 Data query method and device, electronic equipment and storage medium
CN109885548A (en) * 2019-02-22 2019-06-14 网易(杭州)网络有限公司 Log inquiring method, device, storage medium and electronic device
CN110502915B (en) * 2019-08-30 2021-07-30 恩亿科(北京)数据科技有限公司 Data processing method, device and system
CN110674369A (en) * 2019-09-23 2020-01-10 杭州迪普科技股份有限公司 Data query method and device
CN111062193B (en) * 2019-12-16 2023-04-25 医渡云(北京)技术有限公司 Medical data labeling method and device, storage medium and electronic equipment
CN113486048A (en) * 2021-07-13 2021-10-08 广西电力职业技术学院 Data retrieval system and data retrieval method
CN117271562B (en) * 2023-11-21 2024-01-19 成都凌亚科技有限公司 Data acquisition processing method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1968132A (en) * 2006-10-16 2007-05-23 华为技术有限公司 Method for establishing call log correlation between network entities and searching correlated call log
CN102193917A (en) * 2010-03-01 2011-09-21 中国移动通信集团公司 Method and device for processing and querying data
CN102375853A (en) * 2010-08-24 2012-03-14 中国移动通信集团公司 Distributed database system, method for building index therein and query method
CN103064933A (en) * 2012-12-24 2013-04-24 华为技术有限公司 Data query method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5903888A (en) * 1997-02-28 1999-05-11 Oracle Corporation Method and apparatus for using incompatible types of indexes to process a single query
CN102789487B (en) * 2012-06-29 2015-09-02 用友软件股份有限公司 Data query retrieval process device and data query search processing method

Also Published As

Publication number Publication date
CN103064933B (en) 2016-06-29
CN103064933A (en) 2013-04-24

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 13869336; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 13869336; Country of ref document: EP; Kind code of ref document: A1