WO2014101445A1 - Data query method and system - Google Patents

Data query method and system

Info

Publication number
WO2014101445A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
data
field
index table
collector
Prior art date
Application number
PCT/CN2013/082130
Other languages
English (en)
Chinese (zh)
Inventor
谢永方
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2014101445A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/93: Document management systems

Definitions

  • the present invention relates to the field of computer network technologies, and in particular, to a data query method, a data query server, a data collector, and a data query system.
  • In a massive data collection and query system, the system architecture has two modes: distributed storage and centralized storage. In either mode, the system must store and query massive log data quickly.
  • An existing distributed data query system includes a data query server and multiple data collectors.
  • The data collector is responsible for collecting (receiving, formatting, merging), storing, and indexing logs, and the data query server is the unified entry point for log queries.
  • For each query, the data query server sends a query command to all data collectors and aggregates the query results of all data collectors into the final query result.
  • However, the logs to be queried in one query may exist on only a few data collectors, and query operations are very frequent.
  • This existing solution increases the burden on all data collectors, including consumption of each data collector's central processing unit (CPU) resources.
  • In addition to serving queries, the data collector must also receive data and write it to storage. Frequent query operations therefore also degrade the data collector's collection performance and reduce the overall processing capacity of the system.
  • In another distributed data query system, the original log data is stored centrally, and each data collector is only responsible for collecting (receiving, formatting, merging) and reporting logs.
  • The log content is not saved locally after being processed by the data collector, but is reported to the data query server for storage.
  • After receiving the logs reported by the data collectors, the data query server stores the data in a database and builds an index. When a log query is needed, it is executed directly against the database of the data query server. With this centralized storage, log queries run only in the database of the data query server and do not affect the data collectors.
  • The present invention provides a data query method, a data query server, a data collector, and a data query system, which can speed up data queries, reduce the system resources consumed on the data collector and the load on the data query server, and improve the processing capacity of the entire system.
  • A first aspect of the present invention provides a data query method, where the method includes:
  • obtaining, from the centralized index table corresponding to the field, the collector identifier corresponding to the query word, where the centralized index table stores the correspondence between the query words in the field and the collector identifiers; generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries the local index table corresponding to the field carried in the query command and obtains the data that matches the query word carried in the query command;
  • before the obtaining, from the centralized index table corresponding to the field, of the collector identifier corresponding to the query word, the method further includes: establishing, for the field, a centralized index table corresponding to the field;
  • the establishing a centralized index table corresponding to the field includes:
  • receiving a report index table of the field sent by each data collector, where the report index table includes the query words corresponding to the field in the data collector that sends the report index table;
  • storing, in the centralized index table of the field, the correspondence between the identifier of the data collector and the query words of the field in the report index table reported by that data collector.
  • the obtaining, from the centralized index table corresponding to the field, of the collector identifier corresponding to the query word includes:
  • if the query request carries at least two fields to be queried, obtaining the query words of each field in the query request, and recording the logical relationship between the query words of the fields;
  • The present invention further provides a data query method, where the method includes: receiving a query command sent by a data query server, where the query command includes the field to be queried and the query word in the field that are carried in a query request received by the data query server; querying, from a local index table corresponding to the field, the storage location of the data that matches the query word in the query command, where the local index table stores the correspondence between the query word and the storage location of the data;
  • acquiring the data according to the storage location of the data, and sending the data to the data query server.
  • before the querying, from the local index table corresponding to the field, of the storage location of the data that matches the query word in the query command, the method further includes:
  • using the content of the data in the field as a query word of the data, and establishing a mapping relationship between the query word and the storage location, to form a local index table of the field in the current data collector.
  • after the correspondence between the query word and the storage location of the data is stored in the local index table of the field in the current data collector, the method further includes:
  • the querying, from the local index table corresponding to the field, of the storage location of the data that matches the query word in the query command includes:
  • if the query command carries at least two fields to be queried, obtaining the query words of each field in the query command, and recording the logical relationship between the query words of the fields;
  • filtering, from the storage locations of the data obtained by the query, the storage locations of the data that satisfy the logical relationship.
  • the present invention further provides a data query server, where the data query server includes:
  • a first receiving unit configured to receive an input query request, where the query request carries a field to be queried and a query word in the field;
  • a first query unit configured to obtain, from the centralized index table corresponding to the field, the collector identifier corresponding to the query word carried in the query request received by the first receiving unit, where the centralized index table stores the correspondence between the query words in the field and the collector identifiers;
  • a first processing unit configured to generate, according to the query request, a query command that carries the field and the query word, and send the query command to the data collector corresponding to the collector identifier obtained by the first query unit, so that the data collector queries the local index table corresponding to the field carried in the query command and obtains the data that matches the query word carried in the query command;
  • a first output unit configured to receive the data returned by the data collector, form a query result of the query request according to the received data, and output the result.
  • the data query server further includes:
  • a first index unit configured to establish, according to the field, a centralized index table corresponding to the field; the first index unit includes:
  • a first receiving subunit configured to receive a report index table of the field sent by each data collector, where the report index table includes the query words corresponding to the field in the data collector that sends the report index table;
  • a first index subunit configured to store, in the centralized index table of the field, the correspondence between the identifier of the data collector and the query words of the field in the report index table reported by that data collector.
  • the first query unit includes:
  • a first parsing subunit configured to: if the query request received by the first receiving unit carries at least two fields to be queried, obtain the query words of each field in the query request, and record the logical relationship between the query words of the fields;
  • a first query subunit configured to query, from the centralized index table corresponding to each field, a collector identifier corresponding to the query word of each field acquired by the first parsing subunit;
  • a first filtering subunit configured to filter, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, the collector identifiers obtained by the first query subunit, to obtain the collector identifiers that satisfy the logical relationship.
  • The present invention further provides a data collector, where the data collector includes: a second receiving unit configured to receive a query command sent by a data query server, where the query command carries a field to be queried and the query word in the field;
  • a second query unit configured to query, from a local index table corresponding to the field, the storage location of the data that matches the query word in the query command received by the second receiving unit, where the local index table stores the correspondence between the query words in the field and the storage locations of the data;
  • a second processing unit configured to acquire the data according to the storage location of the data obtained by querying by the second query unit, and send the data to the data query server.
  • the data collector further includes:
  • the second index unit includes:
  • an obtaining subunit configured to acquire the data in the current data collector and the storage location of the data, where the data includes content of at least one field;
  • a second index subunit configured to use, for each field acquired by the obtaining subunit, the content of the data in the field as a query word of the data, and to store, in the local index table of the field in the data collector, the correspondence between the query word of the data and the storage location of the data.
  • the second index unit further includes:
  • a third index subunit configured to extract the query words from the local index table of the field obtained by the second index subunit, and deduplicate the query words to form the report index table of the field for the current data collector;
  • a sending subunit configured to send the report index table of the field formed by the third index subunit to the data query server, so that the data query server establishes a centralized index table of the field.
  • the second query unit includes:
  • a second parsing subunit configured to: if the query command received by the second receiving unit carries at least two fields to be queried, obtain the query words of each field in the query command, and record the logical relationship between the query words of the fields;
  • a second query subunit configured to query, from the local index table corresponding to each field, a storage location of data that matches a query word of each field in the query command acquired by the second parsing subunit;
  • a second filtering subunit configured to filter, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, the storage locations of the data obtained by the second query subunit, to obtain the storage locations of the data that satisfy the logical relationship.
  • the present invention further provides a data query system, where the system includes: the data query server provided by the third aspect, and the data collector provided by the fourth aspect.
  • FIG. 1 is a structural diagram of a data query system according to an embodiment of the present invention
  • FIG. 2 is a signaling diagram of an index establishment process according to Embodiment 1 of the present invention.
  • FIG. 3 is a flowchart of a data query method according to Embodiment 1 of the present invention.
  • FIG. 5 is a schematic diagram of a data query system according to Embodiment 2 of the present invention.
  • FIG. 6 is a schematic diagram of a data query server and a data collector provided by Embodiment 2 of the present invention
  • FIG. 7 is a schematic diagram of a data query server according to Embodiment 3 of the present invention
  • FIG. 8 is a schematic diagram of a data collector provided by Embodiment 3 of the present invention.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
  • FIG. 1 is a structural diagram of a data query system according to an embodiment of the present invention. As shown in FIG. 1, the present invention adopts a distributed architecture, including a data query server 10 and multiple data collectors 20.
  • The data collector 20 is responsible for collecting (including receiving, formatting, merging), storing, and indexing data such as the massive logs reported by the log source 30. The data query server 10 is the unified entry point for data queries.
  • the data query method provided by the present invention can be used for quick query of massive data.
  • the log data is taken as an example for description.
  • Before the log data is queried, the data stored in the system needs to be indexed in advance, usually when the data is stored, so that the system can query the data according to the established index tables.
  • A local index table and a centralized index table are established in the data collector and the data query server, respectively.
  • The local index table stores the index of the log data in the current data collector. Its function is: given a query condition, find the specific storage locations of all logs in the local data that meet the condition.
  • The centralized index table stores, for each field, an index from query words to collector identifiers. Its function is: given a query condition, find which data collectors may store the data to be queried; the centralized index table gives the identification information of those data collectors.
  • Step S101: A data collector acquires the data in the current data collector and the storage location of the data.
  • The data stored in the data collector is the original log data reported by the log source.
  • After the log source reports the original log data to the data collector, the data collector needs to build a local index for the original log data.
  • The data collector formats and merges the original log data, processing it into records in a log table (that is, each piece of data becomes one row of the log table).
  • Each log table may have multiple fields. As shown in Table 1, the log table includes fields such as field 1 and field 2, and the serial number indicates the storage location of the data.
  • Step S102: For each field, the data collector uses the content of the data in the field as a query word of the data and establishes a mapping relationship between the query word and the storage location, forming the local index table of the field in the current data collector.
  • A local index table is established for each field of the records in the log table. Each local index table maps the content of one field in the specified log table to the storage locations, in that log table, of the data containing that content.
  • The content of the field is used as the query word for the corresponding data.
  • The mapping relationship between the query word and the storage location may be expressed in, but is not limited to, the form of a table. Table 2 below shows the local index table of field 1 in the data collector:
  • The specific location of the data on the data collector can be found quickly through the local index table of the corresponding field. For example, to find the specific location of a given content in field 1, look it up in the local index table of field 1.
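Steps S101 and S102 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_local_index`, the dictionary layout, and the sample records are all assumptions made for the example, since the original tables are not reproduced here.

```python
from collections import defaultdict


def build_local_index(log_table, field):
    """Map each query word (the content of `field`) to its storage locations.

    `log_table` is a hypothetical mapping: storage location -> record (a dict).
    """
    index = defaultdict(list)
    for location, record in log_table.items():
        index[record[field]].append(location)
    return dict(index)


# Hypothetical log table: the serial number is used as the storage location.
log_table = {
    1: {"field1": "aaa", "field2": "ccc"},
    2: {"field1": "aaa", "field2": "ddd"},
    3: {"field1": "bbb", "field2": "ccc"},
}

local_index_field1 = build_local_index(log_table, "field1")
print(local_index_field1)
```

One index is built per field, so a lookup for a query word touches only the index of the field being queried.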
  • Step S103: The data collector extracts the query words from the local index table of the field and deduplicates them to form the report index table of the field for the current data collector.
  • The data collector derives the report index table from the local index table of the field. The report index table does not contain the original log location for each index entry; it contains only the content of the field for each piece of data, that is, the query words. Usually, the query words are deduplicated before reporting so that the reported content of the field contains no duplicates. For example, field 1 in the table contains two occurrences of aaa and one of bbb. After deduplication, the report index table contains only one aaa and one bbb, as shown in Table 3 below.
  • For newly added data, the data collector compares the query words extracted from the newly added local index entries with the report index table that has already been reported. Query words that are already present are not added to the report index table again; only the new, non-duplicate query words are reported.
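The deduplication and incremental reporting in step S103 can be sketched with set operations. The function name `build_report_index` and the sample index are illustrative assumptions; only the behaviour (drop storage locations, drop duplicates, report only new query words) comes from the description.

```python
def build_report_index(local_index, already_reported=frozenset()):
    """Return the query words of a field that have not been reported yet.

    The keys of `local_index` are already unique, so taking them as a set
    removes duplicates and drops the storage locations in one step.
    """
    return set(local_index) - set(already_reported)


# Hypothetical local index of field 1: two records hold aaa, one holds bbb.
local_index = {"aaa": [1, 2], "bbb": [3]}
first_report = build_report_index(local_index)

# Later, new data arrives: one more aaa and a previously unseen ccc.
local_index = {"aaa": [1, 2, 4], "bbb": [3], "ccc": [5]}
incremental_report = build_report_index(local_index, already_reported=first_report)
print(sorted(first_report), sorted(incremental_report))
```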
  • Step S104: The data collector sends the report index table of the field to the data query server.
  • The data collector reports to the data query server the report index table shown in Table 3, or the content newly added to that table.
  • The data query server aggregates the report index tables sent by the multiple data collectors and establishes, in the data query server, a centralized index table corresponding to each field. This specifically includes:
  • Step S105: The data query server receives the report index table of the field sent by each data collector.
  • the report index table of the data collector includes a query word corresponding to the field in the data collector.
  • Step S106: Establish a mapping relationship between the query word and the identifier of the collector, and form a centralized index table of the field.
  • the data query server respectively establishes a centralized index table of corresponding fields.
  • the centralized index table of the data query server stores the query words and corresponding collector identifiers contained in each data collector, as shown in Table 4 below:
  • Each field has its own centralized index table in the data query server, and the report index tables reported by the data collectors are aggregated into it. That is, the centralized index table of a field on the data query server stores the correspondence between the identifier of each data collector and the query words of the field in the report index table reported by that data collector.
  • After the index tables are created, when a query request is received, the data to be queried can be found through the index tables.
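Steps S105 and S106 amount to inverting the per-collector report index tables into a map from query word to collector identifiers. The sketch below assumes hypothetical names (`build_centralized_index`, `collector1`, and so on) and sample reports chosen so that aaa maps to collector 1 and collector 3, as in the example given for Table 4; the other entries are invented.

```python
from collections import defaultdict


def build_centralized_index(report_tables):
    """Merge report index tables of one field into a centralized index table.

    `report_tables` is a hypothetical mapping: collector id -> set of query words.
    The result maps each query word to the set of collector ids that hold it.
    """
    centralized = defaultdict(set)
    for collector_id, query_words in report_tables.items():
        for word in query_words:
            centralized[word].add(collector_id)
    return dict(centralized)


# Hypothetical report index tables for field 1.
reports_field1 = {
    "collector1": {"aaa", "bbb"},
    "collector2": {"bbb"},
    "collector3": {"aaa"},
}

centralized_field1 = build_centralized_index(reports_field1)
print(sorted(centralized_field1["aaa"]))
```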
  • FIG. 3 is a flowchart of a data query method according to the embodiment.
  • the data query method of the present invention includes: Step S201: The data query server receives the input query request.
  • The user inputs a query request by means of a form search or an expression search.
  • In the form search mode, fixed fields are given in the user interface; the user can enter query words in the input boxes of multiple fields and then submit the query request to the data query server with the submit button.
  • In the expression search mode, the user directly inputs the field to be queried and the query word of the field, and submits them to the data query server.
  • The query request received by the data query server carries the field to be queried and the query word of the field.
  • When multiple fields are queried, the query request received by the data query server also carries the logical relationship between the query words of the respective fields. For example, the user enters query words in different search fields, and the data query server receives the query words in the various search fields and the logical relationships between them.
  • Step S202: The data query server queries the centralized index table corresponding to the field to obtain the collector identifier corresponding to the query word.
  • The entry matching the query word is looked up in the centralized index table of field 1 to obtain the corresponding collector identifier, which identifies the data collector holding that query word.
  • For example, for the query word aaa, the collector identifiers obtained from the centralized index table shown in Table 4 are collector 1 and collector 3.
  • If the query request carries at least two fields to be queried, this step specifically includes:
  • Step S2021: Acquire the query word of each field in the query request, and record the logical relationship between the query words of the fields carried in the query request.
  • Step S2022: Query, from the centralized index table corresponding to each field, the collector identifier corresponding to the query word of each field.
  • Step S2023: According to the logical relationship between the query words of the fields, filter the collector identifiers obtained in step S2022 to obtain the collector identifiers that satisfy the logical relationship.
  • For example, the collector identifiers corresponding to aaa are collector 1 and collector 3, and the collector identifier corresponding to ccc is collector 1. If the logical relationship between aaa and ccc is "and", the only collector identifier that satisfies the logical relationship is collector 1.
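Steps S2021 to S2023 can be sketched as a set intersection over the collector identifiers returned for each field. The function name `select_collectors` and the index layout are assumptions for illustration. The text only gives an "and" example; the "or" branch (set union) is an added assumption about how another logical relationship might be handled.

```python
def select_collectors(centralized_indexes, terms, relation="and"):
    """Pick the collectors to query for a multi-field request.

    `centralized_indexes`: field -> {query word -> set of collector ids}.
    `terms`: list of (field, query word) pairs from the query request.
    """
    id_sets = [centralized_indexes[field].get(word, set()) for field, word in terms]
    if not id_sets:
        return set()
    if relation == "and":
        return set.intersection(*id_sets)
    if relation == "or":  # assumption: not spelled out in the description
        return set.union(*id_sets)
    raise ValueError(f"unsupported logical relationship: {relation}")


# Hypothetical centralized index tables matching the aaa / ccc example.
centralized_indexes = {
    "field1": {"aaa": {"collector1", "collector3"}},
    "field2": {"ccc": {"collector1"}},
}
terms = [("field1", "aaa"), ("field2", "ccc")]
print(select_collectors(centralized_indexes, terms, "and"))
```

The result is a candidate set: a selected collector holds both query words, but not necessarily in the same record, so the exact match is still decided on the collector against its local index tables.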
  • Step S203: The data query server generates, according to the query request, a query command that carries the field and the query word, and sends the query command to the data collector corresponding to the collector identifier.
  • The data collector obtains the data matching the query word through its local index table; the specific process is described in detail in conjunction with FIG. 4.
  • Step S204: The data query server receives the data returned by the data collector, forms a query result of the query request according to the received data, and outputs the result.
  • the data query server summarizes the received data, which may be, but is not limited to, output in the form of a table.
  • FIG. 4 is a flowchart of still another data query method provided by this embodiment. As shown in FIG. 4, the data query method of the present invention includes:
  • Step S301: The data collector receives the query command sent by the data query server.
  • The query command includes the field to be queried and the query word of the field carried in the query request received by the data query server and, optionally, the query words of multiple fields and the logical relationship between those query words.
  • Step S302: The data collector queries the local index table corresponding to the field to obtain the storage location of the data that matches the query command.
  • Step S3021: The data collector obtains the query words of each field in the query command received in step S301, and records the logical relationship between the query words of the fields carried in the query command.
  • Step S3022: The data collector queries, from the local index table corresponding to each field, the storage location of the data that matches the query word of the respective field.
  • For example, the storage location obtained from the local index table of field 1 is 2, and the storage location obtained from the local index table of field 2 is also 2.
  • Step S3023: The data collector filters, according to the logical relationship between the query words of the fields, the storage locations of the data obtained by the query, to obtain the storage locations of the data that satisfy the logical relationship.
  • the data collector filters the storage location of the matched data according to the logical relationship between the query words to obtain the storage location of the data satisfying the logical relationship.
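The collector-side filtering in steps S3021 to S3023 can be sketched in the same way, this time over storage locations instead of collector identifiers. The function name `match_locations` and the sample local index tables are hypothetical; as on the server side, the "or" branch is an assumption beyond the "and" case discussed in the text.

```python
def match_locations(local_indexes, terms, relation="and"):
    """Return the storage locations that satisfy a multi-field query command.

    `local_indexes`: field -> {query word -> list of storage locations}.
    `terms`: list of (field, query word) pairs from the query command.
    """
    location_sets = [set(local_indexes[field].get(word, [])) for field, word in terms]
    if not location_sets:
        return []
    if relation == "and":
        matched = set.intersection(*location_sets)
    elif relation == "or":  # assumption: not spelled out in the description
        matched = set.union(*location_sets)
    else:
        raise ValueError(f"unsupported logical relationship: {relation}")
    return sorted(matched)


# Hypothetical local index tables on one data collector.
local_indexes = {
    "field1": {"aaa": [1, 2], "bbb": [3]},
    "field2": {"ccc": [2], "ddd": [1, 3]},
}
command_terms = [("field1", "aaa"), ("field2", "ccc")]
print(match_locations(local_indexes, command_terms, "and"))
```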
  • Step S303: The data collector acquires the data according to the storage location of the data, and sends the data to the data query server.
  • the data collector obtains the corresponding data according to the storage location of the data obtained in step S302.
  • the data that can be obtained to satisfy the query command is:
  • the data collector sends the data to the data query server, and the data query server aggregates and outputs the query result of the query request.
  • In this embodiment, the local index table and the centralized index table are established in the data collector and the data query server, respectively. When data is queried, the collector identifier corresponding to the query word is found through the centralized index table of the corresponding field, and the corresponding data is then obtained only from the data collector corresponding to that collector identifier. This effectively reduces the system resources consumed on the data query server and the data collectors, leaves the data collectors more resources for collection, and improves the processing speed of data queries.
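The routing benefit summarized above can be seen in a small end-to-end sketch for a single field. The classes `Collector` and `QueryServer`, their methods, and the sample records are all hypothetical; for brevity the local index is rebuilt on each call, whereas the description builds it once when the data is stored.

```python
class Collector:
    """Hypothetical data collector holding records keyed by storage location."""

    def __init__(self, collector_id, records):
        self.collector_id = collector_id
        self.records = records
        self.queries_served = 0

    def local_index(self, field):
        index = {}
        for location, record in self.records.items():
            index.setdefault(record[field], []).append(location)
        return index

    def report_index(self, field):
        # Query words only, deduplicated, without storage locations.
        return set(self.local_index(field))

    def query(self, field, word):
        self.queries_served += 1
        locations = self.local_index(field).get(word, [])
        return [self.records[location] for location in locations]


class QueryServer:
    """Hypothetical data query server with one centralized index table."""

    def __init__(self, collectors, field):
        self.collectors = {c.collector_id: c for c in collectors}
        self.field = field
        self.centralized = {}
        for collector in collectors:
            for word in collector.report_index(field):
                self.centralized.setdefault(word, set()).add(collector.collector_id)

    def query(self, word):
        results = []
        # Only collectors listed in the centralized index receive the command.
        for collector_id in sorted(self.centralized.get(word, set())):
            results.extend(self.collectors[collector_id].query(self.field, word))
        return results


c1 = Collector("collector1", {1: {"field1": "aaa"}, 2: {"field1": "bbb"}})
c2 = Collector("collector2", {1: {"field1": "bbb"}})
c3 = Collector("collector3", {1: {"field1": "aaa"}})
server = QueryServer([c1, c2, c3], "field1")
results = server.query("aaa")
print(results, c2.queries_served)
```

In this sketch, collector 2 never receives the query for aaa, which is the saving the centralized index table is meant to provide compared with broadcasting every query to all collectors.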
  • FIG. 5 is a schematic diagram of a data query system according to the embodiment.
  • the data query system of the embodiment of the present invention includes: a data query server 10 and a data collector 20.
  • the data collector 20 is responsible for data collection, including receiving, formatting, merging, storing data, and indexing the stored data.
  • the data query server 10 is used for unified management of the contents stored on the plurality of data collectors 20 and serves as a unified entry for data query.
  • FIG. 6 is a schematic diagram of the data query server 10 and the data collector 20 provided in this embodiment.
  • the data query server 10 includes a first index unit 100, a first receiving unit 101, and a first query unit 102.
  • the data collector 20 includes a second index unit 200, a second receiving unit 201, a second query unit 202, and a second processing unit 203.
  • the data collector 20 and the data query server 10 need to pre-index the data stored in the system, which is usually completed during the data storage, and is used by the system to query the data according to the established index table.
  • the data query server 10 establishes a centralized index table of the fields for each of the fields using the first index unit 100.
  • the data collector 20 uses the second index unit 200 to establish a local index table of fields.
  • The local index table stores the index of the log data in the current data collector.
  • Its function is: given a query condition, find the specific storage locations of all logs in the local data that meet the condition.
  • The centralized index table stores the index between the data to be queried and the collector identifiers. Its function is: given a query condition, find which data collectors may store the data to be queried; the centralized index table gives the identification information of those data collectors.
  • the second index unit 200 includes an acquisition subunit 2001, a second index subunit 2002, a third index subunit 2003, and a transmission subunit 2004.
  • the obtaining subunit 2001 is configured to acquire data in the current data collector and a storage location of the data, where the data includes content of at least one field.
  • the data stored in the data collector is the original log data reported by the log source.
  • After the log source reports the original log data to the data collector, the data collector needs to build a local index for the original log data.
  • The data collector formats and merges the original log data, processing it into records in a log table (that is, each piece of data becomes one row of the log table). Each log table may have multiple fields. As shown in Table 1, the log table includes fields such as field 1 and field 2, and the serial number indicates the storage location of the data.
  • The second index subunit 2002 is configured to use, for each field, the content of the data in the field as a query word of the data, and to establish a mapping relationship between the query word and the storage location, forming the local index table of the field in the current data collector.
  • A local index table is established for each field of the records in the log table. Each local index table maps the content of one field in the specified log table to the storage locations, in that log table, of the data containing that content.
  • the field content is used as the query word of the corresponding data.
  • the mapping relationship between the query term and the storage location may be, but is not limited to, expressed in the form of a table, as shown in Table 2.
  • the specific location of the data on the data collector can be quickly found by the local index table of the corresponding field. For example, if you want the specific location of a content in field 1, you can quickly find the corresponding location based on the local index table of field 1.
  • the third index subunit 2003 is configured to extract the query word from the local index table of the field, and perform deduplication processing on the query word to form a report index table of the field of the current data collector.
  • the data collector extracts the reported report index table according to the local index table of the field. There is no specific original log location corresponding to each index content in the report index table, and only the content of each data corresponding field, that is, the query word. Usually, before the report, the query words are de-reprocessed so that the fields of the reported fields are not duplicated. For example, the field 1 in the table has two aaa and one bbb. After de-reprocessing, there is only one aaa and one bbb in the index table, as shown in Table 3.
  • query words extracted from newly added local index entries are compared with the report index table that has already been reported. If a query word is already present, it is not added to the report index table again; only the newly added, non-repeating query words are reported.
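Deduplication and incremental reporting can be sketched as below. The function name `new_report_entries` and the use of a Python set to remember what has already been reported are assumptions for illustration; the embodiment does not prescribe a particular data structure.

```python
def new_report_entries(local_index, already_reported):
    """Return the query words not yet reported and remember them as reported.

    local_index: query word -> list of storage locations (locations are not reported).
    already_reported: set of query words sent in earlier reports (updated in place).
    """
    fresh = sorted(set(local_index) - already_reported)
    already_reported.update(fresh)
    return fresh


reported = set()
# First report: field 1 holds two "aaa" entries and one "bbb" entry.
first = new_report_entries({"aaa": [1, 3], "bbb": [2]}, reported)
# Later report: one more "aaa" and a new word "ccc" have been collected.
second = new_report_entries({"aaa": [1, 3, 4], "bbb": [2], "ccc": [5]}, reported)
print(first)   # ['aaa', 'bbb']
print(second)  # ['ccc']
```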
  • the sending sub-unit 2004 is configured to send the report index table of the field to the first index unit 100 of the data query server 10, so that the data query server can establish a centralized index table of the field.
  • the first index unit 100 includes a first receiving subunit 1001 and a first index subunit 1002.
  • the first receiving subunit 1001 is configured to receive a reporting index table of the field sent by each data collector.
  • the report index table includes a query word corresponding to the field in the data collector that sends the report index table.
  • the first index sub-unit 1002 is configured to establish a mapping relationship between the query word and the collector identifier, and form a centralized index table of the field.
  • the first index subunit 1002 respectively establishes a centralized index table of corresponding fields for different fields. For example, for field 1, the query words contained in each data collector and the corresponding collector identifiers are stored in the centralized index table, as shown in Table 4.
  • the first index sub-unit 1002 establishes a centralized index table for each field, and the report index table reported by each data collector is summarized into a centralized index table of the data query server.
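A minimal sketch of how the report index tables of one field might be merged into a centralized index table is shown here. The name `build_centralized_index`, the collector identifiers, and the sample query words are hypothetical and do not reproduce the contents of the tables referenced in the text.

```python
from collections import defaultdict


def build_centralized_index(reports):
    """Merge report index tables into one table: query word -> set of collector ids.

    reports: iterable of (collector_id, query_words) pairs for a single field.
    """
    index = defaultdict(set)
    for collector_id, query_words in reports:
        for word in query_words:
            index[word].add(collector_id)
    return dict(index)


# Hypothetical report index tables of field 1 from two collectors.
reports = [
    ("collector1", ["aaa", "ccc"]),
    ("collector2", ["aaa", "bbb"]),
]
central_field1 = build_centralized_index(reports)
print(sorted(central_field1["aaa"]))  # ['collector1', 'collector2']
print(sorted(central_field1["bbb"]))  # ['collector2']
```

The centralized table stores only query words and collector identifiers, so it stays much smaller than the local index tables, which also hold storage locations.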
  • once the index tables have been established in the data query server 10 and the data collector 20 by the first index unit 100 and the second index unit 200, respectively, the data to be queried can be found through the index tables when a query request is received.
  • the first receiving unit 101 is configured to receive an input query request.
  • the user inputs a query request by means of a form search or an expression search.
  • in the form search method, fixed fields are given in the interaction interface with the user; the user can enter query words in the input boxes of multiple fields and finally submit the query request to the data query server through the submit button.
  • in the expression search method, the user directly enters the field to be queried and the query word of the field, and submits them to the data query server.
  • the query request received by the first receiving unit 101 carries the field to be queried and the query word of the field.
  • the query request received by the first receiving unit 101 further includes a logical relationship between the query words of the respective fields. For example, when the user enters query words in different search fields, the first receiving unit 101 receives the query words of the respective search fields and the logical relationship between them.
  • the first querying unit 102 is configured to obtain, from the centralized index table of the field, the collector identifier corresponding to the query word carried by the query request received by the first receiving unit 101.
  • when the query request includes field 1, the first query unit 102 finds the entry matching the query word in the centralized index table of field 1 and obtains the corresponding collector identifier, thereby identifying the data collector that corresponds to the query word.
  • the first query unit 102 includes: a first parsing subunit, a first query subunit, and a first filtering subunit (not shown).
  • the first parsing subunit is configured to obtain the query word of each field in the query request, and record the logical relationship between the query words of the fields.
  • the first query sub-unit is configured to query, from the centralized index table corresponding to the field, the collector identifier corresponding to the query word of each field obtained by the first parsing sub-unit.
  • the first filtering subunit is configured to filter, according to the logical relationship between the query words of the fields obtained by the first parsing subunit, the collector identifiers obtained by the first query subunit, so as to obtain the collector identifiers that satisfy the logical relationship.
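The parsing, query, and filtering steps on the data query server can be sketched as follows. The sketch assumes a single logical relationship ("AND" or "OR") applied to all query words, which is a simplification; the function name `collectors_for_query` and the sample index contents are hypothetical.

```python
def collectors_for_query(central_indexes, terms, relation):
    """Return the collector ids that satisfy the logical relationship.

    central_indexes: field -> query word -> set of collector ids.
    terms: list of (field, query_word) pairs taken from the query request.
    relation: "AND" or "OR", applied across all terms (simplifying assumption).
    """
    id_sets = [
        central_indexes.get(field, {}).get(word, set())
        for field, word in terms
    ]
    if not id_sets:
        return set()
    if relation == "AND":
        # Only a collector that holds every query word can contain a matching record.
        return set.intersection(*id_sets)
    if relation == "OR":
        return set.union(*id_sets)
    raise ValueError(f"unsupported relation: {relation}")


# Hypothetical centralized index tables for two fields.
central_indexes = {
    "field1": {"aaa": {"collector1", "collector2"}, "ccc": {"collector1"}},
    "field2": {"xxx": {"collector2"}},
}
print(collectors_for_query(central_indexes, [("field1", "aaa"), ("field2", "xxx")], "AND"))
# {'collector2'}
print(sorted(collectors_for_query(central_indexes, [("field1", "ccc"), ("field2", "xxx")], "OR")))
# ['collector1', 'collector2']
```

With "AND", the intersection is a pre-filter: the selected collectors still have to check on the record level whether a single record matches all query words.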
  • the first processing unit 103 is configured to generate, according to the query request, a query command that carries the field and the query word, and send the query command to the data collector 20 corresponding to the collector identifier obtained by the first query unit 102.
  • the first query unit 102 queries the centralized index table shown in Table 5 to obtain the collector identifier corresponding to ccc as the collector 1, and the logical relationship between aaa and ccc is "OR", and the first processing unit 103 generates
  • the second receiving unit 201 of the data collector 20 is configured to receive the query command sent by the data query server 10.
  • the query command includes the field to be queried and the query word of the field carried in the query request received by the data query server 10, and may include the query words of a plurality of fields together with the logical relationship between those query words.
  • the second query unit 202 is configured to query, from the local index table corresponding to the field, a storage location of data that matches the query word in the query command received by the second receiving unit 201.
  • the second query unit 202 includes: a second parsing subunit, a second query subunit, and a second filtering subunit (not shown).
  • the second parsing sub-unit is configured to: when the query command received by the second receiving unit 201 carries a plurality of fields to be queried, obtain the query word of each field in the query command, and record the logical relationship between the query words of the fields carried in the query command.
  • the second query subunit is configured to query, from the local index table corresponding to each field, the storage location of the data matching the query word of each field in the query command obtained by the second parsing subunit.
  • the second filtering subunit is configured to filter, according to the logical relationship between the query words of the fields obtained by the second parsing subunit, the storage locations obtained by the second query subunit, so as to obtain the storage locations of the data that satisfies the logical relationship.
  • the second processing unit 203 is configured to obtain the data according to the storage location of the data that is queried by the second query unit 202, and send the data to the data query server.
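On the data collector side, the lookup, filtering, and data retrieval steps can be sketched as below. As before, a single "AND" or "OR" relationship across all query words is a simplifying assumption, and the function name `query_collector` and the sample records are hypothetical.

```python
def query_collector(records, local_indexes, terms, relation="AND"):
    """Return the records whose storage locations satisfy the logical relationship.

    records: storage location -> record.
    local_indexes: field -> query word -> list of storage locations.
    terms: list of (field, query_word) pairs carried in the query command.
    """
    location_sets = [
        set(local_indexes.get(field, {}).get(word, []))
        for field, word in terms
    ]
    if not location_sets:
        return []
    if relation == "AND":
        locations = set.intersection(*location_sets)
    elif relation == "OR":
        locations = set.union(*location_sets)
    else:
        raise ValueError(f"unsupported relation: {relation}")
    # Fetch the data by storage location, in ascending location order.
    return [records[loc] for loc in sorted(locations)]


records = {
    1: {"field1": "aaa", "field2": "xxx"},
    2: {"field1": "bbb", "field2": "xxx"},
    3: {"field1": "aaa", "field2": "yyy"},
}
local_indexes = {
    "field1": {"aaa": [1, 3], "bbb": [2]},
    "field2": {"xxx": [1, 2], "yyy": [3]},
}
terms = [("field1", "aaa"), ("field2", "xxx")]
print(query_collector(records, local_indexes, terms, "AND"))
# [{'field1': 'aaa', 'field2': 'xxx'}]
print(len(query_collector(records, local_indexes, terms, "OR")))  # 3
```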
  • the second processing unit 203 sends the data queried by the second query unit 202 to the first output unit 104 of the data query server 10 for outputting the query result of the query request.
  • the first output unit 104 is configured to receive the data returned by the second processing unit 203 of the data collector 20, form a query result of the query request according to the received data, and output the result.
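The overall flow between the data query server and the data collectors can be sketched end to end. The classes `DataCollector` and `QueryServer`, their method names, and the in-process method calls standing in for network communication are all assumptions made for this example; only single-field queries are shown.

```python
class DataCollector:
    """Holds records and one local index table per field."""

    def __init__(self, collector_id, records):
        self.collector_id = collector_id
        self.records = records  # storage location -> record
        self.local_indexes = {}  # field -> query word -> [storage locations]
        for location, record in records.items():
            for field, content in record.items():
                by_word = self.local_indexes.setdefault(field, {})
                by_word.setdefault(content, []).append(location)

    def report_index(self, field):
        """Deduplicated query words of one field, without storage locations."""
        return set(self.local_indexes.get(field, {}))

    def handle_query(self, field, word):
        """Look up storage locations in the local index and return the data."""
        locations = self.local_indexes.get(field, {}).get(word, [])
        return [self.records[loc] for loc in locations]


class QueryServer:
    """Builds centralized index tables and routes queries to matching collectors."""

    def __init__(self, collectors, fields):
        self.collectors = {c.collector_id: c for c in collectors}
        self.central_indexes = {}  # field -> query word -> {collector ids}
        for collector in collectors:
            for field in fields:
                for word in collector.report_index(field):
                    by_word = self.central_indexes.setdefault(field, {})
                    by_word.setdefault(word, set()).add(collector.collector_id)
        self.contacted = []  # collector ids contacted by the most recent query

    def query(self, field, word):
        ids = sorted(self.central_indexes.get(field, {}).get(word, set()))
        self.contacted = ids
        results = []
        for collector_id in ids:
            results.extend(self.collectors[collector_id].handle_query(field, word))
        return results


c1 = DataCollector("collector1", {1: {"field1": "aaa"}, 2: {"field1": "ccc"}})
c2 = DataCollector("collector2", {1: {"field1": "aaa"}, 2: {"field1": "bbb"}})
server = QueryServer([c1, c2], fields=["field1"])
print(server.query("field1", "bbb"))  # [{'field1': 'bbb'}]
print(server.contacted)               # ['collector2']
```

In this sketch, a query for "bbb" reaches only collector2, which illustrates why the centralized index table avoids querying every data collector.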
  • the data query server, the data collector, and the system provided by the embodiment of the present invention use the first index unit to establish a centralized index table in the data query server and the second index unit to establish a local index table in the data collector, thereby improving the processing speed of data queries.
  • the present invention does not need to query every data collector, which reduces the burden on the data collectors.
  • the solution provided by the embodiment of the present invention can effectively reduce the system resource occupation of the data collector, so that the data collector has more resources for improving collection performance, thereby improving the overall processing capability of the system.
  • FIG. 7 is a schematic diagram of the data query server 10 according to the embodiment.
  • the data query server 10 includes: a network interface 71, a processor 72, and a memory 73.
  • the system bus 74 is used to connect the network interface 71, the processor 72, and the memory 73.
  • Network interface 71 is used to communicate with data collector 20.
  • the memory 73 may be a persistent storage such as a hard disk drive and a flash memory having a software module and a device driver.
  • the software modules are capable of executing the various functional modules of the above described method of the present invention; the device drivers can be network and interface drivers.
  • Receiving an input query request, where the query request carries a field to be queried and a query word in the field; obtaining, from the centralized index table corresponding to the field, the collector identifier corresponding to the query word, where the centralized index table stores a correspondence between the query words in the field and the collector identifiers; generating, according to the query request, a query command carrying the field and the query word, and sending the query command to the data collector corresponding to the collector identifier, so that the data collector queries the local index table corresponding to the field carried in the query command to obtain the data matching the query word carried in the query command;
  • the data query server of the present embodiment finds the collector identifier corresponding to the query word through the centralized index table of the field, so that the corresponding data is obtained from the data collector corresponding to the collector identifier. This can effectively reduce the system resource occupation of the data query server and improve the processing speed of data queries.
  • the above instruction process is a process in which the data query server establishes a centralized index table and establishes a mapping relationship between the query word and the collector identifier, so that when data is queried, the collector identifier corresponding to the query word is found and the matching data is obtained from the corresponding data collector.
  • the query words of each field in the query request are obtained, and the logical relationship between the query words of the fields is recorded;
  • the collector identifier obtained from the query is filtered to obtain a collector identifier that satisfies the logical relationship.
  • the above instruction process is a process in which the data query server searches for a corresponding collector identifier for a plurality of query words of the field to be queried, and can avoid accessing data that cannot fully satisfy the query request.
  • FIG. 8 is a schematic diagram of the data collector 20 provided by the embodiment. As shown in FIG. 8, the data collector 20 includes: a network interface 81, a processor 82, and a memory 83. System bus 84 is used to connect network interface 81, processor 82, and memory 83.
  • Network interface 81 is used to communicate with data query server 10.
  • the memory 83 may be a persistent storage such as a hard disk drive and a flash memory having a software module and a device driver.
  • the software modules are capable of executing the various functional modules of the above described method of the present invention; the device drivers can be network and interface drivers.
  • Receiving a query command sent by the data query server, where the query command includes the field to be queried and the query word in the field carried in the query request received by the data query server; querying, from a local index table corresponding to the field, the storage location of the data matching the query word in the query command, where the local index table stores a correspondence between the query words in the field and the storage locations of the data;
  • the data is acquired and sent to the data query server according to the storage location of the data.
  • the data collector of the embodiment finds the data corresponding to the query word through the local index table of the field and provides it to the data query server. This can effectively reduce the system resource occupation of the data collector, so that the data collector has more resources for improving collection performance, and improves the processing speed of data queries.
  • the above instruction process is a process in which the data collector establishes a local index table: by establishing a mapping relationship between the query word and the storage location of the data, the data collector can obtain the data according to the storage location corresponding to the query word.
  • the above instruction process is a process in which the data collector establishes a report index table according to the local index table and sends it to the data query server, so that the data query server establishes a centralized index table.
  • when the query command carries at least two fields to be queried, the query words of each field in the query command are obtained, and the logical relationship between the query words of the fields is recorded;
  • the storage location of the data satisfying the logical relationship is filtered out from the storage location of the data obtained by the query.
  • the above instruction process is a process in which the data collector finds a corresponding storage location for a plurality of query words of the field to be queried, and can avoid obtaining data that cannot completely satisfy the query command.

Abstract

The invention relates to a data query method and system. The method comprises: receiving an input query request, the query request carrying a field to be queried and query words in the field; querying a centralized index table corresponding to the field to obtain a collector identifier corresponding to the query words; generating, according to the query request, a query command carrying the field and the query words, and sending the query command to a data collector corresponding to the collector identifier, so that the data collector queries a local index table corresponding to the field carried in the query command in the data collector to obtain the data matching the query words carried in the query command; and receiving the data returned by the data collector, forming a query result of the query request according to the received data, and outputting the result. The invention improves the processing speed of data queries, reduces the system resource occupation of the data collector, and reduces the load and pressure on the data query server.
PCT/CN2013/082130 2012-12-24 2013-08-23 Data query method and system WO2014101445A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210566137.4 2012-12-24
CN201210566137.4A CN103064933B (zh) 2012-12-24 2012-12-24 Data query method and system

Publications (1)

Publication Number Publication Date
WO2014101445A1 (fr) 2014-07-03

Family

ID=48107563

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/082130 WO2014101445A1 (fr) 2012-12-24 2013-08-23 Data query method and system

Country Status (2)

Country Link
CN (1) CN103064933B (fr)
WO (1) WO2014101445A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023220973A1 (fr) * 2022-05-18 2023-11-23 京东方科技集团股份有限公司 Data processing method and apparatus, electronic device, and computer-readable storage medium

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
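CN103064933B (zh) * 2012-12-24 2016-06-29 华为技术有限公司 Data query method and system
CN105099735B (zh) * 2014-05-07 2018-05-22 中国移动通信集团福建有限公司 Method and system for obtaining massive detailed logs
CN105302827B (zh) * 2014-06-30 2018-11-20 华为技术有限公司 Event search method and device
CN104216957A (zh) * 2014-08-20 2014-12-17 北京奇艺世纪科技有限公司 Query system and query method for video metadata
CN104317924A (zh) * 2014-10-30 2015-01-28 中国银行股份有限公司 Data query method and apparatus for intra-city bill exchange
CN105871951A (zh) * 2015-01-21 2016-08-17 上海可鲁系统软件有限公司 Distributed service credential processing method for the industrial Internet of Things
CN107015990B (zh) * 2016-01-27 2020-06-09 阿里巴巴集团控股有限公司 Data lookup method and apparatus
CN105930441B (zh) * 2016-04-18 2019-04-26 华信咨询设计研究院有限公司 Radio monitoring data query method
CN106354823A (zh) * 2016-08-30 2017-01-25 北京旷视科技有限公司 Method, apparatus, and system for aggregating operation data of a face comparison system
CN107784050A (zh) * 2016-12-14 2018-03-09 平安科技(深圳)有限公司 Log information search method and apparatus
CN107066610A (zh) * 2017-05-02 2017-08-18 中国联合网络通信集团有限公司 Price query method and device
CN107577506B (zh) * 2017-08-07 2021-03-19 台州市吉吉知识产权运营有限公司 Data preloading method and system
CN109299219B (zh) * 2018-08-31 2022-08-12 北京奥星贝斯科技有限公司 Data query method and apparatus, electronic device, and computer-readable storage medium
CN109308305B (zh) * 2018-09-30 2021-06-08 广州圣亚科技有限公司 Monitoring data query method and apparatus, and computer device
CN109299348B (zh) * 2018-11-28 2021-09-28 北京字节跳动网络技术有限公司 Data query method and apparatus, electronic device, and storage medium
CN109885548A (zh) * 2019-02-22 2019-06-14 网易(杭州)网络有限公司 Log query method and apparatus, storage medium, and electronic apparatus
CN110502915B (zh) * 2019-08-30 2021-07-30 恩亿科(北京)数据科技有限公司 Data processing method, apparatus, and system
CN110674369A (zh) * 2019-09-23 2020-01-10 杭州迪普科技股份有限公司 Data query method and apparatus
CN111062193B (zh) * 2019-12-16 2023-04-25 医渡云(北京)技术有限公司 Medical data annotation method and apparatus, storage medium, and electronic device
CN113486048A (zh) * 2021-07-13 2021-10-08 广西电力职业技术学院 Data retrieval system and data retrieval method
CN117271562B (zh) * 2023-11-21 2024-01-19 成都凌亚科技有限公司 Data collection and processing method and system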

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
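CN1968132A (zh) * 2006-10-16 2007-05-23 华为技术有限公司 Method for establishing call log association between network entities and searching for associated call logs
CN102193917A (zh) * 2010-03-01 2011-09-21 中国移动通信集团公司 Data processing and query method and apparatus
CN102375853A (zh) * 2010-08-24 2012-03-14 中国移动通信集团公司 Distributed database system, method for building an index therein, and query method
CN103064933A (zh) * 2012-12-24 2013-04-24 华为技术有限公司 Data query method and system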

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5903888A (en) * 1997-02-28 1999-05-11 Oracle Corporation Method and apparatus for using incompatible types of indexes to process a single query
CN102789487B (zh) * 2012-06-29 2015-09-02 用友软件股份有限公司 Data query and retrieval processing apparatus and data query and retrieval processing method

Also Published As

Publication number Publication date
CN103064933A (zh) 2013-04-24
CN103064933B (zh) 2016-06-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13869336

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13869336

Country of ref document: EP

Kind code of ref document: A1