CN116628010A - Data processing method, device and equipment - Google Patents
Data processing method, device and equipment
- Publication number
- CN116628010A CN202310617476.9A
- Authority
- CN
- China
- Prior art keywords
- target
- data
- primary key
- field
- query
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24558—Binary matching operations
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the specification provides a data processing method, a device and equipment, wherein the method comprises the following steps: receiving a data query request aiming at a target data table, carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request, acquiring a non-primary key field in the target data table contained in the target field under the condition that query cannot be carried out according to a primary key field in the target data table based on the data query statement, acquiring a target index table corresponding to the non-primary key field in the target data table in a secondary index table corresponding to the target data table, carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to a data processing method, apparatus, and device.
Background
With the rapid development of computer technology, the variety and number of application services provided by enterprises for users are also increasing, and accordingly, the data volume of user data is increasing, the data structure is becoming complex, and how to improve the query efficiency of data becomes a problem of increasing attention of business processors. When the data is queried, the data table can be prevented from being globally scanned through the index of the primary key, so that the data query efficiency is improved.
However, when the primary key cannot be used for indexing (e.g., the query filtering condition does not include the primary key and the database does not support creating an index for the created data table), global scanning is required for the data table, and when the data amount of the data table is large, the data query efficiency is low, so a solution capable of improving the data query efficiency is required.
Disclosure of Invention
The embodiment of the specification aims to provide a data processing method, a data processing device and data processing equipment so as to provide a solution capable of improving data query efficiency.
In order to achieve the above technical solution, the embodiments of the present specification are implemented as follows:
In a first aspect, embodiments of the present disclosure provide a data processing method, including: receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request; acquiring a non-primary key field in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key field in the target data table based on the data query statement; acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table; and carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In a second aspect, embodiments of the present disclosure provide a data processing apparatus, the apparatus comprising: the request receiving module is used for receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request; the first acquisition module is used for acquiring non-primary key fields in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key fields in the target data table based on the data query statement; the second acquisition module is used for acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table; and the result determining module is used for carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In a third aspect, embodiments of the present specification provide a data processing apparatus, the data processing apparatus comprising: a processor; and a memory arranged to store computer executable instructions that, when executed, cause the processor to: receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request; acquiring a non-primary key field in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key field in the target data table based on the data query statement; acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table; and carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In a fourth aspect, embodiments of the present description provide a storage medium for storing computer-executable instructions that, when executed, implement the following: receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request; acquiring a non-primary key field in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key field in the target data table based on the data query statement; acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table; and carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
Drawings
In order to more clearly illustrate the embodiments of the present description or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some of the embodiments described in the present description, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a data processing system of the present specification;
FIG. 2A is a flow chart of an embodiment of a data processing method of the present disclosure;
FIG. 2B is a schematic diagram illustrating a data processing method according to the present disclosure;
FIG. 3 is a schematic diagram of a data query process according to the present disclosure;
FIG. 4 is a schematic diagram illustrating a processing procedure of another data processing method according to the present disclosure;
FIG. 5 is a schematic diagram illustrating a processing procedure of another data processing method according to the present disclosure;
FIG. 6 is a schematic diagram of an embodiment of a data processing apparatus according to the present disclosure;
FIG. 7 is a schematic diagram of a data processing apparatus according to the present specification.
Detailed Description
The embodiment of the specification provides a data processing method, a device and equipment.
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present specification, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The technical scheme of the specification can be applied to a data processing system, as shown in fig. 1, the data processing system can be provided with terminal equipment and a server, wherein the server can be an independent server or a server cluster formed by a plurality of servers, and the terminal equipment can be equipment such as a personal computer and the like or mobile terminal equipment such as a mobile phone, a tablet personal computer and the like.
The data processing system may include n terminal devices and m servers, where n and m are positive integers greater than or equal to 1. The terminal devices may be configured to collect data samples, and may obtain corresponding data samples for different anomaly detection scenarios. For example, for a data anomaly detection scenario of a question-answering system, the terminal devices may collect the feedback information of a user on a session as the data samples; for a data anomaly detection scenario of a preset service, the terminal devices may collect service data corresponding to the preset service (such as data required for executing the preset service) as the data samples, and so on.
The terminal device may send the collected data samples to any server in the data processing system, and the server may preprocess the received data samples, generate a data table based on the preprocessed data samples, and store the generated data table in a preset database. The preprocessing operations may include text conversion processing (e.g., converting audio data into text data), text format conversion processing (e.g., converting English text into Chinese text), and the like.
In addition, the server can also generate a corresponding data table based on local log data and store the data table in a preset database. Taking a service scenario of tracing a data leakage as an example, when tracing the data, the server needs to perform filtering, searching, statistics, analysis and other processing on log data such as network traffic and operation behavior, based on existing clues, so as to locate the leakage source. Because such data has a large daily increment and a long storage time span, the data can be stored and managed by a data warehouse.
Taking the storage and computation of bulk structured data as an example, a big data computing service platform (such as MaxCompute) can provide a massive data warehouse solution together with analysis and modeling of big data. Because the log data is stored in the big data computing service platform, tracing a data leakage necessarily involves querying and searching the massive log data held by the platform. The data set to be queried typically exceeds one hundred billion records, and can reach the trillion level for queries over a long time period. For such a large data volume, querying the log data set directly through the big data computing service platform takes a long time and can hardly meet the quick response requirement of security tracing, so a scheme is needed to accelerate the data query process and provide quick query and analysis capability for the security tracing scenario.
Because the big data computing service platform does not support creating an index for an already created data table, the query process cannot be accelerated directly by adding an index. However, when a data table is created in the big data computing service platform, the platform supports specifying, through a bucketing method (such as a clustered by (or range clustered by) clause), which columns in the data table are used as the primary key for bucketing, and specifying how the fields are sorted within each bucket (for example, through a sorted by clause).
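To illustrate why bucketing on a key and sorting within each bucket lets a lookup skip most of the table, the following is a minimal in-memory sketch in Python. It is not the platform's implementation; the function names `build_buckets` and `lookup`, the row layout and the bucket count are assumptions made only for this illustration.

```python
import bisect

def build_buckets(rows, key, num_buckets):
    """Hash each row into a bucket by its key value, then sort every bucket by that key."""
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[hash(row[key]) % num_buckets].append(row)
    for bucket in buckets:
        bucket.sort(key=lambda r: r[key])
    return buckets

def lookup(buckets, key, value):
    """Visit only the bucket the value hashes to and binary-search inside it."""
    bucket = buckets[hash(value) % len(buckets)]
    # A real system would keep the sorted keys stored; they are rebuilt here for brevity.
    keys = [r[key] for r in bucket]
    lo = bisect.bisect_left(keys, value)
    hi = bisect.bisect_right(keys, value)
    return bucket[lo:hi]

# Hypothetical rows; note that the key value 3 is not unique.
rows = [{"id": 3, "v": "c"}, {"id": 1, "v": "a"}, {"id": 3, "v": "d"}, {"id": 2, "v": "b"}]
buckets = build_buckets(rows, "id", 2)
matches = lookup(buckets, "id", 3)
```

A filter on any other column gets no help from this layout and has to visit every bucket, which is the limitation the secondary index table is meant to address.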
The column or combination of columns designated in this way when the data table is created can be used as the primary key of the data table. When the query filtering condition includes the primary key and the operator associated with the primary key meets the index usage condition, the big data computing service platform can index according to the primary key, thereby avoiding a full table scan of the data table and speeding up the query. However, this way of establishing an index by designating the primary key is only available when the data table is created: the primary key cannot be dynamically added or deleted according to data query requirements, and only one primary key can be designated. The acceleration scenarios it supports are therefore limited, that is, query acceleration is achieved only when the query filtering condition includes the primary key and the use of the primary key conforms to the leftmost prefix matching rule.
For rapid query analysis of a massive data set, indexing can be performed through the primary key field in the secondary index table and the primary key field in the target data table, so that a full table scan of the target data table is avoided, the data query efficiency is improved when the data volume of the target data table is large, and minute-level querying of massive data is realized.
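The leftmost prefix matching rule mentioned above can be sketched as follows. This is a simplified Python illustration with a hypothetical helper name (`usable_prefix`); it only checks which leading primary key columns appear in the filter and ignores the operator condition that the text also requires.

```python
def usable_prefix(primary_key, filter_fields):
    """Return the leading primary key columns that the filter condition covers.

    primary_key is an ordered sequence of column names and filter_fields is the
    set of columns used in the filter. An empty result means the primary key
    cannot be used for indexing under the leftmost prefix rule.
    """
    prefix = []
    for column in primary_key:
        if column not in filter_fields:
            break  # the prefix ends at the first key column missing from the filter
        prefix.append(column)
    return prefix

# Hypothetical composite primary key (a, b, c).
covered = usable_prefix(("a", "b", "c"), {"a", "c"})
not_covered = usable_prefix(("a", "b", "c"), {"b", "c"})
```

In the second call the filter skips the first key column, so no prefix is usable and the query would fall back to a full table scan unless another access path exists.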
The data processing method in the following embodiments can be implemented based on the above-described data processing system configuration.
Example 1
As shown in fig. 2A and fig. 2B, the embodiment of the present disclosure provides a data processing method, where an execution body of the method may be a server, and the server may be an independent server or may be a server cluster formed by a plurality of servers. The method specifically comprises the following steps:
In S202, a data query request for a target data table is received, and a field extraction process is performed on a data query statement carried in the data query request, so as to obtain a target field corresponding to the data query request.
The target data table may be a data table corresponding to a preset user and/or a preset service. For example, the target data table may store user data of the preset user, such as user information, device information, application program information and other data of a certain user; or the target data table may store service data required for executing the preset service, such as service data required for executing a resource transfer service; or the target data table may be a data table generated based on log data of a certain server. The data query statement may be a query statement that can be executed by any server, for example, an SQL statement. The target field may be one or more fields included in the target data table. For example, assuming that the target data table includes field 1, field 2 and field 3, and the data query statement is used for querying the target data table based on field 1, then the target field may be field 1.
In implementation, with the rapid development of computer technology, the types and the number of application services provided by enterprises for users are also increasing, and accordingly, the data volume of user data is increasing, the data structure is becoming complex, and how to improve the data query efficiency is becoming a problem of increasing attention of business processors. When the data is queried, the data table can be prevented from being globally scanned through the index of the primary key, so that the data query efficiency is improved. However, when the primary key cannot be used for indexing (e.g., the query filtering condition does not include the primary key and the database does not support creating an index for the created data table), global scanning is required for the data table, and when the data amount of the data table is large, the data query efficiency is low, so a solution capable of improving the data query efficiency is required. For this reason, the embodiments of the present specification provide a technical solution that can solve the above-mentioned problems, and specifically, reference may be made to the following.
Taking a service processing scenario of tracing data leakage as an example, the server may receive a data query request for a target data table, where the target data table may be a data table related to the data leakage scenario, for example, the server may acquire the data table related to the user 1 when detecting that the private data of the user 1 is leaked, and determine the acquired data table as the target data table.
Or the server can also receive a data query request sent by the preset management side and aiming at a target data table, wherein the target data table is the data table determined by the preset management side based on the data leakage scene.
The data query request may carry a data query statement, and the server may perform field extraction processing on the data query statement to obtain a target field corresponding to the data query request, for example, the server may determine the target field corresponding to the data query request based on a filtering condition expression in the data query statement.
Specifically, taking a data query statement as an example of an SQL statement, the data query statement obtained by the server may be:
SELECT field 1
FROM data table 1
WHERE field 2=1.
The data table 1 is a target data table, the field 2=1 is a filtering condition expression in the data query statement, and the field obtained by performing field extraction processing on the data query statement by the server may be a field included in the filtering condition expression, that is, the extracted target field is the field 2.
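As one possible illustration of the field extraction step, the Python sketch below pulls the field names out of the filtering condition expression with a regular expression. The function name `extract_target_fields` and the underscore-style identifiers are assumptions for this example; a production system would more likely use a proper SQL parser, and this simplified pattern does not handle string literals that contain comparison operators.

```python
import re

# An identifier followed by a comparison operator, e.g. "field_2 =" or "id>=".
_COMPARISON = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)\s*(=|<>|!=|<=|>=|<|>)")

def extract_target_fields(sql):
    """Return the field names used in the WHERE clause, in order of first appearance."""
    match = re.search(r"\bWHERE\b(.*)", sql, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return []  # no filtering condition expression, so no target field
    fields = []
    for name, _operator in _COMPARISON.findall(match.group(1)):
        if name not in fields:
            fields.append(name)
    return fields

fields = extract_target_fields("SELECT field_1 FROM data_table_1 WHERE field_2 = 1")
```

Here `fields` contains only `field_2`, matching the example in which the extracted target field is field 2.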
In addition, the above description is given of how to obtain the target field by taking the data query statement as an example of the SQL statement, and in the actual application scenario, the method for obtaining the target field may also be varied, and may be varied according to the different actual application scenarios, which is not particularly limited in the embodiment of the present disclosure.
In S204, if it is determined that the query cannot be performed based on the primary key field in the target data table based on the data query statement, a non-primary key field in the target data table included in the target field is acquired.
Wherein the primary key field may be a field or a combination of fields, the value of the primary key field may uniquely identify each row in the data table, and the entity integrity of the data table may be enforced by the primary key field.
In implementation, in a big data computing service platform that processes data in buckets, the fields (columns) specified by the clustered by (or range clustered by) and sorted by clauses when a data table is created may be determined as the primary key field of the data table, although the value of such a primary key field is not required to be unique.
For example, table 1 may be stored in the big data computing service platform, and when table 1 is created, the bucket processing may be performed through the id field, and then the id field may be used as the primary key field of table 1.
Because the big data computing service platform does not support creating an index for an existing data table, when the query cannot be performed according to the primary key field in the target data table (e.g., the filtering condition expression of the data query statement does not contain the primary key field), a full table scan of the target data table is required, and the data query efficiency is low.
Therefore, the server can acquire the non-primary key field in the target data table contained in the target field under the condition that the server determines that the query cannot be performed according to the primary key field in the target data table based on the data query statement, wherein the non-primary key field is a field except the primary key field in the fields of the target data table.
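The decision in S204 could be sketched in Python as below. The helper name `fields_for_secondary_index` and the rule used to decide that the primary key is unusable (its first column is missing from the target fields) are assumptions for illustration; the text leaves the exact check open.

```python
def fields_for_secondary_index(target_fields, table_fields, primary_key):
    """Return the non-primary key target fields, or None when the primary key can be used.

    target_fields: fields extracted from the data query statement.
    table_fields:  all fields of the target data table.
    primary_key:   ordered primary key columns of the target data table.
    """
    if primary_key and primary_key[0] in target_fields:
        return None  # the query can be served through the primary key directly
    # Keep only fields that belong to the table and are not part of the primary key.
    return [f for f in target_fields if f in table_fields and f not in primary_key]

table_fields = ["field_1", "field_2", "field_3"]
needs_index = fields_for_secondary_index(["field_2"], table_fields, ("field_1",))
uses_primary_key = fields_for_secondary_index(["field_1", "field_2"], table_fields, ("field_1",))
```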
In S206, a target index table corresponding to the non-primary key field in the target data table in the secondary index table corresponding to the target data table is obtained.
The primary key field of the secondary index table may be a non-primary key field in the target data table, and the server may perform data synchronization on the secondary index table from the target data table in an asynchronous manner.
In implementation, for a periodically scheduled partitioned data table with query acceleration requirements, the server may construct a secondary index table based on one or more non-primary key fields in the data table whose query frequency is above a preset frequency. The constructed secondary index table takes the non-primary key field, or the combination of non-primary key fields, of the target data table as its primary key field, and takes the primary key field of the target data table as an ordinary column; its scheduling period, partition fields and the like can be kept consistent with the target data table. The server may construct a plurality of different secondary index tables for one target data table to adapt to different query filtering conditions.
The server can also synchronize the incremental partition data of the target data table to the partition corresponding to the target data table in the secondary index table corresponding to the target data table at regular intervals through a periodic synchronization task, and clear the data of the partition exceeding the life cycle range in the secondary index table so as to realize data synchronization and maintenance.
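The construction and periodic synchronization described above might look like the following in-memory Python sketch. The partition names, the dictionary-based table layout and the function names `build_index_rows` and `sync_partitions` are hypothetical; an actual deployment would run this as scheduled jobs on the platform rather than in application code.

```python
def build_index_rows(rows, index_fields, primary_key):
    """Project rows onto (index fields + primary key) and sort them by the index fields."""
    columns = (*index_fields, *primary_key)
    projected = [{c: row[c] for c in columns} for row in rows]
    projected.sort(key=lambda r: tuple(r[c] for c in index_fields))
    return projected

def sync_partitions(target_partitions, index_partitions, index_fields, primary_key, live_partitions):
    """Copy partitions that are new in the target table and drop expired index partitions."""
    for name, rows in target_partitions.items():
        if name in live_partitions and name not in index_partitions:
            index_partitions[name] = build_index_rows(rows, index_fields, primary_key)
    for name in list(index_partitions):
        if name not in live_partitions:
            del index_partitions[name]  # partition is outside the life cycle range
    return index_partitions

# Hypothetical target table with daily partitions, and an index holding one stale partition.
target = {
    "20230501": [{"id": 2, "cid": 9, "info": "x"}, {"id": 1, "cid": 3, "info": "y"}],
    "20230502": [{"id": 3, "cid": 3, "info": "z"}],
}
index = {"20230430": [{"cid": 1, "id": 0}]}
sync_partitions(target, index, ("cid",), ("id",), {"20230501", "20230502"})
```

After the call, the index holds one partition per live target partition, each containing only the `cid` and `id` columns sorted by `cid`.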
The server may determine the target index table in the secondary index table based on the non-primary key field in the target data table contained in the target field. For example, assume that the target data table includes a field 1, a field 2, and a field 3, where the field 1 is a primary key field of the target data table, a secondary index table corresponding to the target data table constructed by the server may include the secondary index table 1 and the secondary index table 2, the secondary index table 1 may include the field 1 and the field 2, where the field 2 may be a primary key field of the secondary index table 1, and the secondary index table 2 may include the field 1 and the field 3, where the field 3 may be a primary key field of the secondary index table 2.
If the non-primary key field in the target data table included in the target field is field 3, the secondary index table 2 may be determined as the target index table, that is, the server may determine, as the target index table, the secondary index table with a higher matching degree with the data query statement in the secondary index table corresponding to the target data table.
In S208, the target index table is queried based on the data query statement and the primary key field in the target index table to obtain a first query result, and the target data table is queried based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In implementation, assume that the data query statement is an SQL statement:
SELECT info
FROM target
WHERE cid=1.
Here, target is the target data table. As shown in fig. 3, the target data table may include a field id, a field cid, and a field info, where the field id may be the primary key field of the target data table. The target index table corresponding to the target data table may include the field cid and the field id, where the field cid may be the primary key field of the target index table.
The server can index through the primary key field (i.e. field cid) in the target index table to query the target index table to obtain a first query result, and then index through the primary key field (i.e. field id) in the target data table to query the target data table to obtain a data query result corresponding to the data query request.
Therefore, through these two index lookups, the server avoids a full table scan of the data table, which improves data query efficiency when the data table to be queried holds a large volume of data.
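The two lookups of S208 can be illustrated with a small sketch. It assumes an in-memory SQLite database in place of the actual platform, names the target index table `idx` because `index` is a reserved word in SQLite, and gives it a composite key (cid, id) so that several rows can share one cid; the helper `two_step_lookup` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, cid INTEGER, info TEXT)")
conn.execute("CREATE TABLE idx (cid INTEGER, id INTEGER, PRIMARY KEY (cid, id))")
conn.executemany("INSERT INTO target VALUES (?, ?, ?)", [(1, 1, "a"), (2, 2, "b"), (3, 1, "c")])
conn.execute("INSERT INTO idx SELECT cid, id FROM target")

def two_step_lookup(conn, cid):
    # Step 1: look up the target index table by its primary key field cid
    # to obtain the first query result (a list of id values).
    ids = [r[0] for r in conn.execute("SELECT id FROM idx WHERE cid = ? ORDER BY id", (cid,))]
    # Step 2: look up the target data table by its primary key field id.
    infos = []
    for row_id in ids:
        row = conn.execute("SELECT info FROM target WHERE id = ?", (row_id,)).fetchone()
        if row is not None:
            infos.append(row[0])
    return ids, infos

first_result, final_result = two_step_lookup(conn, 1)
```

Both statements filter on a leading primary key column, so neither needs to scan its table in full.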
The embodiment of the specification provides a data processing method, which is used for receiving a data query request aiming at a target data table, carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request, acquiring a non-primary key field in the target data table contained in the target field under the condition that query cannot be carried out according to a primary key field in the target data table based on the data query statement, acquiring a target index table corresponding to the non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table, carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request. Therefore, the server can establish a secondary index table corresponding to the target data table through the non-primary key field of the target data table and combine the secondary index table to perform data query, thereby fully utilizing the index acceleration characteristic of the primary key field of the data table, avoiding the whole table scanning of the data table and improving the data query efficiency.
Example II
The embodiment of the specification provides a data processing method, and an execution subject of the method may be a server, where the server may be an independent server or may be a server cluster formed by a plurality of servers. The method specifically comprises the following steps:
In S202, a data query request for a target data table is received.
In S402, a filter condition expression in a data query statement is acquired.
In implementation, the server may extract the filter condition expression in the data query statement, and reject all the "logical OR" operators in the filter condition expression and their left and right conditions, and then extract the field names in the filter condition expression that satisfy the following conditions:
1. The field name is the left value or the right value of a preset relational operator, where the preset relational operator is one of "=", ">", "<", ">=", "<=", and "IN", or another relational operator that can make effective use of the secondary index table;
2. When the field name is the left value (right value) of the preset relational operator, the corresponding right value (left value) does not contain any field name.
The extracted field name is the target field.
Taking an SQL statement as the data query statement, the expression after WHERE in the SQL statement may be the filter condition expression of the data query statement. For example, assume that the data query statement is:
SELECT field 1
FROM data Table 1
WHERE field 2=1 and field 3>5.
Then, the filter condition expression obtained by the server is field 2=1 and field 3>5.
The method for acquiring the filtering condition expression is an optional and realizable acquisition method, and in an actual application scenario, there may be a plurality of different acquisition methods, and may be different according to the actual application scenario, which is not specifically limited in the embodiment of the present disclosure.
In S404, among the preset relational operators in the filter condition expression, a preset relational operator that is adjacent to a field on one side and to a non-field on the other side is determined as a target operator.
The preset relational operators may include relational operators that can make effective use of the secondary index table, for example "=", ">", "<", ">=", "<=", "IN", and the like.
In implementation, for example, assume that the filter condition expression is field 2=1 and field 3>5. The preset relational operators include "=" and ">"; "=" has field 2 on one side and the number 1 on the other side, and ">" has field 3 on one side and the number 5 on the other side. Therefore, "=" and ">" are the target operators.
In S406, a field corresponding to the target operator is determined as a target field corresponding to the data query request.
In implementations, for example, assuming that the filter conditional expression is field 2=1 and field 3>5, and the target operator is "=" and ">", the server may determine the "=" corresponding field 2, and the ">" corresponding field 3 as target fields.
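Steps S402 to S406 can be sketched as a small extraction routine. This is a simplified illustration, not a full SQL parser: it assumes a flat filter condition expression whose conditions are joined by AND or OR, without parentheses around conditions and without these keywords inside string literals, and it writes field names without spaces (for example `field2`). The names `extract_target_fields`, `PRESET_OPS`, and `has_field` are hypothetical.

```python
import re

PRESET_OPS = {"=", ">", "<", ">=", "<=", "IN"}
# Longer operators come first so that "<>" or ">=" is not read as "<" or ">".
OP_RE = re.compile(r"<>|!=|>=|<=|=|>|<|\bIN\b|\bLIKE\b", re.IGNORECASE)
IDENT_RE = re.compile(r"[A-Za-z_]\w*")

def has_field(operand):
    # Ignore quoted string literals, then look for any identifier.
    return IDENT_RE.search(re.sub(r"'[^']*'", "", operand)) is not None

def extract_target_fields(filter_expr):
    parts = re.split(r"\s+(AND|OR)\s+", filter_expr.strip(), flags=re.IGNORECASE)
    conds, links = parts[0::2], [p.upper() for p in parts[1::2]]
    fields = []
    for i, cond in enumerate(conds):
        # Reject the conditions on either side of a logical OR.
        if (i > 0 and links[i - 1] == "OR") or (i < len(links) and links[i] == "OR"):
            continue
        m = OP_RE.search(cond)
        if not m or m.group().upper() not in PRESET_OPS:
            continue  # not a target operator
        left, right = cond[:m.start()].strip(), cond[m.end():].strip()
        # A target operator has a field on one side and a non-field on the other.
        if IDENT_RE.fullmatch(left) and not has_field(right):
            fields.append(left)
        elif IDENT_RE.fullmatch(right) and not has_field(left):
            fields.append(right)
    return fields

target_fields = extract_target_fields("field2=1 and field3>5")
```

For the example expression, the routine yields field 2 and field 3 as the target fields, matching the result described for S406.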
In S408, in the case that the target field does not include the primary key field in the target data table, it is determined that the query cannot be performed according to the primary key field in the target data table; or, in the case that the filter condition expression corresponding to the primary key field in the target data table in the data query statement is a preset conditional expression, it is determined that the query cannot be performed according to the primary key field in the target data table.
The preset conditional expressions include, but are not limited to, fuzzy search expressions and not-equal search expressions.
In implementation, if the extracted target field does not include the primary key field in the target data table, that is, the data query request filters on a non-primary key field of the target data table, the server cannot perform an index query according to the primary key field in the target data table. Likewise, in the case that the filter condition expression corresponding to the primary key field in the target data table in the data query statement is a fuzzy search expression or a not-equal search expression, the server cannot query according to the primary key field in the target data table.
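The decision in S408 can be expressed as a short check. The sketch below assumes the filter condition expression has already been reduced to (field, operator) pairs; the function name, the pair representation, and the treatment of a primary key that carries several conditions (usable as soon as one of them is indexable) are assumptions for illustration.

```python
# Operators of the preset conditional expressions (fuzzy and not-equal searches).
NON_INDEXABLE_OPS = {"LIKE", "!=", "<>"}

def can_query_by_primary_key(conditions, primary_key):
    """conditions: list of (field, operator) pairs from the filter condition expression."""
    pk_ops = [op.upper() for field, op in conditions if field == primary_key]
    if not pk_ops:
        # The primary key field is not filtered on at all.
        return False
    # Assumption: one indexable condition on the primary key is sufficient.
    return any(op not in NON_INDEXABLE_OPS for op in pk_ops)
```

When this check returns False, the flow continues with S204 and looks for a usable secondary index table.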
In S204, if it is determined that the query cannot be performed based on the primary key field in the target data table based on the data query statement, a non-primary key field in the target data table included in the target field is acquired.
In S410, non-primary key fields in the target data table are acquired.
In S412, based on the service processing requirement corresponding to the target data table, a field screening process is performed on the non-primary key field in the target data table, so as to obtain a target non-primary key field.
In an implementation, the server may obtain a historical data query request for the target data table, and determine a service processing requirement for the target data table based on the historical data query request, so as to perform field screening processing on the non-primary key field in the target data table based on the service processing requirement, to obtain a target non-primary key field, so that a secondary index table constructed based on the target non-primary key field with high distinction degree may be used for the service processing requirement for the target data table.
For example, the server may obtain a historical query frequency of non-primary key fields in the target data table based on the historical data query request, and the server may determine non-primary key fields with query frequencies above a preset frequency threshold as target non-primary key fields.
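A frequency-based selection of this kind might look like the sketch below. It assumes that each historical data query request has already been reduced to the list of fields used in its filter condition expression, and that frequency means the share of historical requests that filter on a field; the function name and the threshold semantics are illustrative assumptions.

```python
from collections import Counter

def select_target_non_pk_fields(historical_filter_fields, primary_key, min_frequency):
    """Return non-primary key fields whose share of historical queries exceeds min_frequency."""
    total = len(historical_filter_fields)
    if total == 0:
        return []
    # Count each field at most once per historical request.
    counts = Counter(
        field
        for fields in historical_filter_fields
        for field in set(fields)
        if field != primary_key
    )
    return sorted(f for f, c in counts.items() if c / total > min_frequency)

history = [["cid"], ["cid", "info"], ["id"], ["cid"]]
selected = select_target_non_pk_fields(history, primary_key="id", min_frequency=0.5)
```

Here cid appears in three of four historical requests and is selected, while info appears in only one and is not.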
The method for determining the target non-primary key field is an optional and implementable determination method, and in an actual application scenario, there may be a plurality of different determination methods, and may be different according to the actual application scenario, which is not specifically limited in the embodiment of the present disclosure.
Because the fields in the filter condition expression are usually highly discriminating fields such as identity information and device information, a secondary index table can be established in advance for the commonly used filter fields of the main table and combined with the main table in an association query or sub-query. In this way, the primary key index acceleration feature of the big data computing service platform is fully utilized, full table scanning is avoided, and query acceleration is realized.
In S414, the target non-primary key field is determined to be the primary key field of the secondary index table, and the primary key field in the target data table is determined to be the non-primary key field of the secondary index table, so as to obtain the secondary index table corresponding to the target data table.
In implementation, the target field may include multiple non-primary key fields of the target data table, and multiple secondary index tables may be constructed from the target field; that is, the server may combine multiple non-primary key fields of the target data table and determine the combination of non-primary key fields as the primary key field of a secondary index table.
For example, assume that the target data table includes field 1, field 2, and field 3, where field 1 is the primary key field of the target data table and field 2 and field 3 are non-primary key fields of the target data table. If the target non-primary key fields determined by the server include field 2 and field 3, secondary index table 1 may be constructed based on field 1 and field 2, secondary index table 2 may be constructed based on field 1 and field 3, and secondary index table 3 may be constructed based on field 1, field 2, and field 3. In secondary index table 1, field 2 is the primary key field and field 1 is a non-primary key field; in secondary index table 2, field 3 is the primary key field and field 1 is a non-primary key field; in secondary index table 3, field 2 and field 3 may be joint primary key fields and field 1 is a non-primary key field.
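The construction in S414 can be sketched as a planning step that lists which secondary index tables could be built. The sketch enumerates every non-empty combination of the target non-primary key fields, which reproduces the three tables of the example; enumerating all combinations is an assumption made for illustration (in practice only combinations that are actually queried would likely be built), and the function name and dictionary keys are hypothetical.

```python
from itertools import combinations

def plan_secondary_index_tables(primary_key, target_non_pk_fields):
    """List candidate secondary index tables as (primary key fields, common columns)."""
    plans = []
    for size in range(1, len(target_non_pk_fields) + 1):
        for combo in combinations(target_non_pk_fields, size):
            plans.append({
                # Non-primary key fields of the data table become the index primary key.
                "primary_key": combo,
                # The data table's primary key is kept as a common (non-primary key) column.
                "common_columns": (primary_key,),
            })
    return plans

plans = plan_secondary_index_tables("field1", ["field2", "field3"])
```

The number of combinations grows exponentially with the number of fields, which is one reason to screen fields first, as in S412.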
In S416, the plurality of non-primary key fields in the target data table are combined based on the leftmost prefix matching rule to obtain a plurality of field combinations.
A field combination may include one or more non-primary key fields of the target data table. The leftmost prefix matching rule means that, when a joint index containing a plurality of fields has been created for a data table and the data table is queried, the fields are matched from left to right in the order in which they are defined in the index until a non-equivalent query is encountered.
In practice, assuming that the non-primary key fields in the target data table contain field 1, field 2, and field 3, then combining these multiple non-primary key fields based on the leftmost prefix match rule, the resulting field combination may contain field combination 1: (field 1), field combination 2: (field 1, field 2) and field combination 3: (field 1, field 2, field 3).
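The field combinations of this example are simply the leftmost prefixes of the ordered field list, which can be sketched as follows (the function name is hypothetical, and the fields are assumed to be given in the order in which they would be matched).

```python
def leftmost_prefix_combinations(fields):
    """Return every leftmost prefix of the ordered field list as a tuple."""
    return [tuple(fields[:n]) for n in range(1, len(fields) + 1)]

combos = leftmost_prefix_combinations(["field1", "field2", "field3"])
```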
In S418, the field combination and the primary key field of the secondary index table are subjected to matching processing to obtain a matching result, and a target index table corresponding to the non-primary key field in the target data table in the secondary index table corresponding to the target data table is determined based on the matching result.
The primary key field of the secondary index table may be a non-primary key field in the target data table.
In an implementation, assuming that the field combination includes the field combination 1, the field combination 2, and the field combination 3, the secondary index table includes a secondary index table 1 and a secondary index table 2, where a primary key field of the secondary index table 1 is the field 1, and a primary key field of the secondary index table 2 includes the field 1 and the field 2, the server may obtain a matching degree between each field combination and each secondary index table, and determine the target index table based on the matching degree and the number of fields included in the field combination.
As described above, the degree of matching between field combination 2 and the primary key field of secondary index table 2 is 100%, and the degree of matching between field combination 1 and the primary key field of secondary index table 1 is also 100%. Since field combination 2 includes 2 fields while field combination 1 includes only 1 field, secondary index table 2 can be determined as the target index table.
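One possible way to compute the matching in S418 is sketched below. The source does not define the matching degree precisely, so the formula here (number of leading fields that agree, divided by the longer of the two field lists) and the tie-break on the number of fields in the combination are assumptions chosen to reproduce the example; the function names are hypothetical.

```python
def matching_degree(combo, index_pk):
    """Share of leading fields on which the combination and the index primary key agree."""
    matched = 0
    for a, b in zip(combo, index_pk):
        if a != b:
            break
        matched += 1
    return matched / max(len(combo), len(index_pk))

def choose_target_index_table(combos, index_tables):
    """Pick the index table with the highest (matching degree, combination size)."""
    best = None
    for name, index_pk in index_tables.items():
        for combo in combos:
            score = (matching_degree(combo, index_pk), len(combo))
            if score[0] > 0 and (best is None or score > best[0]):
                best = (score, name)
    return best[1] if best else None

combos = [("field1",), ("field1", "field2"), ("field1", "field2", "field3")]
index_tables = {
    "secondary_index_table_1": ("field1",),
    "secondary_index_table_2": ("field1", "field2"),
}
target_index_table = choose_target_index_table(combos, index_tables)
```

With these inputs, field combination 2 matches secondary index table 2 completely and contains more fields than field combination 1, so secondary index table 2 is chosen.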
In S420, the number of items of the data item corresponding to the data query statement in the target index table is determined based on the data query statement and the primary key field in the secondary index table.
In implementation, a part of the filtering condition expression of the data query statement, which can be directly searched through the target index table, can be extracted, that is, the server can determine the total number of rows that the filtering condition expression can hit in the target index table as the number of items of the data item corresponding to the data query statement in the target index table.
The server may perform the data query processing by means of sub-query or association query based on the relationship between the number of items and the preset number of items threshold, i.e., in the case where the number of items is not greater than the preset number of items threshold as shown in fig. 4, S422 may be continuously performed after S420, or in the case where the number of items is greater than the preset number of items threshold as shown in fig. 5, S424 may be continuously performed after S420.
The preset item number threshold may be determined according to the business processing requirement of the target data table.
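The choice between the two query manners in S420 can be sketched as a count followed by a comparison. The example uses an in-memory SQLite database, with the target index table named `idx` because `index` is a reserved word in SQLite; the function `choose_query_mode` and the threshold value are assumptions, and the table name and filter text are interpolated into the SQL only because they are trusted constants in this illustration.

```python
import sqlite3

def choose_query_mode(conn, index_table, index_filter, params, item_threshold):
    """Count the rows the filter hits in the index table and pick a query manner."""
    sql = f"SELECT COUNT(*) FROM {index_table} WHERE {index_filter}"
    hit_count = conn.execute(sql, params).fetchone()[0]
    # Not greater than the threshold: sub-query (S422); otherwise: association query (S424).
    mode = "subquery" if hit_count <= item_threshold else "join"
    return mode, hit_count

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE idx (cid INTEGER, id INTEGER, PRIMARY KEY (cid, id))")
conn.executemany("INSERT INTO idx VALUES (?, ?)", [(1, 1), (1, 3), (1, 4), (2, 2)])
small = choose_query_mode(conn, "idx", "cid = ?", (2,), item_threshold=2)
large = choose_query_mode(conn, "idx", "cid = ?", (1,), item_threshold=2)
```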
In S422, in the case that the number of items is not greater than the preset item number threshold, the target index table is queried in a sub-query manner based on the data query statement and the primary key field in the target index table to obtain a first query result, and the target data table is queried based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In practice, S422 may be processed in various manners. An alternative implementation is provided below; see steps A1 to A2:
Step A1: update the data query statement based on the data query statement, the primary key field in the target index table and the primary key field in the target data table to obtain a first data query statement.
The first data query statement may include a first sub-data query statement for performing query processing based on a primary key field in the target index table to obtain a first query result, and a second sub-query statement for performing query processing based on the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
Step A2: execute the first data query statement to obtain a data query result corresponding to the data query request.
In implementation, taking an SQL statement as the data query statement, the server can rewrite the SQL statement to realize the data query in a sub-query manner. For example, based on the original SQL statement, the server may generate a new SQL statement that expresses the following semantics: 1. the part of the filter condition expression of the original SQL statement that can be searched directly through the primary key field of the target index table is extracted as the filter condition expression of a sub-query SQL statement (i.e., the first sub-data query statement), so that the set of primary key field values of all rows of the target data table that satisfy the filter condition expression can be retrieved from the target index table by the sub-query SQL statement; 2. an IN operator that takes the primary key field name of the target data table as its left value and the sub-query SQL statement as its right value is spliced into the filter condition expression of the original SQL statement.
For example, assume that the original SQL statement is:
SELECT info
FROM target
WHERE cid=1.
The first data query statement obtained by updating the data query statement may be:
SELECT info
FROM target
WHERE id in(SELECT id
FROM index
WHERE cid=1).
Here, target is the target data table, index is the target index table, and the first sub-query statement is:
SELECT id
FROM index
WHERE cid=1.
Because the field cid used as the filtering condition in the original SQL statement is a non-primary key field of the target data table target, the server cannot use the index on the primary key field id to accelerate the query and would need to scan the whole target data table. By rewriting the data query statement with the target index table, the server turns the filtering condition on a non-primary key field of the target data table in the original SQL statement into filtering conditions on the primary key field cid of the target index table and the primary key field id of the target data table, thereby making full use of the index characteristics of the data tables, avoiding full table scanning, and accelerating the whole query process.
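The sub-query rewrite of steps A1 to A2 can be checked end to end with a small sketch. It runs on an in-memory SQLite database rather than the platform of the embodiment, names the target index table `idx` because `index` is a reserved word in SQLite, and uses a hypothetical helper `rewrite_as_subquery` that assembles the statement from trusted, hard-coded parts (it is not a general SQL rewriter).

```python
import sqlite3

def rewrite_as_subquery(select_list, data_table, data_pk, index_table, index_filter):
    """Build the first data query statement: filter the data table by the primary key
    values returned from a sub-query on the index table."""
    sub_query = f"SELECT {data_pk} FROM {index_table} WHERE {index_filter}"
    return f"SELECT {select_list} FROM {data_table} WHERE {data_pk} IN ({sub_query})"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, cid INTEGER, info TEXT)")
conn.execute("CREATE TABLE idx (cid INTEGER, id INTEGER, PRIMARY KEY (cid, id))")
conn.executemany("INSERT INTO target VALUES (?, ?, ?)", [(1, 1, "a"), (2, 2, "b"), (3, 1, "c")])
conn.execute("INSERT INTO idx SELECT cid, id FROM target")

original_sql = "SELECT info FROM target WHERE cid = 1"
first_sql = rewrite_as_subquery("info", "target", "id", "idx", "cid = 1")
original_rows = sorted(conn.execute(original_sql).fetchall())
rewritten_rows = sorted(conn.execute(first_sql).fetchall())
```

The rewritten statement returns the same rows as the original one, while both of its filters address a primary key field.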
In S424, in the case that the number of items is greater than the preset item number threshold, the target index table is queried in an association query manner based on the data query statement and the primary key field in the target index table to obtain a first query result, and the target data table is queried based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In practice, S424 may be processed in various manners. An alternative implementation is provided below; see steps B1 to B2:
Step B1: update the data query statement based on the data query statement, the primary key field in the target index table and the primary key field in the target data table to obtain a second data query statement.
In implementation, the server may add the table name of the target data table as a prefix before all field names of the original SQL statement, and add a JOIN of the target data table with the target index table, where the ON condition is: primary key field name of the target data table = corresponding field name in the target index table. The server may then extract the part of the original filter condition expression that can be searched directly through the target index table, add the table name of the target index table as a prefix before all field names in that part, and splice it into the original filter condition expression.
For example, assume that the original SQL statement is:
SELECT info
FROM target
WHERE cid=1.
The second data query statement obtained by updating the data query statement is:
SELECT target.info
FROM target
JOIN index
ON target.id=index.id
WHERE target.cid=1
AND index.cid=1.
In addition, the server may align the partitions of the target data table and the target index table by adding the condition "partition field name of the target data table = partition field name of the target index table" to the ON condition.
Step B2: execute the second data query statement, perform query processing on the target index table based on the primary key field in the target index table to obtain a first query result, and perform query processing on the data table obtained by associating the target index table with the target data table, based on the first query result and the primary key field in the target data table, to obtain a data query result corresponding to the data query request.
In implementation, the server submits the rewritten new SQL statement (i.e. the second data query statement) to the big data computing service platform for execution, and then the main key indexes of the main table (i.e. the target data table) and the secondary index table (i.e. the target index table) can be utilized simultaneously, so that full-table scanning is avoided, and query acceleration is realized.
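The association query rewrite of steps B1 to B2 can be sketched in the same way. The example again uses an in-memory SQLite database, names the target index table `idx` because `index` is a reserved word in SQLite, and handles only a single equality filter; the helper `rewrite_as_join` and its parameters are hypothetical, and its inputs are trusted constants.

```python
import sqlite3

def rewrite_as_join(select_fields, data_table, data_pk, index_table, filter_field, value_sql):
    """Build the second data query statement for a single equality filter."""
    # Prefix every selected field with the data table name.
    cols = ", ".join(f"{data_table}.{c}" for c in select_fields)
    return (
        f"SELECT {cols} FROM {data_table} "
        f"JOIN {index_table} ON {data_table}.{data_pk} = {index_table}.{data_pk} "
        f"WHERE {data_table}.{filter_field} = {value_sql} "
        f"AND {index_table}.{filter_field} = {value_sql}"
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, cid INTEGER, info TEXT)")
conn.execute("CREATE TABLE idx (cid INTEGER, id INTEGER, PRIMARY KEY (cid, id))")
conn.executemany("INSERT INTO target VALUES (?, ?, ?)", [(1, 1, "a"), (2, 2, "b"), (3, 1, "c")])
conn.execute("INSERT INTO idx SELECT cid, id FROM target")

second_sql = rewrite_as_join(["info"], "target", "id", "idx", "cid", "1")
original_rows = sorted(conn.execute("SELECT info FROM target WHERE cid = 1").fetchall())
joined_rows = sorted(conn.execute(second_sql).fetchall())
```

A partition alignment condition, if needed, would be appended to the ON clause in the same manner.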
Because the scheme only rewrites the query SQL to obtain the updated data query statement, the implementation is lightweight and the machine cost is low. The query speed can nevertheless be improved considerably, with an acceleration effect comparable to query optimization through a distributed query acceleration engine.
The embodiment of the specification provides a data processing method, which is used for receiving a data query request aiming at a target data table, carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request, acquiring a non-primary key field in the target data table contained in the target field under the condition that query cannot be carried out according to a primary key field in the target data table based on the data query statement, acquiring a target index table corresponding to the non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table, carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request. Therefore, the server can establish a secondary index table corresponding to the target data table through the non-primary key field of the target data table and combine the secondary index table to perform data query, thereby fully utilizing the index acceleration characteristic of the primary key field of the data table, avoiding the whole table scanning of the data table and improving the data query efficiency.
Example III
Based on the same concept as the data processing method provided above, the embodiment of the present disclosure further provides a data processing device, as shown in fig. 6.
The data processing apparatus includes: a request receiving module 601, a first acquiring module 602, a second acquiring module 603, and a result determining module 604, wherein:
the request receiving module 601 is configured to receive a data query request for a target data table, and perform field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request;
a first obtaining module 602, configured to obtain a non-primary key field in the target data table included in the target field, when it is determined that query cannot be performed according to a primary key field in the target data table based on the data query statement;
a second obtaining module 603, configured to obtain a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, where a primary key field of the secondary index table is a non-primary key field in the target data table;
the result determining module 604 is configured to perform query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and perform query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In the embodiment of the present disclosure, the result determining module 604 is configured to:
determining the number of items of the data item corresponding to the data query statement in the target index table based on the data query statement and a primary key field in the secondary index table;
under the condition that the number of items is not greater than a preset item number threshold, querying the target index table in a sub-query manner based on the data query statement and the primary key field in the target index table to obtain the first query result, and querying the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain the data query result corresponding to the data query request;
and under the condition that the number of items is greater than the preset item number threshold, querying the target index table in an association query manner based on the data query statement and the primary key field in the target index table to obtain the first query result, and querying the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain the data query result corresponding to the data query request.
In the embodiment of the present disclosure, the result determining module 604 is configured to:
updating the data query statement based on the data query statement, the primary key field in the target index table and the primary key field in the target data table to obtain a first data query statement, wherein the first data query statement comprises a first sub-data query statement for obtaining the first query result based on the primary key field in the target index table, and a second sub-query statement for obtaining the data query result corresponding to the data query request based on the first query result and the primary key field in the target data table;
and executing the first data query statement to obtain the data query result corresponding to the data query request.
In the embodiment of the present disclosure, the result determining module 604 is configured to:
updating the data query statement based on the data query statement, the primary key field in the target index table and the primary key field in the target data table to obtain a second data query statement;
executing the second data query statement, carrying out query processing on the target index table based on the primary key field in the target index table to obtain the first query result, and carrying out query processing on a data table obtained by associating the target index table with the target data table based on the first query result and the primary key field in the target data table to obtain the data query result corresponding to the data query request.
In an embodiment of the present disclosure, the apparatus further includes:
a third obtaining module, configured to obtain a non-primary key field in the target data table;
the field screening module is used for carrying out field screening processing on the non-primary key field in the target data table based on the service processing requirement corresponding to the target data table to obtain a target non-primary key field;
and the construction module is used for determining the target non-primary key field as the primary key field of the secondary index table, determining the primary key field in the target data table as the non-primary key field of the secondary index table, and obtaining the secondary index table corresponding to the target data table.
In the embodiment of the present specification, the request receiving module 601 is configured to:
Acquiring a filtering condition expression in the data query statement;
determining a preset relation operator with a field on one side and a non-field on the other side adjacent to the preset relation operator in the filtering condition expression as a target operator;
and determining the field corresponding to the target operator as the target field corresponding to the data query request.
In the embodiment of the present disclosure, the first obtaining module 602 is configured to:
based on a leftmost prefix matching rule, combining the plurality of non-primary key fields in the target data table to obtain a plurality of field combinations, wherein each field combination comprises one or more non-primary key fields of the target data table;
and carrying out matching processing on the field combination and the primary key field of the secondary index table to obtain a matching result, and determining a target index table corresponding to a non-primary key field in the target data table in the secondary index table corresponding to the target data table based on the matching result.
In an embodiment of the present disclosure, the apparatus further includes:
the first determining module is used for determining that query cannot be performed according to the primary key field in the target data table under the condition that the target field does not contain the primary key field in the target data table; or
the second determining module is used for determining that query cannot be performed according to the primary key field in the target data table under the condition that the filtering condition expression corresponding to the primary key field in the target data table in the data query statement is a preset condition expression, wherein the preset condition expression comprises but is not limited to a fuzzy search expression and a not-equal search expression.
The embodiment of the specification provides a data processing device, which receives a data query request for a target data table, performs field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request, obtains a non-primary key field in the target data table contained in the target field under the condition that query cannot be performed according to a primary key field in the target data table based on the data query statement, obtains a target index table corresponding to the non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table, performs query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and performs query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request. Therefore, the server can establish a secondary index table corresponding to the target data table through the non-primary key field of the target data table and combine the secondary index table to perform data query, thereby fully utilizing the index acceleration characteristic of the primary key field of the data table, avoiding the whole table scanning of the data table and improving the data query efficiency.
Example IV
Based on the same idea, the embodiment of the present disclosure further provides a data processing apparatus, as shown in Fig. 7.
The data processing apparatus may vary considerably in configuration or performance and may include one or more processors 701 and a memory 702, where the memory 702 may store one or more application programs or data. The memory 702 may be transient storage or persistent storage. The application programs stored in the memory 702 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the data processing apparatus. Still further, the processor 701 may be arranged to communicate with the memory 702 and to execute, on the data processing apparatus, the series of computer-executable instructions in the memory 702. The data processing apparatus may also include one or more power supplies 703, one or more wired or wireless network interfaces 704, one or more input/output interfaces 705, and one or more keyboards 706.
In particular, in this embodiment, the data processing apparatus includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer-executable instructions for the data processing apparatus, and the one or more programs configured to be executed by the one or more processors comprise instructions for:
Receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request;
acquiring a non-primary key field in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key field in the target data table based on the data query statement;
acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table;
and carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the data processing apparatus embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, see the corresponding description of the method embodiments.
The embodiment of the specification provides data processing equipment, which receives a data query request aiming at a target data table, performs field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request, obtains a non-primary key field in the target data table contained in the target field under the condition that query cannot be performed according to a primary key field in the target data table based on the data query statement, obtains a target index table corresponding to the non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table, performs query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and performs query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request. Therefore, the server can establish a secondary index table corresponding to the target data table through the non-primary key field of the target data table and combine the secondary index table to perform data query, thereby fully utilizing the index acceleration characteristic of the primary key field of the data table, avoiding the whole table scanning of the data table and improving the data query efficiency.
Example V
The embodiments of the present disclosure further provide a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the foregoing data processing method embodiments and achieves the same technical effects; to avoid repetition, the details are not described again here. The computer-readable storage medium may be, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
The embodiment of the specification provides a computer readable storage medium, which receives a data query request for a target data table, performs field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request, obtains a non-primary key field in the target data table contained in the target field when determining that query cannot be performed according to a primary key field in the target data table based on the data query statement, obtains a target index table corresponding to the non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table, performs query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and performs query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request. Therefore, the server can establish a secondary index table corresponding to the target data table through the non-primary key field of the target data table and combine the secondary index table to perform data query, thereby fully utilizing the index acceleration characteristic of the primary key field of the data table, avoiding the whole table scanning of the data table and improving the data query efficiency.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to a circuit structure such as a diode, a transistor or a switch) or a software improvement (an improvement to a method flow). With the development of technology, however, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (PLD), such as a field programmable gate array (FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logical method flow can be readily obtained simply by lightly logic-programming the method flow in one of the hardware description languages described above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, or the form of logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functionality in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing various functions may also be regarded as structures within the hardware component. Alternatively, the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided by function into various units. Of course, when implementing one or more embodiments of the present description, the functions of the units may be implemented in one or more pieces of software and/or hardware.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present description are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include a volatile memory in a computer-readable medium, such as a random access memory (RAM), and/or a non-volatile memory, such as a read-only memory (ROM) or a flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article or apparatus that comprises the element.
One or more embodiments of the present specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the present description may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, see the corresponding description of the method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.
Claims (10)
1. A data processing method, comprising:
receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request;
acquiring a non-primary key field in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key field in the target data table based on the data query statement;
acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table;
and carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
2. The method of claim 1, wherein the querying the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and the querying the target data table based on the data query statement, the first query result, and the primary key field in the target data table to obtain a data query result corresponding to the data query request, comprises:
determining the number of data items in the target index table that correspond to the data query statement, based on the data query statement and the primary key field in the secondary index table;
in a case where the number of items is not greater than a preset item-number threshold, performing query processing on the target index table by means of a sub-query, based on the data query statement and the primary key field in the target index table, to obtain the first query result, and performing query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain the data query result corresponding to the data query request;
and in a case where the number of items is greater than the preset item-number threshold, performing query processing on the target index table by means of an association query, based on the data query statement and the primary key field in the target index table, to obtain the first query result, and performing query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain the data query result corresponding to the data query request.
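As a rough illustration of the threshold-based choice recited in claim 2, the following sketch counts the matching entries in the index table and then runs either a sub-query form or a join ("association query") form of the rewritten statement. The threshold value, the table and column names, and the use of SQLite are all assumptions for illustration; the claims do not fix any of them.

```python
import sqlite3

ITEM_THRESHOLD = 2  # hypothetical preset item-number threshold


def build_conn():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
        CREATE TABLE orders_idx_user (user_id INTEGER, order_id INTEGER,
                                      PRIMARY KEY (user_id, order_id)) WITHOUT ROWID;
    """)
    rows = [(1, 10, 5.0), (2, 20, 7.5), (3, 10, 1.25), (4, 10, 9.0)]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.executemany("INSERT INTO orders_idx_user VALUES (?, ?)",
                     [(u, o) for o, u, _ in rows])
    return conn


# Sub-query form: the inner statement yields the first query result.
SUBQUERY_SQL = """
    SELECT o.order_id, o.amount FROM orders AS o
    WHERE o.order_id IN (SELECT order_id FROM orders_idx_user WHERE user_id = ?)
    ORDER BY o.order_id
"""
# Association (join) form: the index table is joined to the target table.
JOIN_SQL = """
    SELECT o.order_id, o.amount FROM orders_idx_user AS i
    JOIN orders AS o ON o.order_id = i.order_id
    WHERE i.user_id = ?
    ORDER BY o.order_id
"""


def query(conn, user_id, threshold=ITEM_THRESHOLD):
    # Count the data items in the index table that match the statement.
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM orders_idx_user WHERE user_id = ?",
        (user_id,)).fetchone()
    mode = "subquery" if count <= threshold else "join"
    sql = SUBQUERY_SQL if mode == "subquery" else JOIN_SQL
    return mode, conn.execute(sql, (user_id,)).fetchall()
```

With the sample data, a user with one matching entry is served by the sub-query form and a user with three matching entries by the join form; which form is actually faster depends on the database engine, so the threshold would need to be tuned in practice.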
3. The method of claim 2, wherein the querying, by means of sub-querying, the target index table based on the data query statement and the primary key field in the target index table to obtain the first query result, and the querying, based on the data query statement, the first query result and the primary key field in the target data table, the target data table to obtain the data query result corresponding to the data query request, includes:
updating the data query statement based on the data query statement, the primary key field in the target index table and the primary key field in the target data table to obtain a first data query statement, wherein the first data query statement comprises a first sub-query statement for obtaining the first query result based on the primary key field in the target index table, and a second sub-query statement for obtaining the data query result corresponding to the data query request based on the first query result and the primary key field in the target data table;
and executing the first data query statement to obtain the data query result corresponding to the data query request.
4. The method of claim 2, wherein the querying, by means of the association query, the target index table based on the data query statement and the primary key field in the target index table to obtain the first query result, and the querying, based on the data query statement, the first query result, and the primary key field in the target data table, the target data table to obtain the data query result corresponding to the data query request, includes:
updating the data query statement based on the data query statement, the primary key field in the target index table and the primary key field in the target data table to obtain a second data query statement;
executing the second data query statement, performing query processing on the target index table based on the primary key field in the target index table to obtain the first query result, and performing query processing, based on the first query result and the primary key field in the target data table, on a data table obtained by associating the target index table with the target data table, to obtain the data query result corresponding to the data query request.
5. The method of claim 1, further comprising, prior to the obtaining the target index table corresponding to the non-primary key field in the target data table in the secondary index table corresponding to the target data table:
acquiring a non-primary key field in the target data table;
performing field screening processing on non-primary key fields in the target data table based on service processing requirements corresponding to the target data table to obtain target non-primary key fields;
and determining the target non-primary key field as the primary key field of the secondary index table, and determining the primary key field in the target data table as the non-primary key field of the secondary index table to obtain a secondary index table corresponding to the target data table.
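The construction in claim 5 can be pictured as a swap of key roles between the two tables. The sketch below models only the schema, not the data; the `TableSchema` type, the naming scheme for the index table, and the representation of the service processing requirement as a set of field names are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSchema:
    name: str
    primary_key: tuple   # ordered primary key fields
    other_fields: tuple  # non-primary key fields


def build_secondary_index(target: TableSchema, required_fields: set) -> TableSchema:
    # Field screening: keep only the non-primary key fields that the
    # (assumed) service processing requirement asks to be queryable.
    selected = tuple(f for f in target.other_fields if f in required_fields)
    if not selected:
        raise ValueError("no non-primary key field matches the requirement")
    # Swap roles: selected fields become the primary key of the index table,
    # and the primary key of the target table becomes its non-primary key fields.
    return TableSchema(
        name=f"{target.name}_idx_{'_'.join(selected)}",
        primary_key=selected,
        other_fields=target.primary_key,
    )


orders = TableSchema("orders", ("order_id",), ("user_id", "status", "amount"))
index = build_secondary_index(orders, {"user_id", "status"})
```

Here `index` describes a table keyed by `user_id` and `status` that carries `order_id` as payload, which is what lets a later lookup hop from the index table back to the target data table by primary key.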
6. The method of claim 5, wherein the performing field extraction processing on the data query statement carried in the data query request to obtain the target field corresponding to the data query request includes:
acquiring a filtering condition expression in the data query statement;
determining, as a target operator, a preset relational operator in the filtering condition expression that is adjacent to a field on one side and to a non-field on the other side;
and determining the field corresponding to the target operator as the target field corresponding to the data query request.
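One way to read the field extraction of claim 6 is as a scan over the tokens of the filtering condition expression. The following sketch is a simplification under stated assumptions: the set of preset relational operators and the regular-expression tokenizer are hypothetical, and qualified names such as `o.user_id`, function calls and nested expressions are not handled.

```python
import re

# Assumed set of preset relational operators.
RELATIONAL_OPERATORS = {"=", "<", ">", "<=", ">=", "!=", "<>"}
# Tokens: multi-character operators first, then single-character operators,
# quoted strings, numbers, identifiers/keywords, and any other symbol.
TOKEN = re.compile(r"\s*(<=|>=|!=|<>|=|<|>|'[^']*'|\d+(?:\.\d+)?|\w+|\S)")


def extract_target_fields(filter_expression, table_fields):
    tokens = TOKEN.findall(filter_expression)
    targets = []
    for i, tok in enumerate(tokens):
        if tok not in RELATIONAL_OPERATORS or i == 0 or i == len(tokens) - 1:
            continue
        left, right = tokens[i - 1], tokens[i + 1]
        left_is_field = left in table_fields
        right_is_field = right in table_fields
        # Target operator: a field on exactly one side, a non-field on the other.
        if left_is_field != right_is_field:
            field = left if left_is_field else right
            if field not in targets:
                targets.append(field)
    return targets
```

For example, in `user_id = 10 AND order_id = user_id` only the first `=` qualifies as a target operator, because the second one has a field on both sides.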
7. The method of claim 6, wherein the target field includes a plurality of non-primary key fields in the target data table, and the obtaining a target index table corresponding to the non-primary key fields in the target data table in a secondary index table corresponding to the target data table comprises:
based on a leftmost prefix matching rule, combining the plurality of non-primary key fields in the target data table to obtain a plurality of field combinations, wherein each field combination comprises one or more non-primary key fields in the target data table;
and carrying out matching processing on the field combination and the primary key field of the secondary index table to obtain a matching result, and determining a target index table corresponding to a non-primary key field in the target data table in the secondary index table corresponding to the target data table based on the matching result.
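The leftmost prefix matching of claim 7 might be sketched as follows. Instead of enumerating field combinations explicitly, this version walks the ordered primary key of each candidate index table and stops at the first field that is not among the target fields, then picks the index table with the longest matched prefix. That selection rule and the index table names are assumptions; the claim states only that the target index table is determined from the matching result.

```python
def matched_prefix(index_key, target_fields):
    # Longest leftmost run of index key fields that all appear in the target fields.
    prefix = []
    for field in index_key:
        if field not in target_fields:
            break
        prefix.append(field)
    return tuple(prefix)


def choose_index_table(index_tables, target_fields):
    # index_tables maps a (hypothetical) index table name to its ordered primary key.
    best_name, best_prefix = None, ()
    for name, key in index_tables.items():
        prefix = matched_prefix(key, target_fields)
        if len(prefix) > len(best_prefix):
            best_name, best_prefix = name, prefix
    return best_name, best_prefix


index_tables = {
    "idx_user_status": ("user_id", "status"),
    "idx_status": ("status",),
    "idx_city_user": ("city", "user_id"),
}
```

Note that `idx_city_user` never matches a query that filters only on `user_id`, because `user_id` is not the leftmost field of its key.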
8. The method of claim 1, further comprising, prior to the obtaining a non-primary key field in the target data table contained in the target field, in a case where it is determined that a query cannot be made from a primary key field in the target data table based on the data query statement:
determining that a query cannot be performed according to the primary key field in the target data table in a case where the target field does not contain the primary key field in the target data table; or
in a case where a filtering condition expression corresponding to the primary key field in the target data table in the data query statement is a preset condition expression, determining that a query cannot be performed according to the primary key field in the target data table, wherein the preset condition expression includes, but is not limited to, a fuzzy search expression and a not-equal search expression.
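The two conditions of claim 8 can be approximated with a small check on the filter text. This is a deliberately simplified sketch: it uses regular expressions rather than a SQL parser, assumes the fuzzy search expression is `LIKE` / `NOT LIKE` and the not-equal expression is `!=` / `<>`, and tests only whether the primary key name appears in the filter rather than running the full field extraction described earlier.

```python
import re


def can_query_by_primary_key(where_clause, primary_key):
    pk = re.escape(primary_key)
    # Condition 1: the primary key field does not appear in the filter at all.
    if not re.search(rf"\b{pk}\b", where_clause):
        return False
    # Condition 2: the primary key is filtered by a fuzzy or not-equal expression.
    blocked = re.search(
        rf"\b{pk}\b\s*(?:!=|<>|NOT\s+LIKE\b|LIKE\b)",
        where_clause, re.IGNORECASE)
    return blocked is None
```

When the function returns `False`, the flow would fall back to the secondary index table path described in claim 1.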
9. A data processing apparatus comprising:
the request receiving module is used for receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request;
the first acquisition module is used for acquiring non-primary key fields in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key fields in the target data table based on the data query statement;
the second acquisition module is used for acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table;
and the result determining module is used for carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
10. A data processing apparatus, the data processing apparatus comprising:
a processor; and
a memory arranged to store computer executable instructions that, when executed, cause the processor to:
receiving a data query request aiming at a target data table, and carrying out field extraction processing on a data query statement carried in the data query request to obtain a target field corresponding to the data query request;
acquiring a non-primary key field in the target data table contained in the target field under the condition that the query cannot be performed according to the primary key field in the target data table based on the data query statement;
acquiring a target index table corresponding to a non-primary key field in the target data table in a secondary index table corresponding to the target data table, wherein the primary key field of the secondary index table is the non-primary key field in the target data table;
and carrying out query processing on the target index table based on the data query statement and the primary key field in the target index table to obtain a first query result, and carrying out query processing on the target data table based on the data query statement, the first query result and the primary key field in the target data table to obtain a data query result corresponding to the data query request.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310617476.9A CN116628010A (en) | 2023-05-26 | 2023-05-26 | Data processing method, device and equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310617476.9A CN116628010A (en) | 2023-05-26 | 2023-05-26 | Data processing method, device and equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116628010A (en) | 2023-08-22 |
Family
ID=87636399
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310617476.9A Pending CN116628010A (en) | 2023-05-26 | 2023-05-26 | Data processing method, device and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116628010A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117521150A (en) * | 2024-01-04 | 2024-02-06 | 极术(杭州)科技有限公司 | Data collaborative processing method based on multiparty security calculation |
CN117521150B (en) * | 2024-01-04 | 2024-04-09 | 极术(杭州)科技有限公司 | Data collaborative processing method based on multiparty security calculation |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107038207B (en) | | Data query method, data processing method and device |
CN109800222B (en) | | HBase secondary index self-adaptive optimization method and system |
CN105975617A (en) | | Multi-partition-table inquiring and processing method and device |
US11188552B2 (en) | | Executing conditions with negation operators in analytical databases |
CN111382155B (en) | | Data processing method of data warehouse, electronic equipment and medium |
CN110399359B (en) | | Data backtracking method, device and equipment |
CN116628010A (en) | | Data processing method, device and equipment |
US10372736B2 (en) | | Generating and implementing local search engines over large databases |
US20240256613A1 (en) | | Data processing method and apparatus, readable storage medium, and electronic device |
CN116521705A (en) | | Data query method and device, storage medium and electronic equipment |
CN110580255A (en) | | Method and system for storing and retrieving data |
CN107451204B (en) | | Data query method, device and equipment |
CN110245137B (en) | | Index processing method, device and equipment |
CN117252183B (en) | | Semantic-based multi-source table automatic matching method, device and storage medium |
CN111125216B (en) | | Method and device for importing data into Phoenix |
CN116644090B (en) | | Data query method, device, equipment and medium |
CN110083602B (en) | | Method and device for data storage and data processing based on hive table |
CN111666278B (en) | | Data storage method, data retrieval method, electronic device and storage medium |
CN110245136B (en) | | Data retrieval method, device, equipment and storage equipment |
CN109697234B (en) | | Multi-attribute information query method, device, server and medium for entity |
CN116521733A (en) | | Data query method and device |
US20200218748A1 (en) | | Multigram index for database query |
CN116662367A (en) | | Analysis method, storage medium and processor for data blood edges |
CN116010419A (en) | | Method and device for creating unique index and optimizing logic deletion |
CN113468529B (en) | | Data searching method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||