CN106599095A - Pruning method based on complete historical record - Google Patents
- Publication number
- CN106599095A
- Authority
- CN
- China
- Prior art keywords
- query
- complete historical
- historical
- complete
- result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a pruning method based on a complete historical record. The method comprises a first step that a client sends a query request, and a server receives the query request; a second step that the server parses the query request, and decomposes a query statement into small steps to execute; a third step that a query process is executed according to small query steps, so as to obtain query middle results, and corresponding pruning operations are performed on the middle results, wherein the pruning operations comprise a simple pruning operation and a pruning operation based on the complete historical record; and a fourth step that a post-pruning result and all historical results are simultaneously added to a new historical record table so as to be passed to the next small query step to continue pruning. Compared with the prior art, the method has the advantages that useless middle results are removed as early as possible according to the complete historical record, the characteristic of a high performance network (RDMA) is taken into account fully, and communication overhead is reduced. Compared with a traditional one-step pruning method, the pruning method prevents a final result combination operation with high overhead, thereby greatly improving the performance of a search system.
Description
Technical field
The present invention relates to a pruning method for graph queries, and more particularly to a pruning method based on a complete historical record.
Background technology
Graph-structured data is increasingly common in large-scale web applications. Massive data sets in particular exhibit free-form and rich associations, and strongly connected graph data is widely used across all kinds of web applications. For example, some commercial search engines, including Google and Bing, use RDF (Resource Description Framework) to represent the content of web pages explicitly. For web applications that process massive graph data, the execution speed of users' online queries is critical, and pruning candidate results is one of the important means of reducing latency. An efficient pruning method rejects incorrect results earlier, reduces communication overhead, and improves the overall performance of the query system.
Remote direct memory access (RDMA) is a high-performance network communication technology that accesses remote memory addresses directly, through both direct read and direct write operations. Because RDMA can completely bypass the CPU of the target machine and needs no assistance from it, it shows low latency and high throughput, a large advantage over traditional network communication. One notable property of RDMA is that, within a certain range of transfer sizes, its latency stays low and essentially constant, because small amounts of data do not occupy much network bandwidth.
When a system executes a user's query request, it usually produces many useless intermediate results. If these results are kept until the very end before being rejected, resources are heavily wasted and communication overhead is large, so existing systems generally use some specific pruning method to reject useless intermediate results. Existing RDF query systems usually combine single-step pruning with a final result merge operation to obtain the results the user needs. In this approach each step carries only the results of the previous step, so useless results cannot be rejected completely, which brings extra communication overhead. In addition, because each step still contains useless results, all results must be gathered on one machine at the end of execution for a merge operation, and this process easily becomes the performance bottleneck of the whole system.
Therefore, how to design an efficient pruning method that rejects useless results as early as possible, reduces communication overhead, avoids the time-consuming final result merge as far as possible, and thereby improves the overall performance of the distributed query system and speeds up user queries has become a technical problem that those skilled in the art urgently need to solve.
Summary of the invention
In view of the defects in the prior art, the object of the present invention is to provide a pruning method based on a complete historical record that makes full use of the characteristics of a high-performance network, rejects useless intermediate results as early as possible, avoids the time-consuming final merge operation, and reduces the latency of user query requests.
A pruning method based on a complete historical record provided according to the present invention comprises:
Step 1: a client sends a query request, and a server receives the query request;
Step 2: the server parses the query request and decomposes the query statements in the query request into multiple steps for execution, wherein each of the multiple steps is denoted a small step;
Step 3: the query process is executed small step by small step to obtain intermediate query results, and a pruning operation is performed on the intermediate results to obtain the pruned results;
Step 4: the pruned results and all historical results are added together to a new historical record table, and the new historical record table is passed to the next small step of the query for further pruning.
Preferably, step 1 comprises: the client selects a server and sends the query request to it; the server listens for query requests, initializes the query-related data, and clears the historical record table in preparation for executing the query process.
Preferably, step 2 comprises: after receiving the query request, the server parses it; the query request comprises multiple query statements, and the server decomposes them into multiple small steps for execution according to the different query statements.
Preferably, step 3 comprises:
Step 3.1: according to the query statement of the small step, a matching operation is performed on the data in the data set. If the complete historical record contains no entries, data matching starts from the constant in the query statement to obtain the intermediate results; if the complete historical record is not empty, the matching operation on the data set starts from the value of the variable that appears in both the query statement and the historical record to obtain the intermediate results;
Step 3.2: different pruning operations are performed on the intermediate results obtained by the small step according to the following cases:
- when the query interface corresponding to the newly added intermediate results does not exist in the complete historical record, pruning is performed according to the constant in the small step, and intermediate results that do not satisfy the constant condition are rejected;
- when the query interface corresponding to the newly added intermediate results already has a record in the complete historical record, pruning is performed according to whether the variable value in the historical record matches the variable value in the intermediate result. Specifically, for each record in the complete historical record corresponding to the newly added intermediate results, the variable value in the historical record is compared for equality with the corresponding variable value in the newly added intermediate result; the record is rejected if they are not equal and retained otherwise;
wherein the query interface refers to an unknown in the query statement whose corresponding value must be returned in the query result.
Preferably, step 4 comprises:
Step 4.1: the pruned results are added to the complete historical record table; a new column is added to the table to represent the newly added query interface, and the number of entries in the table increases or decreases accordingly;
Step 4.2: the complete historical record table follows the query process to the query statement of the next small step for the next round of pruning, and one of the following transfer operations is performed depending on where the data involved in the next small step resides:
- when the data involved in executing the next small step is on the local machine, the complete historical record table is passed locally and no network transfer is involved;
- when the data involved in executing the next small step is on a remote machine, the complete historical record table is sent along with the query request to the remote server, where execution continues.
Compared with the prior art, the present invention has the following beneficial effects:
1. The pruning method based on a complete historical record proposed by the present invention rejects useless intermediate results as early as possible according to the complete historical record and reduces communication overhead. Compared with the traditional single-step pruning method, it avoids the costly final result merge operation and therefore shows a considerable performance advantage.
2. The present invention takes full account of the characteristics of a high-performance network (RDMA) and uses them to reduce the communication overhead of transferring the complete historical record as far as possible, so that communication latency stays at a low level and the high network bandwidth is fully used.
3. The pruning method based on a complete historical record proposed by the present invention is widely applicable to distributed query systems. It makes full use of limited resources, reduces resource waste, reduces the latency of query requests as far as possible, and improves the performance of the whole query system.
Description of the drawings
Other features, objects and advantages of the present invention will become more apparent from the detailed description of non-limiting embodiments made with reference to the following drawing:
Fig. 1 is a flow chart of the pruning method based on a complete historical record used by the present invention.
Specific embodiment
The present invention is described in detail below with reference to specific embodiments. The following embodiments will help those skilled in the art to further understand the present invention, but do not limit the present invention in any form. It should be noted that those of ordinary skill in the art can make several changes and improvements without departing from the concept of the present invention, all of which fall within the protection scope of the present invention.
A pruning method based on a complete historical record provided according to the present invention comprises the following steps:
Step 0: multiple servers load the initial data in parallel, distribute the data, and perform some initialization operations;
Step 1: a server receives the query request sent by a client;
Step 2: the server parses the query request and decomposes the query statements into several small steps (usually no more than 15) for execution;
Step 3: the query process is executed small step by small step to obtain intermediate query results, and the corresponding pruning operation is performed on the intermediate results;
Step 4: the pruned results and all historical results are added together to a new historical record table, which is passed to the next small step of the query for further pruning.
Step 1 comprises: the client selects a lightly loaded server (judged by the number of requests the server is currently executing) and sends the query request to it; the server listens for requests, initializes the query-related data, and clears the historical record table in preparation for executing the query.
Step 2 comprises: after receiving the query request, the server parses it. A query request generally consists of multiple query statements, and the server decomposes them into multiple small steps for execution according to the different query statements. In the RDF data format a query request usually consists of multiple triples, so the small steps are divided by triple here.
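The division into small steps by triple can be sketched as follows. This is a minimal illustration, not the patented implementation: the textual query form (whitespace-separated terms, patterns separated by " . ", variables prefixed with "?") and the names `decompose` and `is_variable` are assumptions made for the example.

```python
from typing import List, Tuple

# One small step is one triple pattern: (subject, predicate, object).
Triple = Tuple[str, str, str]


def is_variable(term: str) -> bool:
    """Assumed convention: unknowns (query interfaces) start with '?'."""
    return term.startswith("?")


def decompose(query: str) -> List[Triple]:
    """Split a query body into one small step per triple pattern."""
    steps: List[Triple] = []
    for part in query.split(" . "):
        part = part.strip().rstrip(".").strip()
        if not part:
            continue
        subject, predicate, obj = part.split()
        steps.append((subject, predicate, obj))
    return steps


steps = decompose("?x teaches ?c . ?s takes ?c . ?s advisor ?x")
interfaces = sorted({t for step in steps for t in step if is_variable(t)})
```

Here `steps` holds three small steps and `interfaces` lists the three unknowns whose values the query must return.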
Step 3 comprises:
Step 3.1: according to the statement of the small step, a matching operation is performed on the data in the data set. If the complete historical record contains no entries, which is generally the case when the first small step is executed, data matching simply starts from the constant in the query statement to obtain the intermediate results. If the complete historical record is not empty, the matching operation on the data set starts from the value of the variable that appears in both the query statement and the historical record to obtain the intermediate results; here an intermediate result is the value at the other end of the triple relative to that variable;
Step 3.2: different pruning operations are performed on the intermediate results obtained by the small step according to the following cases:
- when the query interface corresponding to the newly added results does not exist in the complete historical record, simple pruning is performed according to the constant in the small step, that is, by judging whether the intermediate results obtained in step 3.1 are equal to the constant; results that do not satisfy the constant condition are rejected;
- when the query interface corresponding to the newly added results already has a record in the complete historical record, pruning is performed according to whether the variable value in the historical record matches the new value. Specifically, for each record in the complete historical record corresponding to the newly produced results, the value of the variable in the historical record is compared for equality with the value of the variable in the new result, and the record is rejected if they are not equal;
wherein the query interface refers to an unknown in the query statement whose corresponding value the query system must return in the query result.
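The matching and pruning of step 3 can be illustrated with a small in-memory sketch. Everything here is an assumption for illustration: the data set is a plain list of triples scanned linearly (a real system would use indexes and numeric IDs), the history is a list of dictionaries, `None` marks "no history yet", and the function name `run_small_step` and the sample data are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Row = Dict[str, str]  # one history entry: query interface -> value


def is_var(term: str) -> bool:
    return term.startswith("?")


def run_small_step(step: Triple, history: Optional[List[Row]],
                   data: List[Triple]) -> List[Row]:
    """Execute one small step and return the new complete history."""
    s, p, o = step
    # No history yet (first small step): seed with one empty row so that
    # matching is driven by the constants of the statement alone.
    rows = [{}] if history is None else history
    new_history: List[Row] = []
    for row in rows:
        for ds, dp, do in data:
            if dp != p:
                continue
            candidate = dict(row)
            keep = True
            for term, value in ((s, ds), (o, do)):
                if is_var(term) and term not in candidate:
                    candidate[term] = value   # new query interface: new column
                elif candidate.get(term, term) != value:
                    keep = False              # constant or history mismatch: prune
            if keep:
                new_history.append(candidate)
    return new_history


data = [
    ("alice", "teaches", "db"), ("bob", "teaches", "os"),
    ("carol", "takes", "db"), ("dave", "takes", "os"),
    ("carol", "advisor", "alice"), ("dave", "advisor", "alice"),
]
steps = [("?x", "teaches", "?c"), ("?s", "takes", "?c"), ("?s", "advisor", "?x")]

history = None
sizes = []
for step in steps:
    history = run_small_step(step, history, data)
    sizes.append(len(history))
```

In the third small step both `?s` and `?x` already have columns in the history, so the row built from bob, os and dave is rejected as soon as its `?x` value fails to equal the advisor found in the data, and only the consistent row survives.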
Step 4 comprises:
Step 4.1: the new pruned results are added to the complete historical record table. A new column must be added to the table to represent the newly added query interface, and the number of entries in the table increases or decreases accordingly (pruning may reduce the number of intermediate results); here the number of entries means the number of rows in the complete historical record table;
Step 4.2: after the new results have been added to the complete historical record table, the complete historical record follows the query process to the query statement of the next small step for the next round of pruning, and a different transfer operation is performed depending on where the data involved in the next small step resides:
- when the data involved in executing the next small step is on the local machine, the complete historical record only needs to be passed locally and no network transfer is involved;
- when the data involved in executing the next small step is on a remote machine, the complete historical record must be sent along with the query sub-request to the remote server, where execution continues;
More specifically, the transfer of the complete historical record in this step exploits the characteristics of a high-performance network (RDMA). One outstanding feature of RDMA communication is that when the transferred data is small (for example, less than 2000 bytes), transfer latency stays low and essentially constant. The present invention uses this property to transfer the complete historical record, whose data volume is small, and thereby achieves high transfer efficiency and low transfer latency. The record stays small because the number of query steps and query interfaces in an RDF query is usually small and historical records are generally converted into numeric IDs. Moreover, if the next small step executes locally, the complete historical record is simply passed locally, which avoids communication altogether.
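A back-of-the-envelope sketch of why the record tends to fit in a small transfer is given below. The 8-byte ID width, the function names, and the "remote-small" and "remote-large" labels are assumptions for illustration only; the 2000-byte figure is the example threshold quoted above, and the text does not say what a system should do when the record is larger.

```python
ID_BYTES = 8                 # assumption: each value is stored as a 64-bit numeric ID
SMALL_TRANSFER_LIMIT = 2000  # bytes, the example threshold quoted in the text


def history_payload_bytes(num_rows: int, num_columns: int,
                          id_bytes: int = ID_BYTES) -> int:
    """Size of the history table if every cell is one fixed-width ID."""
    return num_rows * num_columns * id_bytes


def plan_transfer(num_rows: int, num_columns: int, data_is_local: bool) -> str:
    """Choose how the history reaches the next small step."""
    if data_is_local:
        return "local"  # passed in memory, no network transfer
    size = history_payload_bytes(num_rows, num_columns)
    return "remote-small" if size < SMALL_TRANSFER_LIMIT else "remote-large"


example_size = history_payload_bytes(50, 3)        # 50 rows, 3 query interfaces
example_plan = plan_transfer(50, 3, data_is_local=False)
```

With 50 rows and 3 query interfaces the table occupies 1200 bytes under these assumptions, which is below the quoted threshold.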
To implement the pruning method based on a complete historical record, the present invention stores the complete historical record in a dynamic table structure, denoted the complete historical record table. The table consists of columns and rows: columns represent the query interfaces contained in the user's query request, and rows store the entries of historical results. The table changes dynamically during the query, that is, its numbers of rows and columns may increase or decrease, because the query process may add new results (for example, when a small step is executed) or prune results (when the query statements contain a cycle). Storing the complete historical record in such a dynamic table structure is concise and convenient, and keeps insertion and deletion of rows and columns efficient.
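One way to picture such a dynamic table is the sketch below. The class name `HistoryTable`, its methods, and the callback-based interface are hypothetical; the patent does not specify a concrete layout, and a production system would likely use a more compact representation.

```python
from typing import Callable, Dict, List


class HistoryTable:
    """Columns are query interfaces; rows are history entries of numeric IDs."""

    def __init__(self) -> None:
        self.columns: List[str] = []
        self.rows: List[List[int]] = []

    def add_column(self, name: str,
                   expand: Callable[[Dict[str, int]], List[int]]) -> None:
        """Add a query interface; expand(row) lists the new values for a row.

        A row with several values is duplicated, a row with none disappears,
        so the row count can grow or shrink.
        """
        base = self.rows if self.columns else [[]]  # seed only a brand-new table
        new_rows: List[List[int]] = []
        for row in base:
            binding = dict(zip(self.columns, row))
            for value in expand(binding):
                new_rows.append(row + [value])
        self.columns.append(name)
        self.rows = new_rows

    def prune(self, keep: Callable[[Dict[str, int]], bool]) -> None:
        """Drop every row for which keep(row) is false."""
        self.rows = [r for r in self.rows if keep(dict(zip(self.columns, r)))]


table = HistoryTable()
table.add_column("?x", lambda row: [1, 2])
table.add_column("?c", lambda row: [10] if row["?x"] == 1 else [20, 21])
rows_after_growth = [list(r) for r in table.rows]
table.prune(lambda row: row["?c"] != 21)
```

The second `add_column` call grows the table from two rows to three, and `prune` then removes one of them, which mirrors the growing and shrinking described above.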
The present invention uses the pruning method based on a complete historical record rather than the traditional single-step pruning method, mainly because the traditional pruning method often incurs large overhead. The traditional single-step pruning method has the following problems:
(1) Higher communication overhead: with single-step pruning, redundant useless intermediate results often survive until the end, where a large number of them are rejected, so communication is wasted;
(2) A time-consuming final result merge: because single-step pruning can judge only by the results of the previous step, it cannot reject all useless results. After the query statements have been executed, some results still fail the final requirements, so all results must be gathered on one machine for a final merge, which is likely to become the performance bottleneck of the whole system.
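The difference can be seen on a tiny cyclic query. The sketch below is one possible reading of "single-step pruning", in which each step is filtered only against the values produced by the step immediately before it; that model, the sample relations, and all variable names are assumptions for illustration and are not taken from the patent.

```python
# Query with a cycle: ?x teaches ?c . ?s takes ?c . ?s advisor ?x
teaches = [("alice", "db"), ("bob", "os")]        # (?x, ?c)
takes = [("carol", "db"), ("dave", "os")]         # (?s, ?c)
advisor = [("carol", "alice"), ("dave", "alice")] # (?s, ?x)

# Single-step pruning (assumed model): each step sees only the previous step.
step1 = teaches
prev_c = {c for _, c in step1}
step2 = [(s, c) for s, c in takes if c in prev_c]
prev_s = {s for s, _ in step2}
step3 = [(s, x) for s, x in advisor if s in prev_s]  # ?x from step 1 is unknown here

# The useless pair (dave, alice) survives, so a final merge is still required.
merged = [
    (x, c, s)
    for x, c in step1
    for s, c2 in step2 if c2 == c
    for s2, x2 in step3 if s2 == s and x2 == x
]

# Pruning with the complete history: every row carries ?x, ?c and ?s together.
advisor_pairs = set(advisor)
history = [(x, c, s) for x, c in teaches for s, c2 in takes if c2 == c]
history = [(x, c, s) for x, c, s in history if (s, x) in advisor_pairs]
```

Under this model `step3` still holds two entries and only the final merge removes the inconsistent one, whereas the complete history already contains exactly the answer after the third step.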
The pruning method based on a complete historical record adopted by the present invention has the following advantages over the traditional pruning method:
(1) It avoids the time-consuming final result merge of the traditional pruning method. By passing the complete historical record along, all useless results are rejected during execution, with no need to wait until the end of execution to merge results; the historical record then already contains all the results the user needs;
(2) It communicates efficiently, using RDMA to reduce the overhead of transferring the complete historical record. The RDMA-friendly communication pattern uses network bandwidth effectively, reduces transfer latency, and avoids the resource waste of the traditional method.
In summary, the pruning method based on a complete historical record proposed by the present invention rejects useless results as early as possible, saves network bandwidth, and makes full use of the characteristics of high-performance network transfer to keep communication latency low. Finally, the present invention avoids the very time-consuming final result merge that the traditional pruning method incurs, and therefore makes the most of limited resources and improves the overall performance of the query system.
Specific embodiments of the present invention have been described above. It should be understood that the invention is not limited to the particular embodiments described above; those skilled in the art can make various changes or modifications within the scope of the claims without affecting the substance of the present invention. Where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another arbitrarily.
Claims (5)
1. A pruning method based on a complete historical record, characterized by comprising:
Step 1: a client sends a query request, and a server receives the query request;
Step 2: the server parses the query request and decomposes the query statements in the query request into multiple steps for execution, wherein each of the multiple steps is denoted a small step;
Step 3: the query process is executed small step by small step to obtain intermediate query results, and a pruning operation is performed on the intermediate results to obtain the pruned results;
Step 4: the pruned results and all historical results are added together to a new historical record table, and the new historical record table is passed to the next small step of the query for further pruning.
2. The pruning method based on a complete historical record according to claim 1, characterized in that step 1 comprises: the client selects a server and sends the query request to it; the server listens for query requests, initializes the query-related data, and clears the historical record table in preparation for executing the query process.
3. The pruning method based on a complete historical record according to claim 1, characterized in that step 2 comprises: after receiving the query request, the server parses it; the query request comprises multiple query statements, and the server decomposes them into multiple small steps for execution according to the different query statements.
4. The pruning method based on a complete historical record according to claim 1, characterized in that step 3 comprises:
Step 3.1: according to the query statement of the small step, a matching operation is performed on the data in the data set. If the complete historical record contains no entries, data matching starts from the constant in the query statement to obtain the intermediate results; if the complete historical record is not empty, the matching operation on the data set starts from the value of the variable that appears in both the query statement and the historical record to obtain the intermediate results;
Step 3.2: different pruning operations are performed on the intermediate results obtained by the small step according to the following cases:
- when the query interface corresponding to the newly added intermediate results does not exist in the complete historical record, pruning is performed according to the constant in the small step, and intermediate results that do not satisfy the constant condition are rejected;
- when the query interface corresponding to the newly added intermediate results already has a record in the complete historical record, pruning is performed according to whether the variable value in the historical record matches the variable value in the intermediate result. Specifically, for each record in the complete historical record corresponding to the newly added intermediate results, the variable value in the historical record is compared for equality with the corresponding variable value in the newly added intermediate result; the record is rejected if they are not equal and retained otherwise;
wherein the query interface refers to an unknown in the query statement whose corresponding value must be returned in the query result.
5. The pruning method based on a complete historical record according to claim 1, characterized in that step 4 comprises:
Step 4.1: the pruned results are added to the complete historical record table; a new column is added to the table to represent the newly added query interface, and the number of entries in the table increases or decreases accordingly;
Step 4.2: the complete historical record table follows the query process to the query statement of the next small step for the next round of pruning, and one of the following transfer operations is performed depending on where the data involved in the next small step resides:
- when the data involved in executing the next small step is on the local machine, the complete historical record table is passed locally and no network transfer is involved;
- when the data involved in executing the next small step is on a remote machine, the complete historical record table is sent along with the query request to the remote server, where execution continues.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611056390.XA CN106599095B (en) | 2016-11-24 | 2016-11-24 | Branch reduction method based on complete historical record |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611056390.XA CN106599095B (en) | 2016-11-24 | 2016-11-24 | Branch reduction method based on complete historical record |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106599095A true CN106599095A (en) | 2017-04-26 |
CN106599095B CN106599095B (en) | 2020-07-14 |
Family
ID=58591987
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611056390.XA Active CN106599095B (en) | 2016-11-24 | 2016-11-24 | Branch reduction method based on complete historical record |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106599095B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108491274A (en) * | 2018-04-02 | 2018-09-04 | 深圳市华傲数据技术有限公司 | Optimization method, device, storage medium and the equipment of distributed data management |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102254001A (en) * | 2011-07-14 | 2011-11-23 | 青岛海信网络科技股份有限公司 | Efficient data management method and system |
CN102546247A (en) * | 2011-12-29 | 2012-07-04 | 华中科技大学 | Massive data continuous analysis system suitable for stream processing |
CN103455556A (en) * | 2013-08-08 | 2013-12-18 | 成都市欧冠信息技术有限责任公司 | Intelligent storage unit data clipping process |
CN103593435A (en) * | 2013-11-12 | 2014-02-19 | 河海大学 | Approximate treatment system and method for uncertain data PT-TopK query |
US20140143281A1 (en) * | 2012-11-20 | 2014-05-22 | International Business Machines Corporation | Scalable Summarization of Data Graphs |
Non-Patent Citations (1)
Title |
---|
LEI ZOU et al.: "gStore: Answering SPARQL Queries via Subgraph Matching", 《PROCEEDINGS OF THE VLDB ENDOWMENT》 * |
Also Published As
Publication number | Publication date |
---|---|
CN106599095B (en) | 2020-07-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||