US8312008B2 - Docbase management system and implementing method thereof - Google Patents
- Publication number
- US8312008B2 (US12/391,495)
- Authority
- US
- United States
- Prior art keywords
- execution plan
- execution
- plans
- operations
- management system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
Definitions
- the present invention relates to electronic document processing technologies and particularly to a docbase management system and an implementing method thereof.
- a docbase management system provides the functions of organizing, managing, securing, displaying and storing massive documents.
- the operations to be performed on a document by the application include operations on a predefined universal document model.
- the application issues instructions to the docbase management system via the standard invocation interface, a process that may also be called an invocation from the application; the docbase management system then performs the corresponding operations on the docbase data in the storage device according to the received instructions.
- because the docbase management system involves a large number of logical concepts and operations and supports many functions, it is very difficult to create a docbase management system that is well extendable, scalable and maintainable. The problem can only be approached from the perspective of the system architecture; otherwise the docbase management system cannot be satisfactorily extendable, scalable and maintainable.
- the objective of the present invention is to provide a docbase management system and an implementing method thereof.
- the docbase management system provided by the present invention includes:
- a first module adapted to parse a received invocation from an application and generate an execution plan which comprises operations on physical storage
- a second module adapted to execute the execution plan to schedule a third module to execute the operations on physical storage in the execution plan
- the third module is adapted to execute the operations on physical storage in the execution plan under the scheduling of the second module.
- the first module includes:
- a first unit adapted to parse the received invocation from an application to build an intermediate form which comprises objects and/or operations of a universal document model
- a second unit adapted to convert the intermediate form into the execution plan which comprises operations on physical storage.
- the docbase management system provided by the present invention further includes a fourth module, which is adapted to select a preferable execution plan from execution plans generated by the first module according to a judgment criterion, and then, the second module executes the preferable execution plan to schedule the third module to execute the operations on physical storage in the preferable execution plan.
- the fourth module in the docbase management system is adapted to optimize the execution plans generated by the first module, and the fourth module selects the preferable execution plan from the optimized execution plans.
- the third module supports the operations on physical storage, wherein the physical storage may include a logical disk partition or physical drive or virtual storage or memory.
- the virtual storage includes remote storage or network storage.
- the remote storage includes a network file system or distributed file system, and the network storage includes a storage area network, GRID, or Peer-to-Peer (P2P) network.
- the above technical scheme has provided a specific structure of the docbase management system. It can be seen from the technical scheme that, in the present invention, the implementation of docbase management system is divided into a plurality of hierarchies. The hierarchies are independent of each other, which makes the docbase management system well extendable, scalable and maintainable.
- the fourth module provided by the present invention is adapted to select the preferable execution plan from execution plans so as to improve the execution performance and eventually improve the performance of the whole docbase management system. And the partial optimization of the initial execution plans further lowers the cost of the selected preferable execution plan and improves performance of the whole docbase management system.
- the method for implementing the docbase management system provided by the present invention includes:
- the process of parsing an invocation from an application and generating an execution plan which comprises operations on physical storage includes:
- the process of converting the intermediate form into an execution plan which comprises operations on physical storage includes:
- the process of scheduling and executing the execution plan includes scheduling and executing the preferable execution plan.
- the process of selecting a preferable execution plan from the execution plans according to a judgment criterion includes optimizing the execution plans and selecting the preferable execution plan from the optimized execution plans.
- the process of optimizing the execution plans includes: optimizing based on any one or any combination of a genetic algorithm, evolutionary algorithm, simulated annealing algorithm, branch and bound algorithm, hill climbing algorithm, heuristic algorithm, artificial neural network algorithm or dynamic programming algorithm.
- the invocation from an application is in an XML format or a customized format which is in compliance with an LALR grammar.
- the intermediate form includes a syntax tree or a document object model tree.
- the judgment criterion includes experience rules, a time cost or space cost of the execution plan, or the combination of the time cost and the space cost of the execution plan.
- the process of selecting a preferable execution plan from the execution plans according to a judgment criterion includes: selecting a preferable execution plan from the execution plans according to an algorithm based on priorities of the experience rules or an algorithm based on weights of the experience rules.
- the above technical scheme has provided a specific method for implementing the docbase management system. It can be seen from the above technical scheme that, in the present invention, the implementation of the docbase management system is divided into a plurality of hierarchies. The hierarchies are independent of each other, which makes the docbase management system well extendable, scalable and maintainable. Also in the present invention, the preferable execution plan is selected from the execution plans so as to improve the execution performance and eventually improve the performance of the whole docbase management system. In addition, the initial execution plans generated by the first module are partially optimized, so that the cost of the selected preferable execution plan is lowered and the performance of the whole docbase management system is improved.
- FIG. 1 is a schematic illustrating hierarchical structure of the docbase management system in accordance with the present invention.
- FIG. 2 is a schematic illustrating the docbase management system in accordance with the present invention.
- FIG. 3 is a flow chart of the method for implementing the docbase management system in accordance with the present invention.
- the implementation of the docbase management system is divided into multiple hierarchies and standards for interfaces between hierarchies are defined.
- the implementation of the docbase management system is divided into multiple hierarchies, which specifically includes: parsing an invocation from an application to build an intermediate form which comprises logical operations, converting the intermediate form which comprises logical operations into an execution plan which comprises operations on physical storage, and executing the execution plan.
- the hierarchies may be implemented in different ways, and the docbase management system can be well extendable, scalable and maintainable.
- FIG. 2 shows a docbase management system in accordance with the present invention.
- the docbase management system includes a parser, a planner, an executor and a storage manipulating module.
- the parser is adapted to parse a received invocation from an application to build an intermediate form consisting of objects and/or operations of a universal document model.
- the planner is adapted to convert the intermediate form parsed by the parser into an execution plan consisting of operations on physical storage.
- the logical operations which constitute the intermediate form are high-level concepts.
- a logical operation may be mapped to a single physical operation or to a sequence of physical operations, and there may be more than one possible mapping. Therefore an intermediate form may be converted into any one of a plurality of execution plans. Each time the planner is invoked, it may generate a different execution plan from the same intermediate form; however, those different execution plans are equivalent to one another.
- the executor is adapted to execute the execution plan converted by the planner to schedule the storage manipulating module to execute the operations on physical storage in the execution plan.
- the storage manipulating module is adapted to execute the operations on physical storage in the execution plan under the scheduling of the executor.
- the above is a specific structure of the docbase management system. As long as outputs of the hierarchies conform to the corresponding interface standards, the hierarchies may be implemented in different ways, and the docbase management system can be well extendable, scalable and maintainable.
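The layering described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy logical operations ("save", "load") and the physical operations they map to are hypothetical and do not come from the patent. The point is only that each stage depends on nothing but the output format of the stage before it.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the hierarchies. Each stage depends only on the
# output format (the "interface standard") of the stage before it.

def parse(invocation: str) -> List[str]:
    """Parser: invocation text -> intermediate form (a list of logical ops)."""
    return invocation.split()

# Hypothetical mapping from logical operations to physical operation sequences.
LOGICAL_TO_PHYSICAL: Dict[str, List[str]] = {
    "save": ["open", "write", "close"],
    "load": ["open", "read", "close"],
}

def plan(intermediate: List[str]) -> List[str]:
    """Planner: logical operations -> operations on physical storage."""
    physical: List[str] = []
    for logical_op in intermediate:
        physical.extend(LOGICAL_TO_PHYSICAL[logical_op])
    return physical

def execute(execution_plan: List[str], storage: Callable[[str], None]) -> None:
    """Executor: schedules the storage manipulating module for each operation."""
    for physical_op in execution_plan:
        storage(physical_op)

# The "storage manipulating module" here just records what it was asked to do.
log: List[str] = []
execute(plan(parse("save load")), log.append)
print(log)
```

Because the stages only share data formats, any one of them could be replaced (for example, a planner with a different mapping table) without touching the others.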
- the intermediate form outputted by the parser conforms to the interface standard.
- the intermediate form may include a syntax tree or a Document Object Model (DOM) tree.
- the invocation from the application to the docbase management system via a standard interface is processed by the parser first.
- the standard interface may be an Unstructured Operation Markup Language (UOML) interface using an Extensible Markup Language (XML), as explained in the prior application of the docbase management system, or may be in form of command strings, or may be in other forms, all of which should conform to the universal document model explained in the prior application of the docbase management system.
- the invocation from the application is parsed by the parser based on lexis and syntax and converted into the intermediate form which consists of objects and/or operations of the universal document model and conforms to the interface standard.
- the parser in the docbase management system may be an XML parser which is adapted to parse the invocation from the application and generate a DOM tree.
- the parser in the docbase management system may be a lexical and syntax parser created by a Lexical compiler (Lex) and Yet Another Compiler Compiler (YACC).
- Lex is a tool used for generating a scanner, i.e., a tool for generating a lexical analyzer.
- YACC is an automatic tool used for generating an LALR(1) parser; the first version of YACC was published in the early 1970s by Bell Laboratories (its author is S. C. Johnson). The two tools are widely employed on platforms such as UNIX and DOS. The XML parsing and the Lex and YACC parsing processes are part of the prior art.
- the above codes indicate a standard interface invocation in XML.
- the interface method is named Appendline and the task of the method is to append a line to a path object whose handle is 0xabcd1234, the coordinates of the two ends of the line are (1000.23, 2193.324) and (3233.234, 2342.234) respectively.
- the parser parses the standard interface invocation in XML and the result of the parsing is a DOM tree, which includes a root element named “call” and three sub-elements: two named “stringVal” and one named “compoundVal”.
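A minimal sketch of this parsing step, using Python's standard `xml.dom.minidom`. The XML text below is a hypothetical stand-in: the attribute names (`name`, `val`) and the `floatVal` elements are assumptions. Only the element names "call", "stringVal" and "compoundVal", the method name, the handle and the coordinates are taken from the description above.

```python
from xml.dom.minidom import parseString

# Hypothetical invocation text; attribute names and floatVal are assumptions.
invocation = """<call>
  <stringVal name="method" val="Appendline"/>
  <stringVal name="path" val="0xabcd1234"/>
  <compoundVal name="line">
    <floatVal name="x1" val="1000.23"/>
    <floatVal name="y1" val="2193.324"/>
    <floatVal name="x2" val="3233.234"/>
    <floatVal name="y2" val="2342.234"/>
  </compoundVal>
</call>"""

dom = parseString(invocation)
root = dom.documentElement

# Keep element children only (whitespace between tags becomes text nodes).
children = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
child_names = [c.tagName for c in children]

# Read the four coordinates back out of the compound value.
coords = [float(f.getAttribute("val"))
          for f in children[2].getElementsByTagName("floatVal")]

print(root.tagName, child_names, coords)
```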
- the structure of the DOM tree is illustrated as follows:
- a standard interface invocation in a customized language which conforms to LALR(1) grammar is as follows:
- the parser parses the customized invocation from the application by using a corresponding lexical and syntax parser and then generates a syntax tree.
- the lexical and syntax parser can be created in advance by running Lex and YACC on the lexical and syntax definitions of the customized language, respectively.
- the syntax tree can be expressed with a C structure:
- the tree structure is similar to the structure of the preceding DOM tree.
- the following example illustrates the conversion from logical operations to physical operations by the planner when the intermediate form includes a syntax tree.
- All logical operations L_OP in the syntax tree are enumerated; herein the logical operations also may be sequences of logical operations. Firstly, a physical operation set (P_OP 1 , P_OP 2 , . . . , P_OP m ) which corresponds to L_OP is obtained; herein the physical operation P_OP j also may be a sequence of physical operations. And then, a physical operation P_OP i is chosen for the L_OP. Finally, the preceding steps to choose a physical operation for every logical operation are repeated until all the logical operations in the syntax tree are replaced with corresponding physical operations and an execution plan is thus generated.
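The conversion loop above can be sketched in Python as follows. This is an illustration under stated assumptions: the mapping table, the operation names and the `choose` callback are hypothetical, and the intermediate form is flattened to a list of logical operations rather than a full syntax tree.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple

PhysicalOp = Tuple[str, ...]  # one candidate, possibly a sequence of operations

# Hypothetical table: each logical operation L_OP maps to its candidate
# physical operation set (P_OP 1, ..., P_OP m).
PHYSICAL_CHOICES: Dict[str, List[PhysicalOp]] = {
    "InsertPage": [("append_block",),
                   ("read_index", "rewrite_index", "append_block")],
    "GetPage": [("seek", "read_block"), ("read_all", "filter")],
}

def generate_plan(
    logical_ops: Sequence[str],
    choose: Callable[[List[PhysicalOp]], PhysicalOp] = lambda c: c[0],
) -> List[PhysicalOp]:
    """Replace every logical operation with one chosen physical candidate."""
    execution_plan = []
    for l_op in logical_ops:                 # enumerate the logical operations
        candidates = PHYSICAL_CHOICES[l_op]  # obtain the physical operation set
        execution_plan.append(choose(candidates))  # choose one P_OP i
    return execution_plan

def all_plans(logical_ops: Sequence[str]) -> List[List[PhysicalOp]]:
    """Every equivalent execution plan for the same intermediate form."""
    sets = (PHYSICAL_CHOICES[op] for op in logical_ops)
    return [list(combo) for combo in product(*sets)]

ops = ["InsertPage", "GetPage"]
print(generate_plan(ops))
print(len(all_plans(ops)))
```

Enumerating all combinations, as `all_plans` does, is what produces the set of equivalent execution plans from which a preferable one can later be selected.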
- the intermediate form that includes the DOM tree described above is converted by the planner into an execution plan as follows:
- the root node AppendLine of the execution plan is an operation
- the first sub node PathObj is the handle of object Path
- the second sub node CreateLine is also an operation used for creating a line object
- the two sub nodes of CreateLine respectively indicate the starting point and the ending point of the line to be created.
- the result of the operation CreateLine includes a line object, and the operation AppendLine will add the line object to the object Path.
- an execution plan usually includes a tree which comprises operations on physical storage, so the executor executes the whole execution plan by performing recursion from the root node of the tree corresponding to the execution plan to the leaf nodes of the tree, and scheduling the storage manipulating module to execute the actual operations from the leaf nodes of the tree to the root node.
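The traversal order can be illustrated with a short Python sketch based on the AppendLine and CreateLine plan described above. The `Op` class, the dictionary-based path and line objects and the `STORAGE_OPS` table are hypothetical stand-ins for the storage manipulating module, not the patent's data structures.

```python
from dataclasses import dataclass, field
from typing import Any, List

@dataclass
class Op:
    """An operation node; children are sub-operations or plain parameters."""
    name: str
    children: list = field(default_factory=list)

executed: List[str] = []  # records the order in which operations actually run

def create_line(start, end):
    return {"type": "line", "start": start, "end": end}

def append_line(path, line):
    path["lines"].append(line)
    return path

# Hypothetical storage manipulating module: operation name -> implementation.
STORAGE_OPS = {"CreateLine": create_line, "AppendLine": append_line}

def execute(node: Any) -> Any:
    if not isinstance(node, Op):
        return node                                      # leaf parameter
    args = [execute(child) for child in node.children]   # recurse downwards
    executed.append(node.name)                           # then run this node
    return STORAGE_OPS[node.name](*args)

path_obj = {"handle": 0xABCD1234, "lines": []}
plan = Op("AppendLine", [
    path_obj,
    Op("CreateLine", [(1000.23, 2193.324), (3233.234, 2342.234)]),
])
result = execute(plan)
print(executed)
```

The recursion descends from the root, but the recorded order shows that CreateLine runs before AppendLine, i.e., the actual operations are performed from the leaves back up to the root.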
- OP 1 , OP 2 and OP 3 are three operations and Para 1 to Para 6 are six parameters of the operations respectively.
- the executor executes the execution plan according to the following order:
- the storage manipulating module in the docbase management system shown in FIG. 2 may be built on a variety of physical or virtual storage layers and is constrained by their performance and scale accordingly.
- an interface provided by the physical storage layer, i.e., an interface between the storage manipulating module and the physical storage layer, may affect what kinds of physical operations can be put in the execution plan, so the execution plan generated by the planner also depends on the preset interface.
- when the physical storage layer provides only the read/write functions of binary streams, the physical operations in the execution plan may include only two physical operations: read and write.
- the execution plan may include more physical operations.
- the basic objects that the physical storage layer needs to provide include a docbase, document set, document, etc., and the physical storage layer also needs to provide functions of allocating, recycling and reading/writing physical storage.
- the storage manipulating module may be built based on: a file system provided by the operating system, or a logical disk partition provided by the operating system, or an interface provided by the operating system for accessing the physical disk, or an interface directly accessing the physical disk bypassing the operating system, or an interface provided by the operating system for accessing the virtual memory or physical memory, or an interface directly accessing the physical memory bypassing the operating system, or the virtual storage device.
- the objects on the physical storage layer, such as docbase, document set and document can be built accordingly.
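As one possible illustration of building these objects on a file system provided by the operating system, the sketch below maps a docbase and a document set to directories and a document to a file. The class name, method names and directory names are hypothetical; the patent does not prescribe this layout.

```python
import tempfile
from pathlib import Path

class FileSystemStorage:
    """Hypothetical storage manipulating module backed by a directory tree."""

    def __init__(self, root: Path):
        self.root = root

    def create_docbase(self, name: str) -> Path:
        docbase = self.root / name
        docbase.mkdir(parents=True, exist_ok=True)   # allocate storage
        return docbase

    def create_document_set(self, docbase: Path, name: str) -> Path:
        doc_set = docbase / name
        doc_set.mkdir(exist_ok=True)
        return doc_set

    def write_document(self, container: Path, name: str, data: bytes) -> Path:
        doc = container / name
        doc.write_bytes(data)                         # binary-stream write
        return doc

    def read_document(self, doc: Path) -> bytes:
        return doc.read_bytes()                       # binary-stream read

    def delete_document(self, doc: Path) -> None:
        doc.unlink()                                  # recycle storage

with tempfile.TemporaryDirectory() as tmp:
    storage = FileSystemStorage(Path(tmp))
    docbase = storage.create_docbase("docbase1")
    doc_set = storage.create_document_set(docbase, "docset1")
    doc = storage.write_document(doc_set, "doc1.bin", b"hello docbase")
    round_trip = storage.read_document(doc)
    listing = sorted(p.name for p in doc_set.iterdir())
    storage.delete_document(doc)
    remaining = list(doc_set.iterdir())

print(round_trip, listing)
```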
- the virtual storage may include remote storage, i.e., a physical storage in another computer device accessible through a system such as Network File System (NFS) or Distributed File System (DFS).
- the virtual storage may also include network storage, i.e., a storage provided by a network, such as the storage in a Storage Area Network (SAN), GRID, Peer-to-Peer (P2P) network, etc.
- the storage manipulating module performs the following operations:
- the directory may finally have a structure shown as follows, wherein the documents are shown as the files under the doclist directory:
- An intermediate form may be converted into different execution plans by the planner in the docbase management system.
- the execution plans are equivalent to one another; however, the time and space needed for executing them usually differ greatly. Therefore, whether the execution plan chosen from an execution plan set is preferable will greatly influence the performance of the docbase management system.
- the docbase management system shown in FIG. 2 may further include an optimizer, which is adapted to select a preferable execution plan from the execution plan set corresponding to the intermediate form according to a preset judgment criterion.
- the optimizer selects the optimum execution plan from the generated execution plan set according to the judgment criterion.
- the “optimum” execution plan is selected based on the judgment criterion or practical requirements. For example, an execution plan selected as optimum under a judgment criterion that requires the shortest execution time may need a large execution space; that execution plan will therefore not be the “optimum” when the judgment criterion requires the smallest execution space.
- the judgment criterion may be based on experience rules or the cost of the execution plan, i.e., the time or space cost of the execution plan or the combination of the time cost and the space cost of the execution plan.
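A small Python sketch of how the choice depends on the criterion. The plan names, the cost figures and the weights of the combined criterion are all made-up values for illustration.

```python
# Hypothetical costs of two equivalent execution plans.
plans = {
    "plan_a": {"time": 2.0, "space": 900},   # fast but memory-hungry
    "plan_b": {"time": 5.0, "space": 100},   # slow but compact
}

# Criterion 1: shortest execution time.
fastest = min(plans, key=lambda p: plans[p]["time"])

# Criterion 2: smallest execution space.
smallest = min(plans, key=lambda p: plans[p]["space"])

# Criterion 3: a combination of time cost and space cost.
# The weights are assumptions; they also reconcile the different units.
def combined_cost(name, w_time=100.0, w_space=1.0):
    cost = plans[name]
    return w_time * cost["time"] + w_space * cost["space"]

balanced = min(plans, key=combined_cost)

print(fastest, smallest, balanced)
```

With these figures the time criterion picks plan_a while the space criterion picks plan_b, which is exactly why "optimum" is only meaningful relative to a stated criterion.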
- the optimizer may be implemented in many ways; the following are examples.
- the optimizer in the docbase management system shown in FIG. 2 may select the optimum execution plan according to priorities of the experience rules.
- provided the judgment criterion of the optimizer includes L experience rules, namely R 1 , R 2 , . . . , R L , and, without loss of generality, the priorities of the experience rules follow the inequality R 1 >R 2 > . . . >R L , the optimizer works as follows.
- Step a2: whether each execution plan in the execution plan set meets the judgment criterion R i is determined in turn. If an execution plan does not meet the judgment criterion R i , the execution plan is marked and deleted from the execution plan set.
- Step a3: if the execution plan set becomes empty, the execution plans marked in Step a2 are put back into the execution plan set. Whether i equals L is then determined: if i equals L, an execution plan is selected from the execution plan set at random as the optimum execution plan based on priorities of the experience rules; otherwise 1 is added to i and Step a2 is repeated.
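The priority-based selection can be sketched in Python as below. The plan representation (dictionaries) and the three example rules are hypothetical; the filtering loop follows the steps described above, with "put the marked plans back" implemented by simply keeping the previous candidate list when no plan meets a rule.

```python
import random

def select_by_priority(plans, rules, rng=random):
    """rules are ordered from highest priority (R 1) to lowest (R L)."""
    candidates = list(plans)
    for rule in rules:
        kept = [p for p in candidates if rule(p)]
        if kept:
            candidates = kept   # plans failing R i are removed from the set
        # otherwise the set would become empty, so the removed plans are
        # restored: candidates is left unchanged for the next rule
    return rng.choice(candidates)   # random pick among the survivors

# Hypothetical execution plans described by a few properties.
plans = [
    {"name": "p1", "uses_index": True, "temp_files": 2},
    {"name": "p2", "uses_index": True, "temp_files": 0},
    {"name": "p3", "uses_index": False, "temp_files": 0},
]

# Hypothetical experience rules in priority order.
rules = [
    lambda p: p["uses_index"],        # R 1
    lambda p: p["temp_files"] == 0,   # R 2
    lambda p: p["name"] == "none",    # R 3: met by no plan at all
]

chosen = select_by_priority(plans, rules)
print(chosen["name"])
```

Here R 1 removes p3, R 2 removes p1, and R 3 is met by no plan, so the set is restored and p2 remains the only candidate.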
- the optimizer in the docbase management system shown in FIG. 2 also may select the optimum execution plan according to weights of the experience rules.
- provided the judgment criterion of the optimizer includes L experience rules, namely R 1 , R 2 , . . . , R L , and, without loss of generality, the weight of the rule R i is identified as PR i and every execution plan has a weight, the optimizer works as follows.
- Step b1: the initial weights of all the execution plans are set to 0.
- Step b3: the execution plan with the largest weight is selected as the optimum execution plan according to the weights of all the execution plans.
- if several execution plans share the largest weight, any one of these execution plans may be selected as the optimum execution plan based on the weights of the experience rules.
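A Python sketch of the weight-based selection. The intermediate step is not reproduced in the text above, so the accumulation rule used here is an assumption: a plan's weight is increased by PR i for every rule R i that the plan meets. The plans, rules and weights are hypothetical.

```python
def select_by_weight(plans, weighted_rules):
    """weighted_rules is a list of (rule R i, weight PR i) pairs."""
    weights = [0] * len(plans)              # initial weights are all 0
    for rule, pr in weighted_rules:
        for idx, plan in enumerate(plans):
            if rule(plan):
                weights[idx] += pr          # assumed accumulation step
    best = max(weights)
    # Several plans may share the largest weight; any of them may be chosen.
    tied = [p for p, w in zip(plans, weights) if w == best]
    return tied[0], weights

plans = [
    {"name": "p1", "uses_index": True, "temp_files": 2},
    {"name": "p2", "uses_index": True, "temp_files": 0},
    {"name": "p3", "uses_index": False, "temp_files": 0},
]

weighted_rules = [
    (lambda p: p["uses_index"], 3),        # R 1 with PR 1 = 3
    (lambda p: p["temp_files"] == 0, 2),   # R 2 with PR 2 = 2
]

chosen, weights = select_by_weight(plans, weighted_rules)
print(chosen["name"], weights)
```

Unlike the priority-based method, a plan that fails a heavily weighted rule can still win here if it meets enough of the other rules.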
- the optimizer selects the optimum execution plan based on experience rules.
- the optimizer also may select the optimum execution plan based on the cost of the execution plan.
- the cost of the execution plan includes time cost and space cost.
- the time cost includes the time spent on executing the whole execution plan, which mainly includes the disk I/O time.
- the space cost includes the maximum space that may possibly be occupied by a final result and intermediate results during the execution of the whole execution plan. The space cost is calculated based on the memory and disk space to be occupied.
- the optimizer divides an execution plan into basic operations; the time cost of each basic operation is multiplied by the number of times that operation is executed, and the total time of the execution plan is calculated by summing these products over all the basic operations.
- the optimizer traverses the whole execution plan in recursion to learn how many times each of the basic operations will be carried out and then calculates the total time needed for the execution plan.
- the calculation of space cost usually refers to the maximum space needed during the execution.
- the optimizer calculates from the bottom to the top in recursion and compares the space needed for the current operation with the current maximum space value; if the former is larger, the optimizer replaces the current maximum space value with the space needed for the current operation.
- the final value is the maximum space needed for the execution plan, i.e., the space cost of the execution plan.
- the optimizer may select the optimum execution plan depending on the time costs of the execution plans.
- provided an execution plan has a tree structure, the basic operations of the execution plan include (OP 1 , OP 2 , . . . , OP n ), and the time cost function of the execution plan is indicated as TIME_CALC(NODE node), the calculation of TIME_CALC is shown as follows.
- T ← T + Σ TIME_CALC(SUB i ) is calculated, wherein SUB 1 , SUB 2 , . . . , SUB m are the sub nodes of node and the dummy variable i ranges from 1 to m.
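A recursive Python sketch of TIME_CALC. The initialization is an assumption, since the text above does not show it: T starts as the time cost of the node's own basic operation, after which the costs of all sub nodes are added. The per-operation costs are made-up numbers.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    op: str                                    # basic operation at this node
    subs: List["Node"] = field(default_factory=list)

# Hypothetical time cost of each basic operation.
OP_TIME = {"OP1": 5.0, "OP2": 2.0, "OP3": 1.0}

def time_calc(node: Node) -> float:
    t = OP_TIME[node.op]        # assumed starting value: the node's own cost
    for sub in node.subs:       # T <- T + sum of TIME_CALC(SUB i)
        t += time_calc(sub)
    return t

plan = Node("OP1", [
    Node("OP2", [Node("OP3"), Node("OP3")]),
    Node("OP3"),
])
total = time_calc(plan)
print(total)
```

For this tree the total is 5 + 2 + 1 + 1 + 1 = 10, which equals the sum of each basic operation's cost multiplied by the number of times it appears.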
- the optimizer also may select the optimum execution plan based on the space costs of the execution plans.
- provided an execution plan has a tree structure, the basic operations of the execution plan include (OP 1 , OP 2 , . . . , OP n ), and the space cost function of the execution plan is indicated as SPACE_CALC(NODE node), the calculation of SPACE_CALC is shown as follows.
- S ← MAX(S, SPACE_CALC(SUB i )) is executed for each i, wherein SUB 1 , SUB 2 , . . . , SUB m are the sub nodes of node and the dummy variable i ranges from 1 to m.
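A matching Python sketch of SPACE_CALC. As with the time cost, the starting value is an assumption: S begins as the space needed by the node's own operation and is then raised to the largest value found among the sub nodes. The space figures are made-up numbers.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    op: str                                    # basic operation at this node
    subs: List["Node"] = field(default_factory=list)

# Hypothetical space needed by each basic operation (e.g., in kilobytes).
OP_SPACE = {"OP1": 64, "OP2": 256, "OP3": 32}

def space_calc(node: Node) -> int:
    s = OP_SPACE[node.op]       # assumed starting value: the node's own need
    for sub in node.subs:       # S <- MAX(S, SPACE_CALC(SUB i))
        s = max(s, space_calc(sub))
    return s

plan = Node("OP1", [
    Node("OP2", [Node("OP3"), Node("OP3")]),
    Node("OP3"),
])
peak = space_calc(plan)
print(peak)
```

Where the time cost adds up over the tree, the space cost takes a maximum, so a single large operation (OP2 here) determines the result.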
- the optimum execution plan is selected from the execution plans by the optimizer according to the judgment criterion, so the selected optimum execution plan usually requires a lower time or space cost, and the performance of the whole docbase management system is thereby improved.
- the optimizer may select the optimum execution plan directly from the execution plans generated by the planner, as mentioned above.
- the optimizer also may optimize the execution plans generated by the planner by using an artificial intelligence algorithm, e.g., a genetic algorithm or an artificial neural network algorithm, and then select the optimum execution plan from the optimized execution plans.
- the execution plans are optimized by associating the cost or other measurement parameters of the execution plans as a measurement function with a measurement in the intelligence algorithm, e.g., adaptability in the genetic algorithm or energy in a simulated annealing algorithm, and the space of the execution plans is searched by using those algorithms to get the partially optimized execution plans.
- a method for optimizing the initial execution plans with the genetic algorithm is described as follows. For every initial execution plan, the following steps are performed.
- e1: an execution plan tree (a tree structure of the execution plan) is coded into strings to get a string set as the initial population for the genetic algorithm;
- e2: the execution time or space is considered as a measurement function of adaptability, and the evolution of the initial population is started;
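The two steps above can be illustrated with a small genetic algorithm in Python. Everything specific here is an assumption: a plan is coded as a string of digits in which digit i selects one physical alternative for logical operation i, the `COST` table is made up, and the selection, crossover and mutation scheme is a generic one rather than the patent's. Lower cost is treated as higher adaptability.

```python
import random

# COST[i][c]: hypothetical time cost when logical operation i uses choice c.
COST = [[4, 1, 3], [2, 5, 1], [6, 2, 2], [3, 3, 1]]

def cost(plan: str) -> int:
    """Measurement function: total cost of an encoded plan (lower is fitter)."""
    return sum(COST[i][int(c)] for i, c in enumerate(plan))

def evolve(population, generations=30, mutation_rate=0.1, rng=None):
    rng = rng or random.Random(0)          # fixed seed for repeatability
    population = list(population)
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: max(2, len(population) // 2)]
        children = [population[0]]         # elitism: keep the best plan
        while len(children) < len(population):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))           # one-point crossover
            child = list(a[:cut] + b[cut:])
            for i in range(len(child)):              # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = str(rng.randrange(len(COST[i])))
            children.append("".join(child))
        population = children
    return min(population, key=cost)

initial = ["0000", "2121", "1200", "0102"]   # coded initial population
best = evolve(initial)
print(best, cost(best))
```

Because the best plan of each generation is carried over unchanged, the result can never be worse than the best plan in the initial population.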
- a method for optimizing the initial execution plans with the simulated annealing algorithm is described as follows. For every initial execution plan in an execution plan set, the following steps are performed.
- C is used to indicate the present execution plan and B is used to indicate the optimized execution plan.
- B is set as C;
- an initial temperature decrease factor ALPHA is set as a value between 0 and 1;
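A Python sketch of simulated annealing that uses the names C, B and ALPHA from the steps above. The remaining details are assumptions: the digit-string plan encoding, the `COST` table, the neighbor move, the initial temperature, the step count and the acceptance rule (the standard Metropolis criterion) are not taken from the patent.

```python
import math
import random

# COST[i][c]: hypothetical cost when logical operation i uses choice c.
COST = [[4, 1, 3], [2, 5, 1], [6, 2, 2], [3, 3, 1]]

def cost(plan: str) -> int:
    """The 'energy' of an encoded plan."""
    return sum(COST[i][int(c)] for i, c in enumerate(plan))

def neighbor(plan: str, rng: random.Random) -> str:
    """Change the choice made for one randomly picked logical operation."""
    i = rng.randrange(len(plan))
    choice = str(rng.randrange(len(COST[i])))
    return plan[:i] + choice + plan[i + 1:]

def anneal(initial, temperature=10.0, alpha=0.9, steps=200, rng=None):
    rng = rng or random.Random(0)   # fixed seed for repeatability
    c = initial                     # C: the present execution plan
    b = c                           # B: the best plan so far (B is set as C)
    for _ in range(steps):
        candidate = neighbor(c, rng)
        delta = cost(candidate) - cost(c)
        # Always accept improvements; accept a worse plan with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            c = candidate
            if cost(c) < cost(b):
                b = c
        temperature *= alpha        # ALPHA: decrease factor between 0 and 1
    return b

best = anneal("0000")
print(best, cost(best))
```

Tracking B separately from C is what allows the search to accept temporarily worse plans without ever losing the best plan already found.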
- the cost of the selected optimum execution plan is thus further lowered, and the performance of the whole docbase management system is further improved.
- any one or any combination of the parser, planner, optimizer, executor and the storage manipulating module in the present invention may be implemented as an independent module.
- the modules may be implemented as individual DLLs respectively or be combined into one DLL.
- the modules may be implemented as individual .so files respectively or be combined into one .so file.
- the modules may be implemented as individual .class files respectively or be combined into one .class file.
- the modules may be developed with any of the programming languages including C, C++, Python, Ruby, Perl, SmallTalk, Ada, Simula, Pascal, Haskell, etc.
- the optimizer in the docbase management system provided by the present invention is further adapted to optimize the selected preferable execution plan.
- the executor executes the optimized preferable execution plan to schedule the storage manipulating module to execute the operations on physical storage in the optimized preferable execution plan.
- the method of optimizing the preferable execution plan is similar to the process of optimizing the execution plans generated by the planner described above.
- the process for obtaining the execution plan executed by the executor may include:
- the optimizer optimizes the execution plans and selects the preferable execution plan from the optimized execution plans, in which case the executor executes the preferable execution plan; or,
- the optimizer selects the preferable execution plan from the execution plans and then optimizes the preferable execution plan, in which case the executor executes the optimized preferable execution plan; or,
- the optimizer optimizes the execution plans, selects the preferable execution plan from the optimized execution plans, and then optimizes the selected preferable execution plan, in which case the executor executes the optimized preferable execution plan.
- the optimizer may directly optimize the only execution plan and the executor executes the optimized execution plan.
- FIG. 3 is a flow chart of a method for implementing the docbase management system in accordance with the present invention. As shown in FIG. 3 , the method for implementing the docbase management system includes the following steps.
- Step 301: an invocation from an application is parsed to build an intermediate form consisting of objects and/or operations of a universal document model.
- the invocation from the application to the docbase management system via a standard interface may use the UOML described in a prior patent application document on the docbase management system, or may use command strings; in either case, the invocation from the application should conform to the universal document model given in the prior patent application document on the docbase management system.
- the invocation from the application is parsed based on the lexis and the syntax and is converted into the intermediate form, which comprises objects and/or operations of the universal document model and is in compliance with a standard interface.
- if the standard interface uses XML, an XML parser may be adopted to generate a DOM tree.
- if the standard interface uses command strings, which usually follow an LALR(1) grammar, then as long as the definition of the grammar is given, the command strings can be parsed by a lexical and syntax parser created by Lex and YACC.
- Step 302: the intermediate form is converted into an execution plan which comprises operations on physical storage.
- the objects and/or operations of the universal document model which constitute the intermediate form are logical operations, and the logical operations are high-level concepts. A logical operation may therefore be mapped to one operation on physical storage or to a sequence of operations on physical storage, and one logical operation may be mapped to different operations or sequences. Consequently, different execution plans may be generated based on the same intermediate form.
- the process of converting the intermediate form into an execution plan includes the following steps.
- logical operations L_OP in the syntax tree are enumerated.
- the logical operations also may be sequences of logical operations.
- a physical operation set (P_OP 1 , P_OP 2 , . . . , P_OP m ) that corresponds to L_OP is obtained, in which the physical operation P_OP j also may be a sequence of physical operations.
- Step 303: the execution plan is scheduled and executed.
- Recursion starts from the root node of the tree corresponding to the execution plan and goes from top to the bottom until leaf nodes of the tree are reached, and then the actual operations are performed from bottom to the top of the tree to complete the whole execution plan.
- Step 302 further includes the following steps.
- Step 3021: the intermediate form which comprises objects and/or operations of the universal document model is converted into execution plans.
- the objects and/or operations of the universal document model which constitute the intermediate form are logical operations, and the logical operations are high-level concepts. A logical operation may therefore be mapped to one physical operation or to a sequence of physical operations, and one logical operation may also be mapped to different physical operations or sequences. Consequently, an intermediate form may be converted into different execution plans, and the execution plans may be generated at random based on the intermediate form which comprises the logical operations.
- Step 3022: an optimum execution plan is selected from the execution plans according to a judgment criterion.
- the optimum execution plan is selected from a generated execution plan set according to the judgment criterion. It should be pointed out that the “optimum” execution plan is selected based on the judgment criterion or practical requirements. For example, an execution plan selected as optimum under a judgment criterion that requires the shortest execution time may need a large execution space; that execution plan will therefore not be the “optimum” when the judgment criterion requires the smallest execution space.
- the judgment criterion may be based on experience rules or the cost of the execution plan, i.e., the time or space cost of the execution plan or the combination of the time cost and the space cost of the execution plan.
- the operations in Step 3022 may be implemented in many ways; the following are examples.
- a method for selecting the optimum execution plan according to priorities of the experience rules is described as follows.
- Suppose the judgment criterion includes L experience rules, namely R_1, R_2, ..., R_L, and, without loss of generality, the priorities of the experience rules satisfy R_1 > R_2 > ... > R_L. The selection process is as follows.
- Step b2 If the execution plan set becomes empty, the execution plans marked in Step b2 are put into the execution plan set, and whether i equals L is determined. If i equals L, proceed to the next step; otherwise i is incremented by 1 and Step b2 is repeated.
- An execution plan is selected from the execution plan set at random as the optimum execution plan.
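The priority-based selection can be sketched as a filtering loop. The text as extracted does not show how plans come to be marked, so the sketch assumes that, for each rule R_i in priority order, plans that fail the rule are marked and removed, and that the marked plans are restored if the rule would otherwise empty the set. The plan dictionaries and the three rules are hypothetical.

```python
import random


def select_by_priority(plans, rules, rng=random):
    """rules: predicates ordered from highest to lowest priority."""
    candidates = list(plans)
    for rule in rules:
        kept = [p for p in candidates if rule(p)]
        marked = [p for p in candidates if not rule(p)]  # assumed marking step
        # If the rule eliminates every plan, restore the marked plans.
        candidates = kept if kept else marked
    # Any survivor is acceptable, so one is picked at random.
    return rng.choice(candidates)


# Hypothetical plans and experience rules, for illustration only.
plans = [
    {"name": "A", "uses_index": True, "ops": 3},
    {"name": "B", "uses_index": True, "ops": 2},
    {"name": "C", "uses_index": False, "ops": 1},
]
rules = [
    lambda p: p["uses_index"],  # R_1: prefer plans that use an index
    lambda p: p["ops"] <= 2,    # R_2: prefer short plans
    lambda p: p["ops"] > 10,    # R_3: satisfied by no plan here
]
chosen = select_by_priority(plans, rules)
print(chosen["name"])
```

In this example the third rule is satisfied by no remaining plan, so it is effectively skipped and the survivor of the first two rules is returned.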
- a method for selecting the optimum execution plan according to weights of the experience rules is described as follows. Suppose the judgment criterion includes L experience rules, namely R_1, R_2, ..., R_L, and, without loss of generality, the weight of rule R_i is denoted PR_i. The selection process is as follows.
- the initial weights of all the execution plans are set to 0.
- An execution plan with the largest weight is selected as the optimum execution plan according to the weights of all the execution plans.
- any one of the execution plans may be selected as the optimum execution plan.
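A minimal sketch of the weight-based selection follows. The accumulation step is not shown in the text as extracted, so the sketch assumes that each plan satisfying rule R_i has PR_i added to its weight. The plans, rules, and weight values are hypothetical.

```python
def select_by_weight(plans, weighted_rules):
    """weighted_rules: list of (predicate, PR_i) pairs."""
    weights = [0] * len(plans)  # initial weights are all 0
    for rule, pr in weighted_rules:
        for idx, plan in enumerate(plans):
            if rule(plan):
                weights[idx] += pr  # assumed accumulation step
    # max() returns the first plan with the largest weight; on a tie,
    # any of the tied plans would serve equally well.
    best = max(range(len(plans)), key=lambda i: weights[i])
    return plans[best], weights


# Hypothetical plans and weighted rules, for illustration only.
plans = [
    {"name": "A", "uses_index": True, "ops": 3},
    {"name": "B", "uses_index": True, "ops": 2},
    {"name": "C", "uses_index": False, "ops": 1},
]
weighted_rules = [
    (lambda p: p["uses_index"], 5),
    (lambda p: p["ops"] <= 2, 3),
    (lambda p: p["ops"] <= 1, 1),
]
best_plan, weights = select_by_weight(plans, weighted_rules)
print(best_plan["name"], weights)
```

Unlike the priority method, a plan that fails a heavily weighted rule can still win here if it satisfies enough lighter rules.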
- the cost of the execution plan includes time cost and space cost.
- the time cost includes the time spent on executing the whole execution plan and the space cost includes the maximum space that may possibly be occupied by a final result and intermediate results during the execution of the whole execution plan.
- the disk I/O time involved in the execution makes up the main part of the time cost, so the calculation of the time cost mainly includes the calculation of the disk I/O time.
- the space cost is calculated based on the memory and disk space to be occupied.
- the cost of the optimum execution plan is relatively low; therefore, the performance of the docbase management system is improved.
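The cost-based criterion can be illustrated with a small cost model. The sketch assumes time cost is approximated by the number of disk I/O operations multiplied by a fixed per-operation latency, space cost is the estimated peak space, and the two are combined linearly. The `PlanEstimate` fields, the 5 ms latency, and the weights are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PlanEstimate:
    name: str
    disk_reads: int  # estimated number of disk I/O operations
    peak_bytes: int  # estimated maximum space held during execution


def plan_cost(est, io_ms=5.0, time_weight=1.0, space_weight=0.0):
    # Time cost is dominated by disk I/O, so it is estimated from I/O alone.
    time_cost = est.disk_reads * io_ms
    # Space cost is expressed in KiB so the two terms have comparable scale.
    space_cost = est.peak_bytes / 1024.0
    return time_weight * time_cost + space_weight * space_cost


def cheapest(estimates, **weights):
    return min(estimates, key=lambda e: plan_cost(e, **weights))


estimates = [
    PlanEstimate("index_plan", disk_reads=4, peak_bytes=8_000_000),
    PlanEstimate("scan_plan", disk_reads=40, peak_bytes=64_000),
]
fastest = cheapest(estimates)  # time-only criterion
smallest = cheapest(estimates, time_weight=0.0, space_weight=1.0)  # space-only
print(fastest.name, smallest.name)
```

The two calls pick different plans from the same set, which mirrors the earlier point that a plan is optimum only relative to the chosen judgment criterion.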
- the method may further include the process of optimizing the execution plans. And after the optimizing process, partially optimized execution plans may be obtained.
- the optimum execution plan may be selected from the optimized execution plans.
- the execution plans are optimized by using the cost or other measurement parameters of the execution plans as the measurement function of an intelligent algorithm, e.g., fitness in the genetic algorithm or energy in the simulated annealing algorithm; the space of execution plans is then searched with those algorithms to obtain the partially optimized execution plans.
- the algorithm used for optimizing the execution plans may include the genetic algorithm, the simulated annealing algorithm, etc., and the specific process is explained in the preceding description of the optimizer.
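To make the idea of using plan cost as the "energy" of simulated annealing concrete, here is a small sketch. It is not the optimizer described in the patent: the plan encoding (one chosen alternative per logical operation), the `COSTS` table, and the cooling schedule are assumptions chosen only to keep the example short.

```python
import math
import random

# Hypothetical costs: COSTS[i][j] is the cost of using physical
# alternative j for logical operation i.
COSTS = [[9, 2, 5], [4, 7], [3, 8, 1]]


def energy(plan):
    """Plan cost, used as the energy that annealing tries to lower."""
    return sum(COSTS[i][choice] for i, choice in enumerate(plan))


def neighbor(plan, rng):
    """Re-pick the alternative for one randomly chosen logical operation."""
    i = rng.randrange(len(plan))
    new = list(plan)
    new[i] = rng.randrange(len(COSTS[i]))
    return tuple(new)


def anneal(start, steps=500, temp=10.0, cooling=0.99, seed=0):
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        cand = neighbor(current, rng)
        delta = energy(cand) - energy(current)
        # Accept improvements always; accept a worse plan with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
            if energy(current) < energy(best):
                best = current
        temp *= cooling
    return best


start = (0, 1, 1)
improved = anneal(start)
print(start, energy(start), "->", improved, energy(improved))
```

Because the search is randomized and bounded, the result is only guaranteed to be no worse than the starting plan, which matches the text's description of the output as partially optimized rather than globally optimal.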
- the cost of the selected optimum execution plan is further lowered, and the performance of the whole docbase management system is further improved.
- the implementation of the docbase management system is divided into a plurality of hierarchies that are independent of each other, which makes the docbase management system well extendable, scalable and maintainable.
- the optimum execution plan is selected from the execution plans so as to improve the execution performance and eventually improve the performance of the whole docbase management system.
- the initial execution plans generated by the planner are partially optimized, so that the cost of the selected optimum execution plan is further lowered and the performance of the whole docbase management system is further improved.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Devices For Executing Special Programs (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/391,495 US8312008B2 (en) | 2006-08-25 | 2009-02-24 | Docbase management system and implementing method thereof |
| US13/645,382 US20130031085A1 (en) | 2005-12-05 | 2012-10-04 | Docbase management system and implenting method thereof |
Applications Claiming Priority (7)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN200610126538A CN101131695B (zh) | 2006-08-25 | 2006-08-25 | 一种文档库系统及其实现方法 |
| CN200610126538.2 | 2006-08-25 | ||
| CN200610126538 | 2006-08-25 | ||
| CNPCT/CN2007/070476 | 2007-08-14 | ||
| PCT/CN2007/070476 WO2008025281A1 (en) | 2006-08-25 | 2007-08-14 | Document base system and realizing method thereof |
| US12/133,309 US20090320141A1 (en) | 2005-12-05 | 2008-06-04 | Document data security management method and system therefor |
| US12/391,495 US8312008B2 (en) | 2006-08-25 | 2009-02-24 | Docbase management system and implementing method thereof |
Related Parent Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2007/070476 Continuation WO2008025281A1 (en) | 2005-12-05 | 2007-08-14 | Document base system and realizing method thereof |
| US12/133,309 Continuation-In-Part US20090320141A1 (en) | 2005-12-05 | 2008-06-04 | Document data security management method and system therefor |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US13/645,382 Continuation-In-Part US20130031085A1 (en) | 2005-12-05 | 2012-10-04 | Docbase management system and implenting method thereof |
Publications (3)
| Publication Number | Publication Date |
|---|---|
| US20090157992A1 (en) | 2009-06-18 |
| US20120117352A9 (en) | 2012-05-10 |
| US8312008B2 (en) | 2012-11-13 |
Family
ID=39128965
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US12/391,495 Expired - Fee Related US8312008B2 (en) | 2005-12-05 | 2009-02-24 | Docbase management system and implementing method thereof |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US8312008B2 (de) |
| EP (1) | EP2058742A4 (de) |
| JP (1) | JP2010501948A (de) |
| CN (1) | CN101131695B (de) |
| WO (1) | WO2008025281A1 (de) |
Families Citing this family (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106161319A (zh) * | 2015-04-13 | 2016-11-23 | 中南大学 | 融合遗传和爬山算法降低vlc-ofdm系统峰均功率比 |
| US10635974B2 (en) | 2015-11-12 | 2020-04-28 | Deepmind Technologies Limited | Neural programming |
| US11769150B2 (en) * | 2017-10-11 | 2023-09-26 | International Business Machines Corporation | Transaction scheduling for block space on a blockchain |
| CN110968594B (zh) * | 2018-09-30 | 2023-04-07 | 阿里巴巴集团控股有限公司 | 数据库查询优化方法、引擎及存储介质 |
| CN111723104B (zh) * | 2019-03-22 | 2025-09-09 | 阿里巴巴集团控股有限公司 | 一种数据处理系统中语法分析的方法、装置及系统 |
Citations (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5544355A (en) * | 1993-06-14 | 1996-08-06 | Hewlett-Packard Company | Method and apparatus for query optimization in a relational database system having foreign functions |
| US5600844A (en) * | 1991-09-20 | 1997-02-04 | Shaw; Venson M. | Single chip integrated circuit system architecture for document installation set computing |
| JPH1021125A (ja) | 1996-06-28 | 1998-01-23 | Nec Corp | 分散データベースシステムの所在管理方式 |
| US6341281B1 (en) * | 1998-04-14 | 2002-01-22 | Sybase, Inc. | Database system with methods for optimizing performance of correlated subqueries by reusing invariant results of operator tree |
| US6370522B1 (en) * | 1999-03-18 | 2002-04-09 | Oracle Corporation | Method and mechanism for extending native optimization in a database system |
| US20030093410A1 (en) * | 2001-08-31 | 2003-05-15 | Tanya Couch | Platform-independent method and system for graphically presenting the evaluation of a query in a database management system |
| US20030177137A1 (en) * | 1998-12-16 | 2003-09-18 | Microsoft Corporation | Graphical query analyzer |
| CN1457459A (zh) | 2001-02-22 | 2003-11-19 | 索尼公司 | 发送设备,接收设备,发送和接收设备,发送、接收方法 |
| CN1573759A (zh) | 2003-06-23 | 2005-02-02 | 微软公司 | 公共查询运行期系统以及应用编程接口 |
| US20050071331A1 (en) * | 2003-09-30 | 2005-03-31 | Dengfeng Gao | Estimating the compilation time of a query optimizer |
| US20050187917A1 (en) * | 2003-09-06 | 2005-08-25 | Oracle International Corporation | Method for index tuning of a SQL statement, and index merging for a multi-statement SQL workload, using a cost-based relational query optimizer |
| US7080062B1 (en) * | 1999-05-18 | 2006-07-18 | International Business Machines Corporation | Optimizing database queries using query execution plans derived from automatic summary table determining cost based queries |
| US7448022B1 (en) * | 2004-02-10 | 2008-11-04 | Prasad Ram | Dynamic software composition in a component-based software system |
| US7945557B2 (en) * | 2003-11-25 | 2011-05-17 | International Business Machines Corporation | Method, system, and program for query optimization with algebraic rules |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| DE19515020A1 (de) * | 1994-07-01 | 1996-01-04 | Hewlett Packard Co | Verfahren und Vorrichtung zum Optimieren von Abfragen mit Gruppieren-nach-Operatoren |
| US6112304A (en) * | 1997-08-27 | 2000-08-29 | Zipsoft, Inc. | Distributed computing architecture |
| US7412644B2 (en) * | 2000-02-04 | 2008-08-12 | Aol Llc, A Delaware Limited Liability Company | System and process for delivering and rendering scalable web pages |
| GB2367977B (en) * | 2000-06-29 | 2004-05-12 | Hutchison Telephone Company Lt | Messaging system |
| US7007231B2 (en) * | 2002-01-07 | 2006-02-28 | Chi Hung Dang | Document management system employing multi-zone parsing process |
| JP4313652B2 (ja) * | 2003-11-12 | 2009-08-12 | 川崎マイクロエレクトロニクス株式会社 | スケジューリング装置 |
2006
- 2006-08-25 CN CN200610126538A patent/CN101131695B/zh not_active Expired - Fee Related
2007
- 2007-08-14 JP JP2009525902A patent/JP2010501948A/ja active Pending
- 2007-08-14 EP EP07800952A patent/EP2058742A4/de not_active Withdrawn
- 2007-08-14 WO PCT/CN2007/070476 patent/WO2008025281A1/zh not_active Ceased
2009
- 2009-02-24 US US12/391,495 patent/US8312008B2/en not_active Expired - Fee Related
Also Published As
| Publication number | Publication date |
|---|---|
| US20120117352A9 (en) | 2012-05-10 |
| WO2008025281A1 (en) | 2008-03-06 |
| EP2058742A4 (de) | 2012-04-11 |
| JP2010501948A (ja) | 2010-01-21 |
| EP2058742A1 (de) | 2009-05-13 |
| CN101131695B (zh) | 2010-05-26 |
| US20090157992A1 (en) | 2009-06-18 |
| CN101131695A (zh) | 2008-02-27 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: SURSEN CORP., CHINA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GUO, XU;WANG, DONGLIN;REEL/FRAME:022302/0259 Effective date: 20090223 |
| | ZAAA | Notice of allowance and fees due | Free format text: ORIGINAL CODE: NOA |
| | ZAAB | Notice of allowance mailed | Free format text: ORIGINAL CODE: MN/=. |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | CC | Certificate of correction | |
| | FPAY | Fee payment | Year of fee payment: 4 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2552); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY Year of fee payment: 8 |
| | AS | Assignment | Owner name: BEIJING KEY FLOW APPLICATION SCIENCE AND TECHNOLOGY CO., LTD., CHINA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:SURSEN CORP.;REEL/FRAME:052837/0609 Effective date: 20200604 |
| | AS | Assignment | Owner name: TIANJIN SURSEN INVESTMENT CO., LTD., CHINA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BEIJING KEY FLOW APPLICATION SCIENCE AND TECHNOLOGY CO., LTD.;REEL/FRAME:058451/0280 Effective date: 20211221 |
| | LAPS | Lapse for failure to pay maintenance fees | Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | STCH | Information on status: patent discontinuation | Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
| | FP | Lapsed due to failure to pay maintenance fee | Effective date: 20241113 |