CN106372177A - Query expansion method supporting correlated query and fuzzy grouping of mixed data type - Google Patents
- Publication number
- CN106372177A (application CN201610783143.3A)
- Authority
- CN
- China
- Prior art keywords
- fuzzy
- data
- query
- type
- sentence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2468—Fuzzy queries
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Automation & Control Theory (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a query expansion method supporting correlated query and fuzzy grouping of mixed data types. The method comprises the following steps: S1, architecture construction; S2, data storage; S3, query expansion; S4, query parsing; S5, mixed join; S6, fuzzy grouping; S7, result packaging and return. In a distributed database environment, mixed-type data cannot be joined according to given rules, and aggregation over data of specified types is functionally limited. To address these problems, the invention provides users with an extended SQL (Structured Query Language) syntax for aggregation and joins, so that query expansions such as fuzzy grouping and fuzzy joins can be performed through specified statements. The functionality and adaptability of distributed databases are thereby extended.
Description
Technical field
The present invention relates to a mixed-type data query method that supports fuzzy joins and fuzzy grouping.
Background
With the rapid development of computer and information technology and its growing adoption across industries, data on the scale of hundreds of TB, or even tens to hundreds of PB, is produced and collected every day. The sheer volume and heterogeneity of this data pose a major challenge to traditional database technology, centralized databases in particular. To provide distributed support for widely used open-source centralized databases such as MySQL and PostgreSQL, a series of database middleware products has emerged. Such middleware lets users build a database cluster transparently and migrate existing single-machine centralized databases and their applications smoothly to the cloud, and it has become an important distributed data management solution. Distributed database middleware can also integrate different types of underlying databases with applications. If relational databases and NoSQL databases are integrated uniformly at the bottom layer, mixed data from different sources and with different structures can be stored adaptively, queried and managed, enabling effective management of heterogeneous big data. However, the query capability of current SQL statements is limited: they cannot support the most commonly used query operations, such as joins and grouping, over mixed-type data. It is therefore necessary to use middleware to store mixed-type data uniformly and to extend the query capability so that correlated queries over mixed data are supported.
Summary of the invention
The purpose of the present invention is to extend the functionality of SQL statements on the basis of distributed database middleware, so as to support mixed-data queries including fuzzy grouping and fuzzy joins.
To achieve this purpose, the technical solution of the present invention provides a query expansion method supporting correlated query and fuzzy grouping of mixed data types, characterized by comprising the following steps:
Step 1, build the mixed-data storage architecture;
Step 2, mixed-type data storage: according to data type, data is stored column by column in the database of the corresponding node. This step includes:
Step 2.1, table creation: according to the configuration file and the field types specified in the SQL statement, the data table containing the specified fields is stored in the corresponding databases, specifically:
Step 2.1.1, obtain the configuration information and determine that the table to be created is included in the vertical fragmentation configuration;
Step 2.1.2, parse and decompose the statement: according to the field type and field length of each field in the SQL create-table statement, divide the fields into structured and unstructured attributes, write them to a file that serves as the index for subsequent insert, delete, update and query operations, and on this basis build the sub-statements that act on each sub-database;
Step 2.1.3, route distribution: bind each sub-statement obtained in step 2.1.2 to the database of the corresponding type in the configuration and perform route distribution, completing table creation;
Step 2.2, data insertion: according to the correspondence in the index file between the data to be inserted and the databases, store the data in the tables of the corresponding databases, specifically:
Step 2.2.1, obtain the configuration information and determine that the name of the target table is included in the vertical fragmentation configuration;
Step 2.2.2, parse and decompose the statement: query the index file generated in step 2.1.2 and, according to the correspondence between attributes and tables in the file, build the sub-statements that act on each sub-database;
Step 2.2.3, route distribution: bind each sub-statement obtained in step 2.2.2 to the database of the corresponding type in the configuration and perform route distribution, completing data insertion.
Step 3, mixed query expansion. Following the principles of simplicity and functionality, the SQL statement is designed as follows:
select * | expression [as output_name] [...] from from_item
[group by column] [contain r divided by d] | [start with num1 per num2]
[where condition]
Here expression denotes a field name or an expression; from_item denotes the table to be queried, that is, the unit corresponding to the table in each database, denoted table1. In this statement, the group by clause and the where clause specify the grouping operation and the join operation respectively:
1) group by specifies the field column to be grouped, and contain...divided by... or start with...per... specifies fuzzy grouping on a string column or an integer column respectively;
2) where specifies the query condition condition, which here includes a join condition and a sub-statement; the join condition includes the join field c1 of table1 and the join mode, and the sub-statement includes the queried table table2 and the field c2 of that table that takes part in the join;
Step 4, query parsing: before route distribution, the system parses the specified SQL statement and obtains the relevant parameters, specifically:
Step 4.1, return-type parsing: obtain the result presentation form that follows the select keyword;
Step 4.2, fuzzy-join parsing: determine whether the SQL statement contains the fuzzy in keyword; if so, execute step 5, otherwise execute step 4.3;
Step 4.3, fuzzy-grouping parsing: determine whether the SQL statement contains the contain or start with keyword; if so, execute step 6; otherwise the statement is judged not to be an extended statement, and default route distribution is performed to obtain the final result.
Step 5, mixed join, supporting join operations with fuzzy matching over multi-source heterogeneous data. This step includes:
Step 5.1, query splitting: the system splits the original statement at the keyword fuzzy in, extracts the main query and the fuzzy in conditional statement as the query statements for table1 and table2 respectively, and saves the join mode in memory;
Step 5.2, route binding: query the configuration information, bind the query statements of table1 and table2 to the corresponding routing nodes, and perform route distribution;
Step 5.3, query execution: execute the query of each sub-statement on its node, and return the result sets in turn to the route distribution point;
Step 5.4, fuzzy in join: retrieve the join mode fuzzy in from memory, filter the result set obtained from table1 on the attribute of column c1, keep only the records whose c1 value is a substring of the c2 column of table2, and return them;
Step 6, fuzzy grouping: the column to be grouped is grouped either by the specified strings it contains (character type) or by a specified interval (numeric type). This step includes:
Step 6.1, determine the grouping type: if the statement contains the keyword start with, it is identified as numeric grouping by a fixed interval, and step 6.2 is executed; if it contains the keyword contain, it is identified as character grouping by contained strings, and step 6.3 is executed;
Step 6.2, numeric grouping: parse the relevant parameters, set the grouping rule according to the parameters, and obtain the grouped result set. This step includes:
Step 6.2.1, parse the relevant parameters: extract the start value s = num1 and the interval value δ = num2 specified in the statement;
Step 6.2.2, initial result set query: strip the group by and start with...per... clauses, send the query request to the database, and obtain the initial result set t;
Step 6.2.3, traverse each record v in the initial result set t, determine the group number k to which it belongs according to the formula [formula not reproduced in the source text], and encapsulate it in "k:v" form;
Step 6.3, character grouping: parse the relevant parameters, set the grouping rule according to the parameters, and obtain the grouped result set. This step includes:
Step 6.3.1, parse the relevant parameters: extract the string r and the string delimiter d specified in the statement, split r into multiple substrings by d, assign each substring to one group, and allocate a group number k;
Step 6.3.2, initial result set query: strip the group by clause and the accompanying fuzzy-grouping clause, send the query request to the database, and obtain the initial result set t;
Step 6.3.3, traverse each record v in the initial result set t, select the records that contain each substring from step 6.3.1, and encapsulate them in "k:v" form;
Step 6.4, return of grouped results: the "k:v" result sets returned by the grouping are encapsulated as a list into one result set object and returned to the route distribution point;
Step 7, encapsulate and return the result: according to the return type obtained in step 4.1, set the header information, then encapsulate the header information and the corresponding result content in turn as a byte stream and return it.
Preferably, step 1 includes:
Step 1.1, set up the database environment: install relational and non-relational databases on a single machine or on multiple machines;
Step 1.2, set up the Mycat middleware platform: add the different types of databases as bottom-layer nodes of the middleware through the configuration file and specify the database type of each node, with the following steps:
Step 1.2.1, install Mycat: complete the installation by importing the Mycat source code into Eclipse;
Step 1.2.2, set up the environment: add the jar packages required to access the specified databases to the system runtime environment through the build path in Eclipse;
Step 1.2.3, configure node information: add table and datanode (node) information to the configuration file "schema.xml", specify the correspondence between tables and datanodes, add the vertical fragmentation rules, and add the addresses, user names, passwords and other information of the databases to be added to this configuration file.
The invention provides a strategy that extends SQL statements at the distributed database middleware layer, performs route distribution according to the SQL statement, and obtains the result set that satisfies the conditions either at the underlying database level or at the middleware level.
The invention provides a strategy that performs route distribution at the distributed database middleware level according to the specified SQL statement and obtains, according to the specified conditions, the matching result set at the underlying database level or at the middleware level. It is characterized by supporting fuzzy grouping of data in specified formats and fuzzy joins of mixed data, thereby extending SQL statements for mixed-data queries.
Brief description of the drawings
Fig. 1 is a schematic diagram of the process of step 5 of the present invention.
Detailed description of the embodiments
To make the present invention easier to understand, preferred embodiments are described in detail below.
The invention provides a method that, by extending SQL statements, implements fuzzy joins of mixed data types and fuzzy grouping of specified data types. In a distributed database environment, mixed-type data cannot be joined according to given rules, and aggregation over data of specified types is functionally limited. To address these problems, the invention provides users with an extended SQL syntax for aggregation and joins, so that query expansions such as fuzzy grouping and fuzzy joins can be carried out through specified statements, extending the functionality and adaptability of distributed databases. Taking MySQL and MongoDB as the bottom-layer node databases as an example, the specific steps are as follows:
Step 1, architecture construction: using Mycat as the database middleware, build the distributed database environment and set the environment variables and bottom-layer node information. This step includes:
Step 1.1, set up the database environment: install relational and non-relational databases on a single machine or on multiple machines. Here the relational database is MySQL and the non-relational database is MongoDB;
Step 1.2, set up the Mycat middleware platform: configure each database host from step 1.1 as a sub-node, and specify the correspondence between each node and its database, with the following steps:
Step 1.2.1, install Mycat: complete the installation by importing the Mycat open-source code into Eclipse;
Step 1.2.2, set up the environment: add the jar packages required to access the specified databases to the system runtime environment through the build path in Eclipse;
Step 1.2.3, configure node information: add table (table) and datanode (node) information to the configuration file "schema.xml", specify the correspondence between table and datanode, add the vertical fragmentation rules, and add the addresses, user names, passwords and other information of the databases to be added to this configuration file.
Step 2, mixed-type data storage. According to data type, data is stored column by column in the database of the corresponding node. This step includes:
Step 2.1, table creation: according to the configuration file and the field types specified in the SQL statement, the data table containing the specified fields is stored in the corresponding databases, specifically:
Step 2.1.1, obtain the configuration information and determine that the table to be created is included in the vertical fragmentation configuration;
Step 2.1.2, parse and decompose the statement: according to the field type and field length of each field in the SQL create-table statement, divide the fields into structured and unstructured fields and build a sub-statement for each;
Step 2.1.3, create the index file: according to the correspondence between tables and databases in the configuration file and the type of each sub-database, write all fields of the create-table statement to a file in the form "table name: {field name: database name}", as the index for subsequent insert, delete, update and query operations;
Step 2.1.4, route distribution: bind each sub-statement obtained in step 2.1.2 to the database of the corresponding type in the configuration and perform route distribution, completing table creation.
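The field classification and index construction of step 2.1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_create_table`, the set `UNSTRUCTURED_TYPES`, and the database names `mysql_db` and `mongo_db` are hypothetical, and the column list is passed in already parsed rather than extracted from real SQL.

```python
import json

# Assumption: which declared types count as unstructured is chosen here for illustration.
UNSTRUCTURED_TYPES = {"text", "blob", "file"}


def split_create_table(table, columns, structured_db="mysql_db", unstructured_db="mongo_db"):
    """Split the columns of a create-table statement by target database.

    columns: list of (field_name, field_type) pairs.
    Returns (sub_statements, index), where sub_statements maps each database
    to its column list and index has the form {table: {field: database}}.
    """
    sub_statements = {structured_db: [], unstructured_db: []}
    index = {table: {}}
    for name, ftype in columns:
        # Drop a length suffix such as "(64)" before comparing the type name.
        base_type = ftype.split("(")[0].strip().lower()
        target = unstructured_db if base_type in UNSTRUCTURED_TYPES else structured_db
        sub_statements[target].append((name, ftype))
        index[table][name] = target
    return sub_statements, index


subs, index = split_create_table(
    "article", [("id", "int"), ("title", "varchar(64)"), ("body", "text")]
)
print(json.dumps(index))
```

The index is kept as a plain nested mapping so that it can be serialized to a file and consulted again by later insert and query statements.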
Step 2.2, data insertion: according to the correspondence in the index file between the data to be inserted and the databases, store the data in the tables of the corresponding databases, specifically:
Step 2.2.1, obtain the configuration information and determine that the name of the target table is included in the vertical fragmentation configuration;
Step 2.2.2, parse and decompose the statement: query the index file generated in step 2.1.3 and, according to the correspondence between attributes and tables in the file, build the sub-statements that act on each sub-database;
Step 2.2.3, route distribution: bind each sub-statement obtained in step 2.2.2 to the database of the corresponding type in the configuration and perform route distribution, completing data insertion.
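As a rough sketch of step 2.2.2, an inserted row can be divided among the databases by looking each field up in the index. The function name `split_insert` and the sample index are hypothetical, and the row is assumed to be available as a field-to-value mapping rather than raw SQL text.

```python
def split_insert(table, row, index):
    """Divide one inserted row among databases using the field index.

    row: {field: value}; index: {table: {field: database}}.
    Returns {database: {field: value}}, one partial row per database.
    """
    per_db = {}
    for field, value in row.items():
        db = index[table][field]  # raises KeyError for a field missing from the index
        per_db.setdefault(db, {})[field] = value
    return per_db


sample_index = {"article": {"id": "mysql_db", "title": "mysql_db", "body": "mongo_db"}}
parts = split_insert("article", {"id": 1, "title": "Hello", "body": "long text"}, sample_index)
print(parts)
```

Each partial row would then be turned into a sub-statement for its database and handed to route distribution.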
Step 3, query expansion: following the designed query expansion syntax, write the SQL-like statements for fuzzy grouping of specified-type data and for fuzzy joins of mixed-type data, specifically:
The SQL statement for numeric fuzzy grouping is as follows:
select count(column) from table group by column start with num1 per num2;
This queries the records of column column in the table whose values start from num1, places every num2 of record values into one group, and returns the number of records in each group;
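The numeric grouping rule can be illustrated with a short sketch. The grouping formula is not reproduced in this text, so the rule used here, k = floor((v - s) / δ) with values below s discarded, is an assumption chosen to match the description "starting from num1, every num2 values form one group"; the function name `numeric_group` is likewise hypothetical.

```python
from collections import Counter


def numeric_group(values, start, step):
    """Pair each value v >= start with an assumed group number (v - start) // step."""
    if step <= 0:
        raise ValueError("step must be positive")
    return [((v - start) // step, v) for v in values if v >= start]


pairs = numeric_group([3, 10, 14, 25, 31], start=10, step=10)
counts = Counter(k for k, _ in pairs)  # records per group, as count(column) would report
print(pairs, dict(counts))
```

With start 10 and step 10, the value 3 is dropped, 10 and 14 fall into group 0, 25 into group 1 and 31 into group 2.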
The SQL statement for character fuzzy grouping is as follows:
select count(column) from table group by column contain r divided by d;
This queries the records of column column that contain the substrings of the string group r, where r uses d as its delimiter, and returns the number of records in each group;
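A minimal sketch of character grouping follows, assuming that group numbers start at 0 and follow the order of the substrings in r; neither detail is stated in the text, and the function name `string_group` is hypothetical.

```python
def string_group(values, r, d):
    """Split r on delimiter d; group k collects every value containing the k-th substring."""
    substrings = [s for s in r.split(d) if s]  # ignore empty pieces
    return [(k, v) for k, sub in enumerate(substrings) for v in values if sub in v]


pairs = string_group(["red apple", "green pear", "red pear"], r="red,pear", d=",")
print(pairs)
```

Note that under this rule one record can belong to several groups: "red pear" contains both "red" and "pear", so it appears in group 0 and in group 1.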
The SQL statement for the fuzzy join is as follows:
select c1 from table1 where column fuzzy in (select c2 from table2);
This queries the c1 records of table1 and the c2 records of table2 separately, and then uses fuzzy in to obtain the records whose c1 value in table1 is a substring of the c2 column of table2.
Step 4, query parsing: before route distribution, the system parses the specified SQL statement and obtains the relevant parameters. This step includes:
Step 4.1, return-type parsing: obtain the result presentation form that follows the select keyword;
Step 4.2, fuzzy-join parsing: determine whether the SQL statement contains the fuzzy in keyword; if so, execute step 5, otherwise execute step 4.3;
Step 4.3, fuzzy-grouping parsing: determine whether the SQL statement contains the contain or start with keyword; if so, execute step 6; otherwise the statement is judged not to be an extended statement, and default route distribution is performed to obtain the final result.
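The keyword checks of steps 4.2 and 4.3 can be sketched with regular expressions. This is an illustration under stated assumptions: the function name `classify` and the returned labels are hypothetical, num1 and num2 are taken to be integer literals, and r and d are taken to be single-quoted literals, which the text does not specify.

```python
import re

FUZZY_IN = re.compile(r"\bfuzzy\s+in\b", re.IGNORECASE)
NUMERIC = re.compile(r"\bstart\s+with\s+(-?\d+)\s+per\s+(\d+)", re.IGNORECASE)
STRING = re.compile(r"\bcontain\s+'([^']*)'\s+divided\s+by\s+'([^']*)'", re.IGNORECASE)


def classify(sql):
    """Return (kind, params), checking fuzzy in first and the grouping keywords second."""
    if FUZZY_IN.search(sql):
        return ("fuzzy_join", None)
    m = NUMERIC.search(sql)
    if m:
        return ("numeric_group", (int(m.group(1)), int(m.group(2))))
    m = STRING.search(sql)
    if m:
        return ("string_group", (m.group(1), m.group(2)))
    return ("default", None)  # not an extended statement: default route distribution


print(classify("select count(a) from t group by a start with 0 per 10;"))
```

Pattern matching on the raw statement is only a rough stand-in for real parsing; for example, it would misfire on a keyword that happens to appear inside a string literal.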
Step 5, mixed join, supporting join operations with fuzzy matching over multi-source heterogeneous data. This step includes:
Step 5.1, query splitting: the system splits the original statement at the keyword fuzzy in, extracts the main query and the fuzzy in conditional statement as the query statements for table1 and table2 respectively, and saves the join mode in memory;
Step 5.2, route binding: query the configuration information, bind the query statements of table1 and table2 to the corresponding routing nodes, and perform route distribution;
Step 5.3, query execution: execute the query of each sub-statement on its node, and return the result sets in turn to the route distribution point;
Step 5.4, fuzzy in join: retrieve the join mode fuzzy in from memory, filter the result set obtained from table1 on the attribute of column c1, keep only the records whose c1 value is a substring of the c2 column of table2, and return them.
The detailed process of step 5 is shown in Fig. 1.
In Fig. 1, the index file is created in step 2.1.3; unstructured data of types such as string and file is stored in MongoDB, and data of general types is stored in MySQL. For a fuzzy join, the statements are distributed to the corresponding sub-databases according to the correspondence between fields and databases in the index file, and the execution results of the sub-databases are obtained. The fuzzy in join condition then filters out the records whose c1 is not a substring of c2, giving the final result.
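The final filtering stage of the fuzzy in join can be sketched as below, assuming the two result sets have already been fetched from their databases as lists of dictionaries. The function name `fuzzy_in_join` and the sample rows are hypothetical.

```python
def fuzzy_in_join(table1_rows, c1, table2_rows, c2):
    """Keep the rows of table1 whose c1 value is a substring of at least one c2 value."""
    c2_values = [str(row[c2]) for row in table2_rows]
    return [row for row in table1_rows if any(str(row[c1]) in v for v in c2_values)]


t1 = [{"c1": "cat"}, {"c1": "dog"}]
t2 = [{"c2": "concatenate"}, {"c2": "bird"}]
print(fuzzy_in_join(t1, "c1", t2, "c2"))
```

This nested scan compares every c1 value with every c2 value, which is simple but grows with the product of the two result set sizes; a production middleware would probably need a cheaper matching strategy for large tables.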
Step 6, fuzzy grouping: the column to be grouped is grouped either by the specified strings it contains (character type) or by a specified interval (numeric type). This step includes:
Step 6.1, determine the grouping type: if the statement contains the keyword start with, it is identified as numeric grouping by a fixed interval, and step 6.2 is executed; if it contains the keyword contain, it is identified as character grouping by contained strings, and step 6.3 is executed;
Step 6.2, numeric grouping: parse the relevant parameters, set the grouping rule according to the parameters, and obtain the grouped result set. This step includes:
Step 6.2.1, parse the relevant parameters: extract the start value s = num1 and the interval value δ = num2 specified in the statement;
Step 6.2.2, initial result set query: strip the group by and start with...per... clauses, send the query request to the database, and obtain the initial result set t;
Step 6.2.3, traverse each record v in the initial result set t, determine the group number k to which it belongs according to the formula [formula not reproduced in the source text], and encapsulate it in "k:v" form;
Step 6.3, character grouping: parse the relevant parameters, set the grouping rule according to the parameters, and obtain the grouped result set. This step includes:
Step 6.3.1, parse the relevant parameters: extract the string r and the string delimiter d specified in the statement, split r into multiple substrings by d, assign each substring to one group, and allocate a group number k;
Step 6.3.2, initial result set query: strip the group by clause and the accompanying fuzzy-grouping clause, send the query request to the database, and obtain the initial result set t;
Step 6.3.3, traverse each record v in the initial result set t, select the records that contain each substring from step 6.3.1, and encapsulate them in "k:v" form;
Step 6.4, return of grouped results: the "k:v" result sets returned by the grouping are encapsulated as a list into one resultset object and returned to the route distribution point.
Step 7, encapsulate and return the result: according to the return type obtained in step 4.1, set the header information, then encapsulate the header information and the corresponding result content in turn as a byte stream and return it.
As can be seen, this technique places low demands on the user's skill level, offers users considerable flexibility, and can make full use of the distinctive capabilities of the underlying databases.
Claims (2)
1. A query expansion method supporting correlated query and fuzzy grouping of mixed data types, characterized by comprising the following steps:
Step 1, build the mixed-data storage architecture;
Step 2, mixed-type data storage: according to data type, data is stored column by column in the database of the corresponding node. This step includes:
Step 2.1, table creation: according to the configuration file and the field types specified in the SQL statement, the data table containing the specified fields is stored in the corresponding databases, specifically:
Step 2.1.1, obtain the configuration information and determine that the table to be created is included in the vertical fragmentation configuration;
Step 2.1.2, parse and decompose the statement: according to the field type and field length of each field in the SQL create-table statement, divide the fields into structured and unstructured attributes, write them to a file that serves as the index for subsequent insert, delete, update and query operations, and on this basis build the sub-statements that act on each sub-database;
Step 2.1.3, route distribution: bind each sub-statement obtained in step 2.1.2 to the database of the corresponding type in the configuration and perform route distribution, completing table creation;
Step 2.2, data insertion: according to the correspondence in the index file between the data to be inserted and the databases, store the data in the tables of the corresponding databases, specifically:
Step 2.2.1, obtain the configuration information and determine that the name of the target table is included in the vertical fragmentation configuration;
Step 2.2.2, parse and decompose the statement: query the index file generated in step 2.1.2 and, according to the correspondence between attributes and tables in the file, build the sub-statements that act on each sub-database;
Step 2.2.3, route distribution: bind each sub-statement obtained in step 2.2.2 to the database of the corresponding type in the configuration and perform route distribution, completing data insertion.
Step 3, mixed query expansion. Following the principles of simplicity and functionality, the SQL statement is designed as follows:
select * | expression [as output_name] [...] from from_item
[group by column] [contain r divided by d] | [start with num1 per num2]
[where condition]
Here expression denotes a field name or an expression; from_item denotes the table to be queried, that is, the unit corresponding to the table in each database, denoted table1. In this statement, the group by clause and the where clause specify the grouping operation and the join operation respectively:
1) group by specifies the field column to be grouped, and contain...divided by... or start with...per... specifies fuzzy grouping on a string column or an integer column respectively;
2) where specifies the query condition condition, which here includes a join condition and a sub-statement; the join condition includes the join field c1 of table1 and the join mode, and the sub-statement includes the queried table table2 and the field c2 of that table that takes part in the join;
Step 4, query parsing: before route distribution, the system parses the specified SQL statement and obtains the relevant parameters, specifically:
Step 4.1, return-type parsing: obtain the result presentation form that follows the select keyword;
Step 4.2, fuzzy-join parsing: determine whether the SQL statement contains the fuzzy in keyword; if so, execute step 5, otherwise execute step 4.3;
Step 4.3, fuzzy-grouping parsing: determine whether the SQL statement contains the contain or start with keyword; if so, execute step 6; otherwise the statement is judged not to be an extended statement, and default route distribution is performed to obtain the final result.
Step 5, mixed join, supporting join operations with fuzzy matching over multi-source heterogeneous data. This step includes:
Step 5.1, query splitting: the system splits the original statement at the keyword fuzzy in, extracts the main query and the fuzzy in conditional statement as the query statements for table1 and table2 respectively, and saves the join mode in memory;
Step 5.2, route binding: query the configuration information, bind the query statements of table1 and table2 to the corresponding routing nodes, and perform route distribution;
Step 5.3, query execution: execute the query of each sub-statement on its node, and return the result sets in turn to the route distribution point;
Step 5.4, fuzzy in join: retrieve the join mode fuzzy in from memory, filter the result set obtained from table1 on the attribute of column c1, keep only the records whose c1 value is a substring of the c2 column of table2, and return them;
Step 6, fuzzy grouping: group the column to be grouped either by contained string, for character types, or by a specified interval, for numeric types. This step includes:
Step 6.1, determine the grouping type: if the statement contains the keyword start with, it is identified as numeric grouping by a fixed interval, and step 6.2 is executed; if it contains the keyword contain, it is identified as character grouping by contained string, and step 6.3 is executed;
Step 6.2, numeric grouping: parse the relevant parameters, set the grouping rule according to the parameters, and obtain the grouped result set. This step includes:
Step 6.2.1, parse the relevant parameters: extract the initial value s = num1 and the interval value δ = num2 specified by the statement;
Step 6.2.2, initial result set query: filter out the group by and start...with... clauses, send a query request to the database, and obtain the initial result set t;
Step 6.2.3, traverse each record v in the initial result set t, determine the group number k to which it belongs according to the formula [formula not reproduced in this text], and encapsulate it in "k:v" form;
Step 6.3, character grouping: parse the relevant parameters, set the grouping rule according to the parameters, and obtain the grouped result set. This step includes:
Step 6.3.1, parse the relevant parameters: extract the string r and the string delimiter d specified by the statement, and split r by d into multiple substrings; each substring corresponds to one group and is assigned a group number k;
Step 6.3.2, initial result set query: filter out the group by and start...with... clauses, send a query request to the database, and obtain the initial result set t;
Step 6.3.3, traverse each record v in the initial result set t, select the records that contain each substring from step 6.3.1, and encapsulate them in "k:v" form;
Step 6.4, grouping result return: encapsulate the "k:v" result sets returned by the grouping execution as a list in a single result set object, and return it to the route distribution point;
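The two grouping modes of step 6 can be sketched as below. The grouping formula of step 6.2.3 is not reproduced in this text, so the sketch assumes k = floor((v - s) / δ), which is one natural reading of grouping by a fixed interval from an initial value; it may differ from the patent's actual formula. Numbering groups from 0 and the function names are also assumptions.

```python
import math


def numeric_groups(values, s, delta):
    """Step 6.2 (sketch): assign each numeric record v to a group number k.

    ASSUMPTION: k = floor((v - s) / delta). The source text omits the
    actual formula, so this is only an illustrative choice.
    """
    result = []
    for v in values:
        k = math.floor((v - s) / delta)
        result.append(f"{k}:{v}")  # encapsulate in "k:v" form
    return result


def character_groups(values, r, d):
    """Step 6.3 (sketch): split r by delimiter d; each substring is one group.

    A record is placed in group k when it contains the k-th substring.
    Group numbers start at 0 here, which is an assumption.
    """
    substrings = [part for part in r.split(d) if part]
    result = []
    for k, sub in enumerate(substrings):
        for v in values:
            if sub in v:
                result.append(f"{k}:{v}")
    return result
```

With s = 0 and δ = 10, the values 5, 12, and 27 fall into groups 0, 1, and 2 under the assumed formula. In the character case a record can appear in more than one group if it contains several of the substrings.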
Step 7, encapsulate the result and return: according to the return result type obtained in step 4.1, set the table header information, and encapsulate the header and the corresponding result content, in that order, as a byte stream for return.
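Step 7 can be illustrated with a minimal sketch that writes the header first and the result rows after it into a single byte stream. The length-prefixed layout used here is a hypothetical encoding chosen for simplicity; it is not the MySQL wire protocol or the packet format that mycat actually emits.

```python
import struct


def encode_field(text: str) -> bytes:
    """Encode one field as a 4-byte big-endian length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def encapsulate(header, rows) -> bytes:
    """Step 7 (sketch): header information first, then the result content."""
    out = bytearray()
    out += struct.pack(">I", len(header))  # number of columns
    for name in header:                     # table header information
        out += encode_field(name)
    for row in rows:                        # result content, row by row
        for cell in row:
            out += encode_field(str(cell))
    return bytes(out)
```

Because the column count is written first and every field carries its own length, a receiver can rebuild the header and the rows without any separator characters.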
2. The query expansion method supporting correlated query and fuzzy grouping of mixed data types as claimed in claim 1, characterized in that said step 1 includes:
Step 1.1, build the database environment: install relational and non-relational databases in a single-machine or multi-machine environment;
Step 1.2, build the mycat middleware platform: add databases of different types to the middleware as underlying nodes through the configuration file, and specify the database type of each node, comprising the following steps:
Step 1.2.1, install mycat: complete the installation of the software by importing the mycat source code into eclipse;
Step 1.2.2, setup: add the jar packages required for accessing the specified databases to the system runtime environment through the build path in eclipse;
Step 1.2.3, configure node information: add table and datanode information to the configuration file "schema.xml", specify the correspondence between tables and datanodes, add vertical fragmentation rules, and add the address, user name, password, and other information of the databases to be added to this configuration file.
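The table-to-datanode correspondence described in step 1.2.3 can be sketched as a lookup built from a configuration document. The XML below is a simplified, hypothetical fragment: its root element and attribute names (`dbType`, `url`) are illustrative assumptions and do not reproduce the real layout of mycat's "schema.xml".

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified configuration; not the actual schema.xml format.
SCHEMA_XML = """
<schema-config>
  <table name="table1" dataNode="dn_mysql"/>
  <table name="table2" dataNode="dn_mongo"/>
  <dataNode name="dn_mysql" dbType="mysql" url="localhost:3306"/>
  <dataNode name="dn_mongo" dbType="mongodb" url="localhost:27017"/>
</schema-config>
"""


def load_routes(xml_text: str) -> dict:
    """Map each table name to the attributes of the node that stores it."""
    root = ET.fromstring(xml_text)
    nodes = {n.get("name"): dict(n.attrib) for n in root.iter("dataNode")}
    return {t.get("name"): nodes[t.get("dataNode")] for t in root.iter("table")}
```

Here `load_routes(SCHEMA_XML)["table2"]["dbType"]` yields `"mongodb"`, which is the kind of information the route binding in step 5.2 needs in order to send each sub-statement to the right database.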
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610783143.3A CN106372177B (en) | 2016-08-30 | 2016-08-30 | Support the correlation inquiry of mixed data type and the enquiry expanding method of fuzzy grouping |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610783143.3A CN106372177B (en) | 2016-08-30 | 2016-08-30 | Support the correlation inquiry of mixed data type and the enquiry expanding method of fuzzy grouping |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106372177A true CN106372177A (en) | 2017-02-01 |
CN106372177B CN106372177B (en) | 2019-09-27 |
Family
ID=57899072
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610783143.3A Expired - Fee Related CN106372177B (en) | 2016-08-30 | 2016-08-30 | Support the correlation inquiry of mixed data type and the enquiry expanding method of fuzzy grouping |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106372177B (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107291964A (en) * | 2017-08-16 | 2017-10-24 | 南京华飞数据技术有限公司 | A kind of method that fuzzy query is realized based on HBase |
CN107515887A (en) * | 2017-06-29 | 2017-12-26 | 中国科学院计算机网络信息中心 | A kind of interactive query method suitable for a variety of big data management systems |
CN108776678A (en) * | 2018-05-29 | 2018-11-09 | 阿里巴巴集团控股有限公司 | Index creation method and device based on mobile terminal NoSQL databases |
CN109783543A (en) * | 2019-01-14 | 2019-05-21 | 广州虎牙信息科技有限公司 | Data query method, apparatus, equipment and storage medium |
CN109885536A (en) * | 2019-02-26 | 2019-06-14 | 深圳众享互联科技有限公司 | One kind is based on the storage of distributed data fragment and fuzzy search method |
WO2019128978A1 (en) * | 2017-12-29 | 2019-07-04 | 阿里巴巴集团控股有限公司 | Database system, and method and device for querying database |
CN110019287A (en) * | 2017-07-20 | 2019-07-16 | 华为技术有限公司 | The method and apparatus for executing structured query language SQL instruction |
CN110162544A (en) * | 2019-05-30 | 2019-08-23 | 口碑(上海)信息技术有限公司 | Heterogeneous data source data capture method and device |
CN110472127A (en) * | 2019-07-17 | 2019-11-19 | 微梦创科网络科技(中国)有限公司 | A kind of data query method and system |
CN110597857A (en) * | 2019-08-30 | 2019-12-20 | 南开大学 | Online aggregation method based on shared sample |
CN111897824A (en) * | 2020-03-25 | 2020-11-06 | 上海云励科技有限公司 | Data operation method, device, equipment and storage medium |
CN112613302A (en) * | 2020-12-31 | 2021-04-06 | 天津南大通用数据技术股份有限公司 | Dynamic credibility judgment method for clauses executing select statement based on database |
CN115145999A (en) * | 2021-03-30 | 2022-10-04 | Sap欧洲公司 | Routing SQL statements to elastic compute nodes using workload classes |
CN117195248A (en) * | 2023-08-04 | 2023-12-08 | 中国科学院软件研究所 | Sectional organization and operation method and device for field encryption of embedded database |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150149155A1 (en) * | 2011-09-24 | 2015-05-28 | Lotfi A. Zadeh | Methods and Systems for Applications for Z-numbers |
CN105740374A (en) * | 2016-01-27 | 2016-07-06 | 国网上海市电力公司 | Distributed memory based three-dimensional platform data fuzzy query method |
CN105740344A (en) * | 2016-01-25 | 2016-07-06 | 中国科学院计算技术研究所 | Sql statement combination method and system independent of database |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150149155A1 (en) * | 2011-09-24 | 2015-05-28 | Lotfi A. Zadeh | Methods and Systems for Applications for Z-numbers |
CN105740344A (en) * | 2016-01-25 | 2016-07-06 | 中国科学院计算技术研究所 | Sql statement combination method and system independent of database |
CN105740374A (en) * | 2016-01-27 | 2016-07-06 | 国网上海市电力公司 | Distributed memory based three-dimensional platform data fuzzy query method |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107515887A (en) * | 2017-06-29 | 2017-12-26 | 中国科学院计算机网络信息中心 | A kind of interactive query method suitable for a variety of big data management systems |
CN107515887B (en) * | 2017-06-29 | 2021-01-08 | 中国科学院计算机网络信息中心 | Interactive query method suitable for various big data management systems |
CN110019287A (en) * | 2017-07-20 | 2019-07-16 | 华为技术有限公司 | The method and apparatus for executing structured query language SQL instruction |
CN110019287B (en) * | 2017-07-20 | 2021-09-14 | 华为技术有限公司 | Method and device for executing Structured Query Language (SQL) instruction |
CN107291964A (en) * | 2017-08-16 | 2017-10-24 | 南京华飞数据技术有限公司 | A kind of method that fuzzy query is realized based on HBase |
CN107291964B (en) * | 2017-08-16 | 2019-11-15 | 南京华飞数据技术有限公司 | A method of fuzzy query is realized based on HBase |
US11789957B2 (en) | 2017-12-29 | 2023-10-17 | Alibaba Group Holding Limited | System, method, and apparatus for querying a database |
WO2019128978A1 (en) * | 2017-12-29 | 2019-07-04 | 阿里巴巴集团控股有限公司 | Database system, and method and device for querying database |
CN108776678B (en) * | 2018-05-29 | 2020-07-03 | 阿里巴巴集团控股有限公司 | Index creation method and device based on mobile terminal NoSQL database |
CN108776678A (en) * | 2018-05-29 | 2018-11-09 | 阿里巴巴集团控股有限公司 | Index creation method and device based on mobile terminal NoSQL databases |
CN109783543A (en) * | 2019-01-14 | 2019-05-21 | 广州虎牙信息科技有限公司 | Data query method, apparatus, equipment and storage medium |
CN109783543B (en) * | 2019-01-14 | 2021-07-02 | 广州虎牙信息科技有限公司 | Data query method, device, equipment and storage medium |
CN109885536A (en) * | 2019-02-26 | 2019-06-14 | 深圳众享互联科技有限公司 | One kind is based on the storage of distributed data fragment and fuzzy search method |
CN109885536B (en) * | 2019-02-26 | 2023-06-16 | 深圳众享互联科技有限公司 | Distributed data fragment storage and fuzzy search method |
CN110162544A (en) * | 2019-05-30 | 2019-08-23 | 口碑(上海)信息技术有限公司 | Heterogeneous data source data capture method and device |
CN110162544B (en) * | 2019-05-30 | 2022-05-27 | 口碑(上海)信息技术有限公司 | Heterogeneous data source data acquisition method and device |
CN110472127A (en) * | 2019-07-17 | 2019-11-19 | 微梦创科网络科技(中国)有限公司 | A kind of data query method and system |
CN110597857A (en) * | 2019-08-30 | 2019-12-20 | 南开大学 | Online aggregation method based on shared sample |
CN110597857B (en) * | 2019-08-30 | 2023-03-24 | 南开大学 | Online aggregation method based on shared sample |
CN111897824A (en) * | 2020-03-25 | 2020-11-06 | 上海云励科技有限公司 | Data operation method, device, equipment and storage medium |
CN112613302B (en) * | 2020-12-31 | 2023-08-18 | 天津南大通用数据技术股份有限公司 | Dynamic credibility judging method for clauses of select statement based on database |
CN112613302A (en) * | 2020-12-31 | 2021-04-06 | 天津南大通用数据技术股份有限公司 | Dynamic credibility judgment method for clauses executing select statement based on database |
CN115145999A (en) * | 2021-03-30 | 2022-10-04 | Sap欧洲公司 | Routing SQL statements to elastic compute nodes using workload classes |
CN117195248A (en) * | 2023-08-04 | 2023-12-08 | 中国科学院软件研究所 | Sectional organization and operation method and device for field encryption of embedded database |
Also Published As
Publication number | Publication date |
---|---|
CN106372177B (en) | 2019-09-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106372177A (en) | Query expansion method supporting correlated query and fuzzy grouping of mixed data type | |
CN102693310B (en) | A kind of resource description framework querying method based on relational database and system | |
CN106372176B (en) | A method of it supports to carry out nested document unified SQL query | |
CN102722542B (en) | A kind of resource description framework graphic mode matching method | |
CN107291807B (en) | SPARQL query optimization method based on graph traversal | |
CN102270232B (en) | Semantic data query system with optimized storage | |
CN106934062A (en) | A kind of realization method and system of inquiry elasticsearch | |
Meimaris et al. | Extended characteristic sets: graph indexing for SPARQL query optimization | |
US20130006968A1 (en) | Data integration system | |
CN105630881B (en) | A kind of date storage method and querying method of RDF | |
CN106610999A (en) | Query processing method and device | |
CN104408159B (en) | A kind of data correlation, loading, querying method and device | |
CN103116625A (en) | Volume radio direction finde (RDF) data distribution type query processing method based on Hadoop | |
CA2973356A1 (en) | Distributed storage and distributed processing query statement reconstruction in accordance with a policy | |
CN106407302A (en) | Method for supporting function of calling specific functions of middleware database through simple SQL | |
CN105912595A (en) | Data origin collection method of relational databases | |
CN103177094B (en) | Cleaning method of data of internet of things | |
CN103646032A (en) | Database query method based on body and restricted natural language processing | |
CN109684349A (en) | A kind of querying method and system calculating interactive analysis based on SQL and figure | |
US20060161525A1 (en) | Method and system for supporting structured aggregation operations on semi-structured data | |
CN109739882A (en) | A kind of big data enquiring and optimizing method based on Presto and Elasticsearch | |
CN110019554B (en) | Data model, data modeling system and method for data driven applications | |
CN108241709A (en) | A kind of data integrating method, device and system | |
CN103902651B (en) | Cloud code query method and device based on MongoDB | |
CN113704575A (en) | SQL method, device, equipment and storage medium for analyzing XML and Java files |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190927 |